My question was: how can I avoid serious accidents on a production system while patching it? How should I apply recommended patches without risking the usability of my system? What is your patch routine?

Mike Salehi writes:
===================
We do not apply the latest patches as soon as they become available; we do some research first, before the list of patches to apply is created.

Josh writes:
============
Software management at our location is done by replicating the production environment as completely as possible in what we call "the lab" :-) . Things such as clients, servers, software and configuration are mimicked as well as possible. We use it for testing patches, upgrades, new releases or even software that isn't currently in our production environment.

We employ standardized Solaris builds and use flash archiving to rebuild our test systems in a very short period of time. This also helps to minimize differences between what's in production and what's in our lab. We also have cold spares for our biggest production servers, which are always running a known good configuration.

Anyway, our steps are as follows:

1. Apply the [patch | cluster | software | update] to the lab system, mock up a load, and generally work with and monitor the system. The process we use to test the patch is completely dictated by the server, the workload, the clients, the patch, the software, etc. We do have scripts that we run to confirm the server's functionality for our most commonly tested configurations.

2. Upon successful testing, we schedule downtime on the production server and apply the software/patch.

3. Hopefully nothing severe happened. If it did, as I mentioned, we have a cold production spare that we can fail over to in case of an emergency. We then try to figure out what went wrong on the production server. Also, we are mostly a file-serving-type environment utilizing a SAN, which makes failing over to a cold spare trivial.
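As an aside to Josh's point about minimizing differences between production and the lab: the following is a minimal sketch, not part of the original replies, of how two patch listings could be compared. It assumes each listing is the saved text output of `showrev -p`, where every patch line begins with `Patch: NNNNNN-RR`. The function names `parse_patches` and `compare_patches` are hypothetical, chosen only for illustration.

```python
import re

# Assumption: each patch line starts with "Patch: <6-digit id>-<2-digit rev>",
# as in `showrev -p` output. Only this leading field is read; patch IDs that
# appear later on the line (Obsoletes:, Requires:) are ignored.
PATCH_RE = re.compile(r"^Patch:\s*(\d{6})-(\d{2})\b")


def parse_patches(showrev_output):
    """Return {patch_id: highest_revision} parsed from showrev-style text."""
    patches = {}
    for line in showrev_output.splitlines():
        match = PATCH_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a patch line
        patch_id, revision = match.group(1), int(match.group(2))
        # The same patch ID can be listed at several revisions; keep the highest.
        patches[patch_id] = max(revision, patches.get(patch_id, -1))
    return patches


def compare_patches(lab, prod):
    """Compare two {patch_id: revision} mappings (hypothetical helper)."""
    lab_ids, prod_ids = set(lab), set(prod)
    return {
        "lab_only": sorted(lab_ids - prod_ids),
        "prod_only": sorted(prod_ids - lab_ids),
        # Patch IDs present on both hosts but at different revisions,
        # reported as (lab_revision, prod_revision).
        "revision_mismatch": {
            patch_id: (lab[patch_id], prod[patch_id])
            for patch_id in sorted(lab_ids & prod_ids)
            if lab[patch_id] != prod[patch_id]
        },
    }
```

To use it, one would save the `showrev -p` output from each host to a file, read both files, pass their contents through `parse_patches`, and hand the two results to `compare_patches`. An empty result in all three fields would suggest the lab and production patch levels match.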
I'd imagine that using Sun's Live Upgrade facility could mimic our SAN scenario, which basically just guarantees a stable OS image.

So we have a few layers of fallback, which has helped tremendously. We've caught a few issues that would have done some damage had they not been tested. The system we employ also guarantees as little downtime for our clients as possible. This is obviously extremely important. I like to say it's an XP (eXtreme Programming) approach to systems testing and management: by doing lots of testing and working in small teams we have developed a pretty streamlined process.

Troy writes:
============
At a minimum you should run your patches for some sufficient period of time on a test system before rolling them into production. It is also good to have some way to compare the patches on test with those on production. In a perfect world it is nice to have a "production lookalike" test level that has the same patches as production and that you patch after the fact. In other words you might have these test levels:

1. Unit Test
2. System Test
3. Integration
4. UAT
5. Production Lookalike

The first four levels are patched to whatever the new production load requires, and the fifth level is patched after the load to match production.

=========

Thanks for your replies!

Krisztian
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Thu Mar 10 02:42:51 2005