SUMMARY: patch policy

From: <kabinet_at_inf.u-szeged.hu> Date: Thu Mar 10 2005 - 02:42:05 EST

My question was: how do you avoid serious accidents on a production
system while patching it? How should I apply recommended patches
without risking the usability of my system? What is your patch routine?

Mike Salehi writes:
===================
We do not apply the latest patch as soon as it becomes available; we do
some research before the list of patches to apply is created.

Josh writes:
============
Software management at our location is done by replicating the 
production environment as completely as possible in what we call "the 
lab" :-) . Things such as clients, servers, software, configuration are 
mimicked as well as possible. We use it for testing patches, upgrades, 
new releases or even software that isn't currently in our production 
environment. We employ standardized Solaris builds and utilize flash 
archiving to rebuild our test systems in a very short period of time. 
This also helps to minimize differences between what's in production and 
what's in our lab. We also have cold spares for our biggest production 
servers which are always running a known good configuration.
Anyway, our steps are as follows:

1. Apply the [patch | cluster | software | update] to a lab system, mock
up a load, and generally work and monitor the system. The process we use
to test the patch is dictated entirely by the server, the workload,
the clients, the patch, the software, etc. We do have scripts that we
run to confirm the server's functionality for our most commonly tested
configurations.

2. Upon successful testing, we next schedule downtime on the production 
server and apply the software/patch.

3. Hopefully nothing severe happens. If it does, as mentioned, we have a
cold production spare that we can fail over to in an emergency. We
then try to figure out what went wrong on the production server.
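
As an illustration of the functionality-check scripts mentioned in
step 1, here is a minimal sketch in Python. It is not Josh's actual
tooling: the function names (tcp_port_open, command_succeeds,
run_checks) and the idea of a dictionary of named checks are
assumptions made for this example. Which checks matter depends
entirely on the server being tested.

```python
import socket
import subprocess
from typing import Callable, Dict, List


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


def command_succeeds(argv: List[str]) -> bool:
    """Return True if the command runs and exits with status 0."""
    try:
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        # The command could not be started at all (e.g. not installed).
        return False
    return result.returncode == 0


def run_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every named check and return the names of those that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that crashes counts as a failure, not a pass.
            ok = False
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
        if not ok:
            failed.append(name)
    return failed
```

A run against a freshly patched lab machine might pass something like
{"nfs port": lambda: tcp_port_open("labhost", 2049)} to run_checks,
where "labhost" is a placeholder host name, and treat a non-empty
result as a reason not to schedule the production downtime yet.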

Also, we are mostly a file-serving-type environment utilizing a SAN 
which makes failing over to a cold spare trivial. I'd imagine that using 
Sun's Live Upgrade facility could mimic our SAN scenario, which 
basically just guarantees a stable OS image.

So we have a few layers of fallback, which has helped tremendously. 
We've caught a few issues that would have done some damage had they 
not been tested. The system we employ also guarantees as little downtime 
for our clients as possible. This is obviously extremely important. I 
like to say it's an XP (eXtreme Programming) approach to systems testing 
and management: By utilizing lots of testing and working in small teams 
we have developed a pretty streamlined process.

Troy writes:
============
At a minimum, you should run your patches for a sufficient period of
time on a test system before rolling them into production. It is also
good to have some way to compare the patches on test with those on
production. In a perfect world, it is nice to have a "production
lookalike" test level that carries the same patches as production and
that you patch only after the fact, once production has been patched.

In other words you might have these test levels:

1. Unit Test
2. System Test
3. Integration
4. UAT
5. Production Lookalike

The first four levels are patched to whatever the new production load
requires, and the fifth level is patched after the load to match
production.
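
One possible way to compare patches on test to production, as Troy
suggests, is to diff the installed patch lists of the two hosts. The
sketch below is only an illustration under stated assumptions: it
assumes each input line carries a Solaris patch ID of the form
NNNNNN-NN as its first such token (as in lines beginning "Patch:" in
`showrev -p` output), and the function names parse_patches and
compare_patches are invented for this example. The patch IDs in the
usage note are made up.

```python
import re
from typing import Dict, Iterable, List

# Six-digit base ID, a dash, and a two-digit revision, e.g. 123456-07.
PATCH_RE = re.compile(r"\b(\d{6})-(\d{2})\b")


def parse_patches(lines: Iterable[str]) -> Dict[str, int]:
    """Map each patch base ID to the highest revision found.

    Only the first patch ID on each line is used, so IDs listed later
    on the same line (obsoleted or required patches) are ignored.
    """
    patches: Dict[str, int] = {}
    for line in lines:
        match = PATCH_RE.search(line)
        if match is None:
            continue
        base, rev = match.group(1), int(match.group(2))
        patches[base] = max(rev, patches.get(base, -1))
    return patches


def compare_patches(
    test: Dict[str, int], prod: Dict[str, int]
) -> Dict[str, List[str]]:
    """Report patches that differ between a test and a production host."""
    report: Dict[str, List[str]] = {
        "only_on_test": [],
        "only_on_production": [],
        "revision_differs": [],
    }
    for base in sorted(set(test) | set(prod)):
        if base not in prod:
            report["only_on_test"].append(f"{base}-{test[base]:02d}")
        elif base not in test:
            report["only_on_production"].append(f"{base}-{prod[base]:02d}")
        elif test[base] != prod[base]:
            report["revision_differs"].append(
                f"{base}: test -{test[base]:02d}, "
                f"production -{prod[base]:02d}"
            )
    return report
```

For example, saving the patch listing from each host to a file and
feeding the two files through parse_patches and then compare_patches
gives a quick view of how far the test level has drifted from
production. For a "production lookalike" level, all three lists in the
report should be empty.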

=========
Thanks for your replies!

Krisztian
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers