(for the Boeing folk, this is *not* the root cause analysis report;
it's the mailing list summary for sunmanagers ...)

I received a range of answers, mostly falling into one of two
categories:

 - never heard of such a thing happening; /usr/bin/compress can and
   should work OK
 - yup, I had it happen too, and switching to gtar/gzip was the cure

For the record, we always have the databases shut down properly before
beginning the copy-and-compress step (a precaution noted by several
folks; thanks for the thought!).

Also for the record, we already considered whether it could be Solaris
2.6 (we are already on the 105181-26 Recommended Patches cluster),
Veritas VxVM (3.1.1 plus patches), or VxFS (3.4 plus patches). Nope.

Finally, for the record: after fixing a nagging problem with a gigabit
ethernet switch, rebooting the E10K domain, and power-cycling the disk
arrays, the problem has not recurred. (Of course, until the next time,
whenever that is ... :-)

Nevertheless, we intend to continue work on improving the integrity of
our disk-based database backups. Our alternatives are:

 - do nothing; rebooting the domain and power-cycling the disk arrays
   made the problem go away
 - switch compress out for gzip and leave everything else identical
 - use gtar instead of tar and add the z argument for compression,
   thus reversing the tar and compress stages (a compressed archive
   instead of an archive of compressed files)
 - use vxdump with a lot of mt commands to forward-space the tape for
   each mount point (ugh)
 - just buy a lot (!) more disk, upgrade the backup network interfaces
   to gigabit ethernet, and stop using compress

There is an issue with gtar 1.12 (the one we currently have deployed
on our couple of hundred servers) and sparse files on largefile
filesystems: it just plain fails. The ChangeLog for 1.13 indicates
that this is fixed. I'm getting that upgraded to 1.13, after which we
will test. I'll do my best to provide a follow-up once the results of
those tests are available.
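For anyone who wants to try the second or third alternative at home,
here is roughly what they look like side by side, with a checksum
round trip bolted on as a sanity check. Treat it strictly as a sketch:
the paths and the PROD SID are invented for illustration, and we have
not run this exact script.

    #!/bin/sh
    # sanity net: record checksums of the raw datafiles first
    for f in /u01/oradata/PROD/*.dbf; do    # path and SID are made up
        sum "$f" >> /var/tmp/dbf.sums
    done

    # alternative 2: same pipeline as today, gzip instead of compress
    gzip /u01/oradata/PROD/*.dbf

    # ... and the matching check: uncompress to stdout and compare the
    # checksums against the dbf.sums recorded above
    for f in /u01/oradata/PROD/*.dbf.gz; do
        gzip -dc "$f" | sum
    done

    # alternative 3 (instead of the two steps above): one compressed
    # archive via GNU tar; 1.13+ for largefiles, -S for sparse files
    gtar -cSzf /backup/PROD.tar.gz /u01/oradata/PROD
    gtar -tzf /backup/PROD.tar.gz > /dev/null   # read-back check

The gtar -t pass at the end reads the whole archive back through gzip,
which should at least catch the CRC-on-uncompress failures people have
reported before the tape leaves the building.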
ttfn
mm
mark.michael@es.bss.boeing.com

> Anyone have any experience with large (Oracle) datafiles being
> corrupted on uncompress after being compress'ed? How did you solve
> the problem, short of buying more disk for on-line (hot) backup
> storage in uncompressed format?
>
> Urgent. We are trying to transfer compressed data via tape. It worked
> on this server's predecessor, which doesn't have the 2.6 patch
> cluster containing 105181-26. There is a patch 107786-02 that
> replaced /usr/bin/compress, /usr/bin/uncompress and /usr/bin/zcat
> (they're hardlinked anyway).
>
> Both old and new servers are E10K domains.
>
> I will summarize.

> I've had this problem before. I moved all of my scripts to gzip and
> I've had no problems since.
>
> Thanks.
>
> bfg

> I've never heard of anyone having the problem. Compressing should not
> corrupt anything.

> > Sorry to be the bearer of bad news, but ...
> >
> > Oracle Corp. specifically states never to compress a .dbf file.
> > They say they cannot guarantee consistency in any file that is
> > compressed and then uncompressed, and if you do so it is at your
> > own risk. I've done it a few times and managed to get by
> > (luckily!), but several senior-level DBAs have told me I've been
> > extremely lucky and not to push it or I'm gonna "get bit". While
> > they are sometimes considered alarmists, they still tell me
> > everyone who compresses Oracle .dbf files eventually runs into this
> > problem (often when it's too late to do anything about it,
> > unfortunately).
> >
> > Have a Great Day!!!
> >
> > --- Vern Walls
> >
> > Today's Words of Wisdom:
> > =======================
> > Windows 98: n.
> >   A sometimes useless extension to a minor patch release
> >   for 32-bit extensions and a graphical shell for a
> >   16-bit patch to an 8-bit operating system originally
> >   coded for a 4-bit microprocessor, written by a 2-bit
> >   company that can't stand for 1 bit of competition.
>
> I got that from both our on-staff senior-level DBAs as well as
> directly from Oracle Support (we have a Silver contract).
>
> Have a Great Day!!!
>
> --- Vern

> If Oracle is running you will always get corruption. The datafile
> changes continuously. Try an export and compress the export. Export
> is done inside Oracle.
>
> -Mike

> Your best bet is to do an export of the database and then an import.
> compress has issues with 'empty' spaces in data.
>
> On a side note, you may want to use gnu tar first, then compress.

> I haven't seen a summary on this, so I will throw my hat in.
>
> Yes, I have had a gzip problem with large (>4 GB) database files.
> When uncompressing the files, I always get CRC errors and it fails.
> But not when actually using "compress" to compress files. This was
> under Tru64 (old Digital Unix).
>
> I have routinely compressed/uncompressed 8 GB files and larger. Of
> course the speed and compression factor weren't as good as gzip's,
> but at least I could uncompress the file.
>
> For the past year and some, I have been working strictly in a Solaris
> shop, and haven't had to compress large files like this. If you can
> get the gzip source code, I would try to compile it with 64-bit
> support. I think the standard gzip binary that everyone tends to use
> is a 32-bit build, and may not handle the large files well.
>
> --
> ----------------------------------------------------
> Andrew Stueve           | Office 703-758-5221
> Team Lead/Sr. Engineer  | Mobile 703-898-8917
> Worldcom                | Pager  1-888-454-7594
> ----------------------------------------------------

> We haven't seen this directly, but I bet if you started the
> compression while the DB was running (and therefore the file is not
> guaranteed to be quiescent), you could easily have problems when
> uncompressing; something changing in the "front" of the file after
> gzip/compress/whatever has already read past that section.
>
> --
> Karl Vogel <vogelke@dnaco.net>
> ASC/YCOA, Wright-Patterson AFB, OH 45433, USA
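One more note while I'm at it: if we ever go the export route Mike
suggests above, the old trick of exporting through a named pipe into
the compressor keeps the full-size dump from ever landing on disk.
Again, a sketch only; the userid, SID, and paths are invented, and we
have not tried this on the E10K domains:

    #!/bin/sh
    # export straight into compress via a named pipe, so the
    # uncompressed dump never touches disk (sketch, untested here)
    PIPE=/var/tmp/exp_pipe
    mknod $PIPE p                    # create the FIFO
    compress -c < $PIPE > /backup/PROD_full.dmp.Z &
    exp userid=system/manager full=y file=$PIPE log=/var/tmp/exp.log
    wait                             # let compress finish draining
    rm -f $PIPE

As Mike points out, exp reads through Oracle rather than from the raw
.dbf files, so it also sidesteps the hot-datafile problem entirely.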
--
mark michael
enterprise computing unix, info svcs, boeing satellite systems
e-mail mark.o.michael@boeing.com
ph 310 364 6759    fax 310 364 5331
snail-mail po box 92919 m/s sc s50 x340 los angeles ca 90009-2919 usa