SUMMARY: Data errors on disk: is it End Of Life for this drive?

From: Frank P. Bresz (fpb@ittc.wec.com)
Date: Sat Oct 26 1991 - 13:07:46 CDT


o Well, sun-managers comes through again. I received 3 replies before I got
  my own copy of the mail I sent.

o After a reformat my disk is back to life. Judging from the comments, it's
  not clear for how long, but it is back. This incident has spurred new
  interest in spare parts on the part of management.

o I was able to get a full backup even though I was getting 'bread' errors
  during the dump (it took a really long time to do this).

o I was considering some repairing, but 'rwolf@dretor.dciem.dnd.ca (Robert J
  Wolf)' advised against it, and in fact there doesn't appear to have been
  a need for it.

o Since the reformat of the disk I haven't had any problems at all with the
  drive. Some people suggested checking the cabling. I was reasonably sure
  that wasn't it, as there are 4 disks on one controller and only 1 was
  having problems. Had the cable been jarred (it was disk 0), I suspect the
  errors would have propagated to the other drives as well.

o On a psychotic note: I upgraded to 4.1.1 since the system was down
  anyway. I really ought to have my head examined for that decision.

o Another note to all Sun managers out there: if you are currently using
  Exabytes to do these nice big backups, make sure you have your Exabyte
  targeted where a generic Sun kernel can find it. This was a major pain,
  as I had to get enough of a system up that I could build a new kernel
  before I could even start reloading from my dump tapes. This was part
  of my decision to upgrade to 4.1.1, as I had to load a lot anyway even
  had I stayed at 4.1. (The statement above still holds, though.)

o I think my plan will be to buy one hot spare (~$8000, with a 5-year
  warranty) and keep it around; then if a drive dies I can send it back to
  be rebuilt.
  curt@ecn.purdue.edu (Curt Freeland) said: "When I sent my dead drive to
  Seagate for repairs, they replaced the electronics, and HDA. I basically
  got a new drive for the price of a $2k rebuild."

o Of course, when I finally got my disk back up at 0430 the day after it
  crashed (10 hours after I started the level 0 dump mentioned above), I had
  Stale NFS File Handles all over the network (I think I still have some
  residual effects from this). This caused extreme whining from my user
  types.

  :-) However having not shaved for the past 3 days and being sufficiently
  :-) grumpy and tired makes people not want to talk to me anyway.

o On another note, I noticed that all the partitions on my id000 are now at
  < 0.2% fragmentation, whereas some of my other drives are at > 2.0%
  fragmentation. Does anyone have any thoughts on how often a level 0 dump,
  reformat, and reload would be useful?

Thanks to all who helped:
Ken Rossman <ken@shibuya.cc.columbia.edu>
alexl@daemon.cna.tek.com (alex;923-4483)
beauchem@DMI.USherb.CA (Denis Beauchemin)
curt@ecn.purdue.edu (Curt Freeland)
evans@c4west.eds.com (Bill Evans)
keith@odi.com
kevins@Aus.Sun.COM (Kevin Sheehan {Consulting Poster Child})
lars@cmc.com (Lars Poulsen)
liz@isis.cgd.ucar.EDU (Liz Coolbaugh)
mcostel@kaman.com (Mark Costello)
peb@ueci.com (Paul Begley)
rr6204 <rr6204@moses.boeing.com>
rwolf@dretor.dciem.dnd.ca (Robert J Wolf)
selig@centaur.msfc.nasa.gov (Bill Selig - SysAdmin)

Frank P. Bresz |PCD Simulators Department, Westinghouse Electric Corporation
fpb@ittc.wec.com|My opinions are mine, WEC pays big money for official opinions
uunet!ittc!fpb |1 problem with X is that many X functions have more
+1 412 733 6749 |arguments than an Ivy-League Debating Team.


