Hi,

I asked:

> Yesterday I had a disk fail on a SunFire 280R Solaris 8.
> The questions first:
> Can the experts confirm my opinion that this is a hardware problem -
> disk failure?

Thanks to all those who gave help (Roger Kynaston, Grant Lowe, Michael
Grice, Brad Morrison and Abhijit Das).

The consensus was that this was indeed a hardware disk failure, not a
controller failure, as that would have affected the other drive too.
This was confirmed using iostat, which showed multiple hard errors, and
also by smartd, which was unable to register the bad disk:

Device: /dev/rdsk/c1t1d0s0, failed Test Unit Ready [err=-5]

but did register the good one. smartd is part of the smartmontools
package from SourceForge and, if I had had it installed, it might have
given me advance warning of the failure.

I also asked:

> Have you any suggestions for recovering data from this drive (I do
> have backups, but I would still lose some important data)?

Suggestions were that I might be able to revive it temporarily by
slapping it and/or putting it in the fridge for a couple of hours.
Short of that, the choice is to forget it or to pay for an expensive
data retrieval service. I can confirm that both methods have worked for
me with PC disks in the past. Warning - don't try the fridge trick in a
humid atmosphere, or condensation can cause worse problems (if
possible!).

I tried both methods, but no luck. I have ordered a new drive and will
recover what I can from backups. In addition I will probably install
smartd on this and other servers.

Thanks again
Richard Butler

Details of the original question:

>
> Symptoms:
> This machine had two 72G disks (not mirrored) and during reboot after
> installing the latest recommended patches I get the warning:
> ...
> Jun 14 12:06:30 ed pcisch: [ID 370704 kern.info] PCI-device: SUNW,qlc@4, qlc0
> Jun 14 12:06:30 ed genunix: [ID 936769 kern.info] qlc0 is /pci@8,600000/SUNW,qlc@4
> Jun 14 12:06:30 ed genunix: [ID 936769 kern.info] fp0 is /pci@8,600000/SUNW,qlc@4/fp@0,0
> Jun 14 12:06:31 ed genunix: [ID 405830 kern.warning] WARNING: Device ssd0 failed to power up.
> Jun 14 12:06:32 ed genunix: [ID 749148 kern.warning] WARNING: Please see your system administrator or reboot.
> Jun 14 12:06:32 ed scsi: [ID 799468 kern.info] ssd0 at fp0: name w21000004cf8e7591,0, bus address e8
> Jun 14 12:06:32 ed genunix: [ID 936769 kern.info] ssd0 is /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8e7591,0
> Jun 14 12:06:32 ed scsi: [ID 365881 kern.info] Vendor 'SEAGATE', product 'ST373405FSUN72G', (unknown capacity)
> Jun 14 12:06:32 ed genunix: [ID 408114 kern.info] /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8e7591,0 (ssd0) online
> Jun 14 12:06:32 ed scsi: [ID 799468 kern.info] ssd1 at fp0: name w21000004cf8e7555,0, bus address ef
> Jun 14 12:06:32 ed genunix: [ID 936769 kern.info] ssd1 is /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8e7555,0
> Jun 14 12:06:32 ed scsi: [ID 365881 kern.info] <SUN72G cyl 14087 alt 2 hd 24 sec 424>
> Jun 14 12:06:32 ed genunix: [ID 408114 kern.info] /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8e7555,0 (ssd1) online
> Jun 14 12:06:32 ed swapgeneric: [ID 308332 kern.info] root on /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w21000004cf8e7555,0:a fstype ufs
> ...
> And of course all the filesystems on this disk failed to fsck or
> mount.
>
> Using format I can see the bad disk as c1t1d0 (although searching
> for disks... seems to take longer than normal)
>        0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
>           /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8e7555,0
>        1. c1t1d0 <drive type unknown>
>           /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8e7591,0
>
> If I go ahead and give the correct geometry I can then see the
> partition table as I had it before the crash.
> Part      Tag    Flag     Cylinders         Size            Blocks
>   0       root    wm       0 -   412        2.00GB    (413/0/0)      4202688
>   1       swap    wu     413 -  1237        4.00GB    (825/0/0)      8395200
>   2     backup    wm       0 - 14086       68.35GB    (14087/0/0)  143349312
>   3 unassigned    wm    1238 -  1240       14.91MB    (3/0/0)          30528
>   4        var    wm    1241 -  2065        4.00GB    (825/0/0)      8395200
>   5 unassigned    wm    2066 -  6187       20.00GB    (4122/0/0)    41945472
>   6        usr    wm    6188 -  8248       10.00GB    (2061/0/0)    20972736
>   7       home    wm    8249 - 14086       28.33GB    (5838/0/0)    59407488
>
> I cannot however mount any partition:
> mount /dev/dsk/c1t1d0s5 /mnt
> mount: I/O error
> mount: cannot mount /dev/dsk/c1t1d0s5
> _______________________________________________
> sunmanagers mailing list
> sunmanagers@sunmanagers.org
> http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Mon Jun 18 13:06:18 2007
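
As a cross-check on the re-entered geometry, the block counts in the
quoted partition table can be recomputed from the cylinder ranges and
the 24 heads and 424 sectors per track that format reports for the
SUN72G drive. The Python sketch below is not from the original thread;
the function names are hypothetical and the 512-byte block size is an
assumption used only to convert blocks to a size.

```python
# Geometry taken from the format output quoted above:
# <SUN72G cyl 14087 alt 2 hd 24 sec 424>
HEADS = 24
SECTORS_PER_TRACK = 424
BYTES_PER_BLOCK = 512  # assumption: 512-byte disk blocks

# (slice, first cylinder, last cylinder, listed block count),
# copied from the partition table in the original question.
PARTITIONS = [
    (0, 0, 412, 4202688),
    (1, 413, 1237, 8395200),
    (2, 0, 14086, 143349312),
    (3, 1238, 1240, 30528),
    (4, 1241, 2065, 8395200),
    (5, 2066, 6187, 41945472),
    (6, 6188, 8248, 20972736),
    (7, 8249, 14086, 59407488),
]


def blocks_for(first_cyl, last_cyl):
    """Blocks covered by an inclusive cylinder range at this geometry."""
    cylinders = last_cyl - first_cyl + 1
    return cylinders * HEADS * SECTORS_PER_TRACK


def check_table(partitions):
    """Return the slice numbers whose listed block count does not match."""
    return [
        slice_no
        for slice_no, first, last, listed in partitions
        if blocks_for(first, last) != listed
    ]


if __name__ == "__main__":
    for slice_no, first, last, listed in PARTITIONS:
        computed = blocks_for(first, last)
        size_gb = computed * BYTES_PER_BLOCK / 2**30
        status = "ok" if computed == listed else "MISMATCH"
        print(f"s{slice_no}: {computed} blocks ({size_gb:.2f} GB) {status}")
```

If every slice matches, the table is at least internally consistent
with the geometry that was typed in; it says nothing about whether the
data underneath is still readable.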