Many thanks to all who replied. This story has a happy ending.

When an array loses power but the server does not, every disk that the system attempts to read will fail. In a mirror, it is therefore possible to have multiple read failures. For each failure that is not fatal (i.e. the server still thinks there is an available mirror or slice), the server marks the disk as being in 'maintenance' mode. When a read from the last available slice also fails, that disk is placed into 'last erred' mode.

In a RAID 5 system, only 1 disk will enter 'maintenance' mode. The next failed read places the failed disk into 'last erred' mode, and the entire metadevice is taken offline. Further attempts at reads result in I/O errors.

As dersmythe pointed out, the Disksuite user manual makes reference to a power failure like this. The procedure is to use the metareplace command with the -e switch on the disk in 'maintenance' mode. An example is:

    metareplace -e dx cxtxdxsx

(replacing the x's with the metadevice and slice that have failed). It is important to use this command first on the 'maintenance' disk before attempting to enable the 'last erred' disk.

I personally ran into a problem with this method: execution of metareplace -e failed and reported that I must use the -f (force) switch. Feeling uncomfortable with proceeding, I held off to do more research.

A SECOND method of recovery is available as well: it is possible to CLEAR the metadevice with the metaclear command and rebuild it, as long as the slices/disks are ordered just as they were prior to the failure. The order is revealed by the metastat -p command. The manual recommends using this method only when the 1st method is unsuccessful. According to the manual, all metadevices (mirrors, raids, concats, etc) can be rebuilt in this fashion.

There are valuable documents on docs.sun.com, as well as an article on sunsolve.sun.com referring to the second method of recovery.
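The ordering rule for the first method ('maintenance' disks first, then 'last erred' disks) can be sketched as a small dry-run helper. This is only an illustration under stated assumptions: it prints the metareplace -e commands rather than running them, the plan_reenable function name and the d10 metadevice name are made up for the example, and the two-column "slice state" input is a simplified stand-in for metastat output, not its real format.

```shell
#!/bin/sh
# Dry-run sketch: print metareplace -e commands in the recommended order.
# Nothing is executed against the volume manager.
#
# Argument: metadevice name (d10 below is a hypothetical placeholder).
# Stdin:    lines of "<slice> <state>", a simplified, assumed input format,
#           e.g. "c3t3d0s0 Maintenance" or "c4t3d0s0 last erred".
plan_reenable() {
    awk -v md="$1" '
        {
            # Rebuild the state text in lower case so that "Last Erred"
            # and "last erred" are treated the same.
            state = tolower($2)
            for (i = 3; i <= NF; i++) state = state " " tolower($i)
        }
        state == "maintenance" { maint[++m] = $1 }
        state == "last erred"  { erred[++e] = $1 }
        END {
            # Components in maintenance come first, last erred ones after.
            for (i = 1; i <= m; i++) print "metareplace -e " md " " maint[i]
            for (i = 1; i <= e; i++) print "metareplace -e " md " " erred[i]
        }
    '
}

# Example using some of the component states from the original question.
plan_reenable d10 <<'EOF'
c3t1d0s0 okay
c3t3d0s0 Maintenance
c3t4d0s0 okay
c4t3d0s0 last erred
c4t4d0s0 okay
EOF
```

The printed commands are meant to be reviewed and then run by hand, one at a time, checking metastat after each. Saving the output of metastat -p beforehand also keeps a record of the slice order should the second method be needed.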
Sunsolve requires a subscription.

As it happens, I employed neither of these procedures (unintentionally). In my case, a coincidental memory error caused a panic (I'm wondering if this was due to the original power outage, as this machine is quite stable). Upon reboot, the metadevice was online and the 'last erred' disk was cleared. I used the metareplace -e command to clear the 'maintenance' disk and all was well.

Thanks again to Prabir Sarkar, dersmythe, Michael T Pins, and Damian Wiest

-dave

---------- Forwarded message ----------
From: David Graves <dsgraves@gmail.com>
Date: Feb 21, 2006 10:11 PM
Subject: D1000 power failure with Disksuite: how to restore to running state?
To: sunmanagers@sunmanagers.org

I have an Ultra 30 connected to a D1000 with 2 controllers (the D1000 is split in 2, 6 disks to a controller). All the disks are configured as one RAID 5 metadevice. I experienced what appears to be a power glitch: enough to power down the D1000, but not the server.

This is my guess at what happened next: the server tried to write to the first disk and, being unable to, marked it 'maintenance'. The next disk write produced a 'last erred' error and took the metadevice (RAID 5) offline.

I powered the D1000 back up and it came back to life. My _guess_ is that the data is intact. A metadb shows the following:

    flags         first blk    block cnt
    Wm  pc l      16           1034         /dev/dsk/c3t4d0s0
    W   pc l      16           1034         /dev/dsk/c4t1d0s0
    a   pc luo    16           1034         /dev/dsk/c0t1d0s7
    a   pc luo    1050         1034         /dev/dsk/c0d0s7
    a   pc luo    2084         1034         /dev/dsk/c0t1d0s7

and a metastat shows:

    c3t1d0s0    okay
    c3t3d0s0    Maintenance
    c3t4d0s0    okay
    cdt5d0s0    okay
    c4t0d0s0    okay
    c4t1d0s0    okay
    c4t3d0s0    last erred
    c4t4d0s0    okay

QUESTION: While I have backups, they're older than I'd like them to be, and I have reason to believe that the only thing wrong here is that the D1000 powered down, and that the data on the array is good. WHAT is the best way to attempt to recover? Is using the metainit -k command a safe way to proceed?
Is that the best way to proceed?

Will post all answers in summary.

TIA
-dave
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Mon Feb 27 22:20:48 2006