Hi,

Sorry about the delay in summarizing, but I wanted to test the solution out first, and my first attempt to reply failed for some reason.

Problem: Attempting to use DiskSuite 4.2.1 to mirror /, /usr, swap, etc. on a two-disk system, with 3 state database replicas on each disk (6 total). The system remained up and running when I simulated a disk failure, but it was unable to reboot because exactly half of the database replicas were available, and booting requires a majority (half plus one).

Solution: (verified) Boot into single-user mode and use "metadb -d _slice_", where _slice_ is the location of the replicas on the failed disk. Use "metadb -i" if you need to figure out which disk failed or where the replicas on that disk are located. If the primary boot disk failed, you may need to specify the bootable slice of the other disk in the boot command. The database replicas can be deleted even though the disk has failed. Upon reboot you will have all remaining replicas available (the ones on the failed disk no longer count), so the system will boot normally (again, you may need to specify the alternate boot device). Upon replacing the failed drive, partition the new drive if needed, use "metadb -a" to add replicas on the new disk, and use metareplace to synchronize the new slices with the valid filesystems. Of course, if one can catch the system before it reboots, one can remove the replicas on the failed disk ahead of time and bypass the need to get into single-user mode. (Rough command sequences for all of this are sketched in the P.S. below.)

A useful reference on the subject is:
http://www.slacksite.com/solaris/disksuite/SDSrecovery.html

Alternate solution: (not verified) It was also suggested that one could boot into single-user mode and edit the entries in vfstab to use physical slice names rather than DiskSuite pseudo-device names. It was unclear whether the database replicas on the failed disk would also need to be removed. Basically, just unmirror everything and recreate the mirrors when a replacement disk arrives.

The general consensus (with which I agree, having resolved the rebooting issue) is that mirroring is better than doing a periodic copy via cron, etc. There is some added complexity in rebooting after a disk failure, but this is more than made up for by having up-to-date copies and by the system staying up through a single disk failure.

There was, however, one noticeable holdout from the above opinion, who strongly prefers daily dd copies via cron, arguing that that way he always has a bootable hard disk. If I understand his argument, he is worried about human error or unexpected results of system maintenance damaging /, etc. to the point that the system ceases to function, in which case he would restore from the daily backup disk. However, that is only useful if the problem is discovered before the next backup runs, which might not be the case if the maintenance work was deemed "too minor" to have such a disastrous result (i.e., the damage was done, but won't be noticed until the next reboot). For more "dangerous" system work the argument has merit, but then it is probably best to use metadetach to temporarily disable mirroring until the work is done and the system is deemed working, then re-attach the mirrors.

Many thanks to: Sean Berry, Darren Dunham, Rasal Kumerage, Gary Cook, J. D. Baldwin, John Phillips, Scott Kulp, Jonathon Andrews, Gabriel Rosenkoetter, Mike Tuupola, Tony Walsh, Jason Shackelford, and everyone else who replied.

Tom Payerle
Dept of Physics                         payerle@physics.umd.edu
University of Maryland                  (301) 405-6973
College Park, MD 20742-4111             Fax: (301) 314-9525
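P.S. For the archives, here are rough command sketches for each step. All disk, slice, and metadevice names below (c0t0d0, c0t1d0, s7, d0, d20, etc.) are illustrative, not necessarily what any particular system uses.

The replica layout described in the Problem section (3 per disk on a dedicated slice) is created with something like:

   # metadb -a -f -c 3 c0t0d0s7     (-f is needed on the very first invocation)
   # metadb -a -c 3 c0t1d0s7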
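To recover after the failure (assuming the dead disk is c0t1d0 and its replicas lived on s7), boot single-user and do roughly:

   # metadb -i                      (replicas with error flags, shown as capital
                                     status letters, are the dead ones)
   # metadb -d -f c0t1d0s7          (-f may be needed since the disk is
                                     unreachable)
   # reboot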
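If the failed disk was the primary boot disk, boot from the other disk at the OBP prompt, e.g.:

   ok boot disk1

("disk1" only works if your PROM already has a devalias for the second disk; otherwise give the full device path of its bootable slice, or set up an alias with nvalias beforehand.)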
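After the failed drive is replaced, the repartition/re-add/resync sequence looks roughly like:

   # prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2
                                    (copy the VTOC from the surviving disk)
   # metadb -a -c 3 c0t1d0s7        (recreate the replicas on the new disk)
   # metareplace -e d0 c0t1d0s0     (resync the slice in place; repeat for each
                                     mirror/slice pair)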
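The unverified alternate solution amounts to pointing vfstab back at the raw slices, e.g. changing

   /dev/md/dsk/d0       /dev/md/rdsk/d0      /    ufs    1    no    -

back to

   /dev/dsk/c0t0d0s0    /dev/rdsk/c0t0d0s0   /    ufs    1    no    -

For a mirrored root you would presumably also have to remove the "rootdev:/pseudo/md..." line from /etc/system before the change takes effect.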
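And breaking a mirror around risky maintenance, as suggested at the end, would look roughly like:

   # metadetach d0 d20              (detach the second submirror before the work)
   ... do the maintenance, verify the system still works ...
   # metattach d0 d20               (re-attach; DiskSuite resyncs d20 from the
                                     live submirror)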