Original messages below.

My problem has been solved by an unbelievable effort on the part of Sun. The root of the problem is that, either during transport or upon powering up the A1000 after the move, the controller blew and spewed some data onto the disks, corrupting the RDAC information. I spent many hours on the phone with Sun, and every attempt to revive the LUN failed. In the end I met with three very competent SSEs, and they took the A1000 to their offices. There they rounded up a team of engineers and were able to modify the firmware in the A1000 so that recreating the LUNs did not overwrite the data with zeros. While I lost my RAID-0 LUN1, Sun was able to recreate the entire RAID-5 LUN0 and I lost no data!

While my data was visible to Solaris, RM6.22 did not see the array. To fix this, I followed the solution provided by Sun:

You will need to remove the RDAC logical devices (c#t#d#) as seen by Solaris and Raid Manager in order to recreate the logical device controller #s. This procedure can also be used to sync the controller #s between format and lad if the c#'s don't match.

To sync up the c#'s in lad and format with RM6.2.2, and to set the c#'s back to an acceptable value:

    cd /dev/dsk
    rm <c#'s for the A1000 devices>
    cd /dev/rdsk
    rm <c#'s for the A1000 devices>
    cd /dev/osa/dev/dsk
    rm <c#'s for the A1000 devices>
    cd /dev/osa/dev/rdsk
    rm <c#'s for the A1000 devices>

    (Run rdac_disks to remove all rdac devices from format)
    /usr/lib/osa/bin/rdac_disks

    (Run hot_add to recreate the proper rdac device controller #s for format,
    lad, /dev/(r)dsk and /dev/osa/dev/(r)dsk instantly, with no need to
    reboot or boot -r)
    /usr/lib/osa/bin/hot_add

(A worked example of this sequence, using a made-up controller number, appears at the end of this message, after the original posts.)

Note: It is also possible that after a "boot -r" the RDAC devices might not show up in format at all. Simply follow the same guidelines as above to recreate the RDAC devices and sync up Solaris with Raid Manager.

While tempting, do not try to run devfsadm to create the links in place of hot_add, because it will create a Solaris path such as /sbus@3,0/QLGC,isp@3... as opposed to the correct /pseudo/rdnexus@2,0.. path that is required for the device to be properly addressed.

Pravin Nair sent me a similar procedure which requires a 'boot -r' to take effect. The hot_add command is a great way to avoid rebooting.

I would like to thank Pravin Nair, Jed Dobson, Christian Nicca, Tom Chipman, Tony Walsh, and Patricio Mora for their suggestions.

---------------------------

Part 1:

Gurus,

I have been charged with moving cages within our colo provider this weekend, and I am having problems bringing up my A1000 after moving it. The A1000 is a 12-bay model with ten 36.4GB disks and two 18GB disks installed. It is connected via SCSI to an E220R. When the device is powered up now, the LEDs all light up correctly (all 12 are green), but the four LEDs for drives 0-3, 0-4, 1-3 and 1-4 switch to amber and the service LED turns on. Coincidentally (I really hope), those four disks were installed in the array Thursday morning, and the array was rebuilt shortly after. I was able to copy my data from backup to the new array and had been running for more than a day prior to the move. The array contains a 10-disk RAID-5 array and a 2-disk RAID-0 array. arraymon claims that no RAID modules were found, and RaidManager 6.22 reports that the controller has failed. I am assuming that this is referring to the RAID module, and it would make sense because probe-scsi-all returns only the two internal disks and the DVD drive.
Can anyone shed some light on the subject? Perhaps an explanation for why those four LEDs are on (is it a POST message? I am investigating that now). I really need to get either of the RAID devices running, and this is rather urgent because of the nature of the problem. Thanks, and I will summarize.

Part 2:

Gurus,

It turns out that my A1000 controller had failed. Sun replaced the controller, and Solaris 8 now sees the array. The new problem, and a very scary one, is that my LUNs are not configured/identified correctly. I had two LUNs in the array: LUN0 is a RAID-5 device with 9 disks, and LUN1 is a RAID-0 device with two disks. Now RaidManager is reporting that LUN0 is dead, and it does not even see LUN1. The two disks in LUN1 are recognized as unassigned and optimal. Is there a way to force RaidManager to reload the configuration of LUN1? Rebuilding will overwrite the data, correct? Ahh! Murphy strikes again. I would appreciate any help possible. I will combine the two messages into one summary.

Thanks,
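For anyone who needs to run the cleanup procedure above, here is a rough sketch of what the command sequence looks like. The controller number c1 is made up for illustration, as is the assumption that c1 is used only by the A1000; substitute the c#'s that actually belong to your array. The paths for rdac_disks and hot_add come from the procedure above; the location of lad in the same directory, and the final verification step, are my own additions.

    # Remove the Solaris and Raid Manager device nodes for the A1000.
    # c1 is a hypothetical controller number assumed to be dedicated to
    # the A1000; use the c#'s for your own array.
    rm /dev/dsk/c1t*
    rm /dev/rdsk/c1t*
    rm /dev/osa/dev/dsk/c1t*
    rm /dev/osa/dev/rdsk/c1t*

    # Remove all rdac devices from format
    /usr/lib/osa/bin/rdac_disks

    # Recreate the rdac device nodes with the proper controller numbers
    # (takes effect immediately; no reboot or boot -r needed)
    /usr/lib/osa/bin/hot_add

    # Verify that lad and format now agree on the controller numbers
    /usr/lib/osa/bin/lad
    format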