G'day folks,

Firstly, here's a snippet from my original posting (a few weeks back now,
but I wanted to wait and see what the final outcome was):

>A client (and, as you might guess because of the local time I'm posting
>this, a mate) is having severe problems with a Solaris 8 box, refusing to
>boot. He's getting a whole series of (for example)
>"/kernel/misc/sparcv9/md_raid: undefined symbol md_unit_incopen" errors
>(the errors are reported for each of the forceload'd "md_*" modules, with
>many symbols listed for each). The machine doesn't even successfully reach
>single-user mode (the password prompt is displayed but the machine locks at
>this point).
>
> ...[text describing attempts to recover deleted]...
>
>I don't have access to his boxes and I doubt I can solve his problem (I've
>never seen anything like this, before) if I did. From what I have been told,
>his /etc/system file contains forceloads only for (forgive the "shell short
>cuts"):
>    misc/md_{hotspares,mirror,raid,sp,stripe,trans}
>    drv/{dad,isp,pci_pci,pcipsy,sd,simba,uata}
>
>The machine in question is an Ultra 10, with two internal IDE drives (one
>of which appears to be severely dying, which is what led to all these
>issues), an internal CD-ROM, and 5 (or 6 - he couldn't tell me which) SCSI
>drives (in one of Sun's external enclosures).

Well, the machine has now been running for a couple of weeks, without any
mirrors. What's really strange is that the sub-mirror (the OS still insists
on using a metadevice-based filesystem - and we're not willing to experiment
any more) in use is on the drive that was generating "bad-block" messages
(can you say "developing media faults"? <frown>). The organisation in
question have (finally, remember this is an *OLD* Ultra 10) decided to
replace the machine with a Linux setup.

As for the RAID5 array, no recovery was required as the md replicas were
fine.
And, as was pointed out by Kev Smith, John Hudson and Dave Dunaway (thanks
to each of you, BTW), the RAID's configuration information may be available
in md.tab (in this case, it was). If not, go looking for the setup docs for
the machine in question - they should contain the command line arguments
passed to metainit (they did - I set this box up for them a few years ago
<grin>).

So, to sum up, the machine is running but the underlying problem was
sidestepped rather than fixed.

To "complete" this summary, let me say that we (this time I was directly
involved) did try to convince the system that there was no SVM involvement
by commenting out all of the meta-related information from /etc/system on
one of the mirrors (the one we had already screwed up - we weren't willing
to "break" the other one as well). All to no avail ... and that is
something I really wish I could explain. But it got to the point where it
was going to cost more for the "after-hours emergency support" than it
would to simply replace the box in question.

Again, thanks to the gentlemen mentioned above. Ciao.

-------------------------------------------------------+---------------------
Daniel Baldoni BAppSc, PGradDipCompSci                 | Technical Director
require 'std/disclaimer.pl'                            | LcdS Pty. Ltd.
-------------------------------------------------------+ 856B Canning Hwy
Phone/FAX: +61-8-9364-8171                             | Applecross
Mobile:    041-888-9794                                | WA 6153
URL:       http://www.lcds.com.au/                     | Australia
-------------------------------------------------------+---------------------
"Any time there's something so ridiculous that no rational systems programmer
 would even consider trying it, they send for me."; paraphrased from "King Of
 The Murgos" by David Eddings. (I'm not good, just crazy)
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Sun Oct 2 08:53:02 2005