I received much useful advice and information from Peter Bauer peter.bauer(a)itserv.de. I learned from Peter:

1) The following can be used to find the cause of the error in the messages log:

    grep -n maint /var/adm/messages*

The SCSI warning associated with the disk error should be just a few lines before each match. If you have the chance, go back in time by looking through /var/adm/messages* for a message like "WARNING: write error...". If your metadisk had an error during a read, your data is OK. If it was a write error, the original data was not successfully written to disk, so you have faulty data. Run a database consistency check.

2) Create a new submirror with

    metainit d28 1 1 <newdisk>
    metattach d8 d28

DON'T USE metattach -f (at this time). The disks should automatically synchronize [which they did in my case]. After the resync is done, you will have a so-called "two-way mirror".

3) Detach the defective mirror half (using metadetach dX dY) or force the resync of the last-erred disk (metareplace -e dX c1t2d3s4). [I detached the defective mirror.]

4) If possible, run an fsck on the metadisk after detaching the submirror that is in the last-erred state. If an error was copied over to the new submirror, fsck will find it. You might use format to do an analyze -> refresh and see whether your disk still works, but it would be better to replace the disk. That is usually cheaper than unintended data modification or loss of service.

5) Make sure you have enough state database replicas. They should be on at least three different disks.

6) Some of the error messages in /var/adm/messages* indicated a problem with the SCSI bus. Since all problems occurred on the same disk, it _might_ be a problem with the cabling. Check that all cables are fitted and secured. If possible, it might also be a good idea to remove the cables to that disk and re-install them.
If the disk is in an enclosure (UniPack, MultiPack), you might want to open it, remove the disk and re-install it. It could be an imperfect electrical connection.

Marian Russell

-----Original Message-----

I inherited responsibility for a Sun Blade 100 running Solaris 8 about a year ago. Someone had already set up mirroring for the / (d0), /var (d4), and /export (d7) filesystems and, additionally, had created names for the two large disks (d8 & d9, each RAID 1), but instead of making two sub-mirrors for each of these, each has only one sub-mirror (d18 & d19, each RAID 0) that is basically the size of the entire disk.

In the course of adding a new PCI SCSI card and disk pack to our system, I set up the two new disks in the same manner as d8 and d9, and noticed that d8 is showing Needs Maintenance and d18 is showing Last Erred.

This morning I took the system down to single-user mode, unmounted d8 and did an fsck. There were no errors. Then I ran metastat again and the status was still the same. Then I tried

    metareplace -e d8 /dev/dsk/c1t1d0s7

and got the error:

    attempt to replace a component on the last running submirror

So I have a lot of questions (if you can answer even one of these, I would be most grateful!):

1. Is the setup of the disks with only one sub-mirror component okay, with the top level being RAID 1 and the one sub-component RAID 0?

2. Since the two new disks have exactly the same configuration as the one with the errors, could I temporarily set up one of the new disks as a second sub-mirror, make it a mirror of d8, and then fix d8?

3. How do I fix d8? (This is our main applications system disk and we need it!)

4. Why doesn't fsck find any problems with it?

5. Is this a critical situation that needs to be remedied ASAP? The system may have been in this state for some time (weeks or even months), because until now I didn't know that this was something I should be checking on a regular basis.
Thanks in advance for any help, advice, info you have time to provide!

Marian Russell

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Wed Jun 2 05:06:39 2004
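For anyone who wants to rehearse the recovery sequence from the summary (steps 1 to 5) before touching a live system, here is a rough dry-run sketch in shell. The function names, the DRY_RUN switch and the variable names are made up for illustration. The metadevice names d8, d18 and d28 are the ones from this thread, and <newdisk> is left as a placeholder exactly as in Peter's advice. The fsck path /dev/md/rdsk/d8 and the metadb -i call are not from the thread; they are assumptions based on the usual Solaris Volume Manager conventions, so check them against your own system.

```shell
#!/bin/sh
# Dry-run sketch of the recovery sequence described in the summary.
# d8 (mirror), d18 (last-erred submirror) and d28 (new submirror) are the
# names used in this thread; NEW_SLICE is a placeholder, not a real device.

DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only, 0 = execute
MIRROR="${MIRROR:-d8}"
OLD_SUBMIRROR="${OLD_SUBMIRROR:-d18}"
NEW_SUBMIRROR="${NEW_SUBMIRROR:-d28}"
NEW_SLICE="${NEW_SLICE:-<newdisk>}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: list log lines mentioning "maint" or "write error", with line
# numbers. Pass the log files explicitly: scan_logs /var/adm/messages*
scan_logs() {
    grep -n -e maint -e 'write error' "$@"
}

# Step 2: build the new submirror and attach it (deliberately without -f).
attach_new_submirror() {
    run metainit "$NEW_SUBMIRROR" 1 1 "$NEW_SLICE"
    run metattach "$MIRROR" "$NEW_SUBMIRROR"
}

# Step 3: show the mirror status, then detach the defective half.
# Call this only once metastat reports that the resync has finished.
detach_old_submirror() {
    run metastat "$MIRROR"
    run metadetach "$MIRROR" "$OLD_SUBMIRROR"
}

# Steps 4 and 5: fsck the (unmounted) metadevice and list the state
# database replicas. Both command lines are assumptions, see above.
check_afterwards() {
    run fsck "/dev/md/rdsk/$MIRROR"
    run metadb -i
}

# In dry-run mode, print the whole plan. For a real run, call the functions
# one at a time and wait for the resync between attach and detach.
if [ "$DRY_RUN" = "1" ]; then
    attach_new_submirror
    detach_old_submirror
    check_afterwards
fi
```

With DRY_RUN left at its default of 1, the script only prints the commands it would run, which makes it easy to compare them against the steps in the summary. The attach and detach steps are kept in separate functions on purpose, because the detach must not happen until the resync is complete.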