Sorry about the delay in posting a summary. Thanks to those who replied.

I should have mentioned that the filesystem was UFS; however, I think the underlying problem is independent of the filesystem.

Two respondents have had very similar experiences, both with LSI chipset RAID controllers, one with X4150s. Under heavy load, caused either by heavy I/O or by cable/disk problems causing a lot of retries, the controllers just give up. The solution is to ensure cables and disks are good, replacing disks when they start to show errors rather than waiting for them to fail, and if necessary to add another RAID controller to spread the load.

Original post below, thanks for all your help (oh, and btw we're going to add a sleep in the loop to give everything a chance to catch its breath).

John

-- original post --

We have 2 X4150s that act as NFS file servers. Each has 8GB memory and 6 140GB disks configured as RAID10 with an internal Intel SAS RAID controller. OS is Solaris 10u8.

Under "normal" operation everything seems fine; it supports ~100 attached NFS clients running Eclipse. Last Friday the user space reached 100%. Everything was OK until we tried to delete some expired user accounts. This is done with a script that, in effect, does:

    for i in 1 to 30; do
        /bin/rm -rf /export/home/user$i
    done

When lightly loaded this script works fine; however, on this occasion, when the system was working fairly hard, everything stopped after the first couple of accounts had been deleted. Disk lights stopped flashing, existing NFS connections stopped working, and you could log in but never got a prompt. It required a power cycle to recover.

This is all indicative of losing connection to the disk in some form. There is nothing in any log to indicate a problem.

Has anyone come across anything similar, or would anyone like to guess what may be happening?
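
For anyone wanting to try the same workaround, the "sleep in the loop" idea mentioned in the summary might look roughly like the sketch below. This is only an illustration, not the actual script: the function name purge_accounts, its three arguments, and the 30-second pause in the usage comment are all assumptions made for the example. It sticks to expr and a while loop so that it should also run under an older Bourne shell.

```shell
#!/bin/sh
# Hypothetical sketch: remove user1..userN under a base directory,
# sleeping between accounts so the disks and controller can catch up.
# purge_accounts and its arguments are illustrative names, not from the original script.
purge_accounts() {
    base=$1     # directory holding the accounts, e.g. /export/home
    count=$2    # number of accounts to remove, e.g. 30
    pause=$3    # seconds to wait after each deletion (assumed value, tune to taste)
    i=1
    while [ "$i" -le "$count" ]; do
        /bin/rm -rf "$base/user$i"
        sleep "$pause"
        i=`expr "$i" + 1`
    done
}

# Example (left commented out so that sourcing this file deletes nothing):
# purge_accounts /export/home 30 30
```

How long to sleep is guesswork; the idea is simply to leave gaps in the delete traffic so that NFS clients and the controller are not starved while the accounts are removed.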
--
John Landamore
Department of Computer Science
University of Leicester
University Road, LEICESTER, LE1 7RH
J.Landamore@mcs.le.ac.uk
Phone: +44 (0)116 2523410
Fax: +44 (0)116 2523604
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Fri Nov 19 11:16:36 2010