Well, the problem is not resolved, but I'll summarize anyway.

This is a panic related to mutual exclusion locks (mutexes), which are common locking devices used in kernel code (i.e. the kernel and any drivers/kernel modules loaded into it).

Several people suggested looking at the stack traceback:

    cd /var/crash/`hostname`
    adb -k ./unix.x ./vmcore.x   - where x = number of the dump created after the above panic
    $c                           - a stack traceback should appear
    control-d to exit

Unfortunately, in our case the machine hung hard, so no crash dumps were generated. The traceback on the console was apparently also incomplete. Bad luck...

Justin.Stringfellow@Sun.COM suggested putting:

    set snooping=1

in /etc/system. This enables a timer ("the deadman timer") in the kernel which, if you have a hard hung kernel, _may_ allow the kernel to drop itself out to an OK prompt, where you can then type 'sync' and get a crash dump. No guarantees though. We'll try this and see what happens. He also suggested disconnecting the keyboard to see if that might get me an OK prompt.

Some people suggested this could be related to a CPU hardware problem, or perhaps a software problem.

I guess we still have some detective work to do. So far we have been unable to recreate the problem on demand...

Thanks for your help and suggestions
David

> Hi,
> We have an Ultra 10 running the latest Solaris 8 patches.
> It occasionally panics with error:
>
> panic[cpu0]/thread=40037e60 recursive mutex_error Lp=70357f40 owner=40037e60
> 40037a78 unix:mutex_vector_error+208 (0, 0, 20, 1040d45c, 104169e8, 70357f40)
>
> Only one of our Ultra 10's seems to be affected by this.
>
> Sunsolve search turned up nothing useful.
>
> Any ideas what the problem might be and how to fix or diagnose it?
>
> Thanks
> David

Received on Fri Jul 6 14:27:05 2001
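
For anyone unfamiliar with the panic message: it shows the same value for thread= and owner= (40037e60), which is consistent with a thread trying to take a mutex it already holds. The following is only a user-space sketch of that condition in Python, not kernel code. The class and function names (OwnerCheckedMutex, RecursiveMutexError, demo) are made up for illustration, and it assumes "recursive" in the message means a second enter by the owning thread.

```python
import threading


class RecursiveMutexError(RuntimeError):
    """Raised when the owning thread tries to take the lock a second time."""


class OwnerCheckedMutex:
    """Toy non-recursive lock that remembers its owner (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None  # thread ident of the current holder, if any

    def enter(self):
        me = threading.get_ident()
        if self._owner == me:
            # The caller already holds the lock, so blocking here would
            # never return. This toy raises; per the report above, the
            # kernel panics in the analogous situation.
            raise RecursiveMutexError(f"recursive enter, owner={me:#x}")
        self._lock.acquire()
        self._owner = me

    def exit(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("mutex released by a thread that does not own it")
        self._owner = None
        self._lock.release()


def demo():
    """Take the lock twice from one thread and report what happens."""
    m = OwnerCheckedMutex()
    m.enter()
    try:
        m.enter()  # second enter by the thread that already owns the lock
    except RecursiveMutexError:
        return "recursive enter detected"
    finally:
        m.exit()
    return "no error"
```

Calling demo() takes the lock once, attempts a second enter from the same thread, and reports that the second attempt was rejected. In a driver or kernel module the equivalent mistake is usually a code path that re-enters a routine while still holding the lock, which is why a full stack traceback is the first thing people ask for.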