Well, I still have not isolated the cause of the crash. Thanks, though, to the following for responding:

Kevin Buterbaugh: advised me to be sure that the latest kernel version (108528-07) was installed (it was).

Joseph Herpers: informed me of "Sleuth", proprietary software that may help detect intermittent crashes and problems.

And especially "Sun Man" Andy Townsend, who warned that some Ultra 5s shipped in late 2000 with bad CPUs (serial numbers start with FW04011x). My system was obtained earlier in 2000 and has a different serial number sequence.

Since each crash has occurred while I was running IP Filter (3.4.10 up through the current 3.4.17) with my state table filled to capacity, I suspect that may have contributed to the crash. I have made some changes that will prevent the state table from filling up. I have also recompiled IP Filter with a new version of gcc that was built on a Solaris 8 platform; previous versions were built with a gcc built on Solaris 2.6. I have read, though, that gcc 2.8 and later no longer needs to be rebuilt for a new OS version.

I plan to "stress test" the CPU next by running some complex numerical models on it.

--Kevin

______________________________________________________________________
Kevin Tyle, Systems Administrator        **********************
Dept. of Earth & Atmospheric Sciences    ktyle@atmos.albany.edu
University at Albany, ES-235             518-442-4571 (voice)
1400 Washington Avenue                   518-442-5825 (fax)
Albany, NY 12222                         **********************
______________________________________________________________________

Original Post:

Hi all,

I am trying to isolate the cause of periodic crashes on one of our Ultra 5 machines (333 MHz).
The latest instance:

May 9 11:04:23 fire ^Mpanic[cpu0]/thread=4005fe60:
May 9 11:04:23 fire unix: [ID 340138 kern.notice] BAD TRAP: type=31 rp=4005fab8 addr=10 mmu_fsr=0 occurred in module "arp" due to a NULL pointer dereference
May 9 11:04:23 fire unix: [ID 100000 kern.notice]
May 9 11:04:23 fire unix: [ID 839527 kern.notice] sched:
May 9 11:04:23 fire unix: [ID 520581 kern.notice] trap type = 0x31
May 9 11:04:23 fire unix: [ID 381800 kern.notice] addr=0x10

Other instances have had the same "BAD TRAP type", although the module is not always "arp"; it has been "ipf" too (we are running IP Filter 3.4.17).

The machine is only in testing, and it is serving only a couple of machines as a firewall. It has two NICs.

Analysis of the crash dump with adb is not too helpful, for me at least:

echo '$c' | adb -k unix.3 vmcore.3
physmem 79fa
panicsys(0x104166e0,0x4005f900,0x1004e48c,0x70002000,0x0,0x10410db8) + 44
vpanic(0x1004e48c,0x4005f900,0x23,0x8,0x8,0x8) + cc
panic(0x1004e48c,0x31,0x4005fab8,0x10,0x0,0x703f1c80) + 1c
die(0x1040c9ac,0x4005fab8,0x10,0x0,0x4005fab8,0x31) + 80
trap(0x0,0x1,0x0,0x104169f0,0x5,0x0) + 8a0
sfmmu_tsb_miss(0x1041b214,0x0,0x0,0x70057fb8,0x70057fb8,0x0) + 5fc
prom_rtt(0x10,0x0,0x0,0x7007e260,0x704f994c,0x0)
ar_ce_walk(0x1045e464,0x1045e200,0x1023b900,0x0,0x0,0x104118c8) + 3c
ar_wsrv(0x1045e000,0x7007e260,0x1045da30,0x704f994c,0x0,0x70465ce0) + 88
runservice(0x2000,0x2200,0x20000,0x1043b43c,0x704f9980,0x704f994c) + 3c
background(0x704f994c,0x0,0x10000,0x104169f0,0x0,0x0) + d4
thread_start(0x0,0x0,0x0,0x0,0x0,0x0) + 4

Is there any way to tell if this is a hardware problem or a problem with IP Filter?

Thanks.

Received on Wed May 16 15:06:24 2001
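The original post notes that the panicking module varies between crashes ("arp" in the instance shown, "ipf" in others). One way to compare instances is to tally the "BAD TRAP" lines in the system log by trap type and module. The following is only a minimal sketch, not something from the thread: the function name tally_bad_traps is hypothetical, and the pattern assumes every panic line has the same layout as the one quoted above.

```python
import re
from collections import Counter

# Assumed layout, taken from the single log line quoted in the post:
#   ... BAD TRAP: type=31 ... occurred in module "arp" due to ...
BAD_TRAP = re.compile(
    r'BAD TRAP: type=(?P<type>[0-9a-fA-F]+).*?'
    r'occurred in module "(?P<module>[^"]+)"'
)


def tally_bad_traps(lines):
    """Count BAD TRAP panics per (trap type, module) pair.

    `lines` is any iterable of log lines, for example an open file.
    Lines that do not match the assumed layout are ignored.
    """
    counts = Counter()
    for line in lines:
        match = BAD_TRAP.search(line)
        if match:
            counts[(match.group("type"), match.group("module"))] += 1
    return counts
```

It could be run against a saved copy of the messages file, for example `tally_bad_traps(open("messages.copy", errors="replace"))`, where the file name is a placeholder. A tally concentrated in one networking module would point more toward software, while traps scattered across unrelated modules would be more consistent with a hardware fault, though neither pattern is conclusive on its own.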