Hi there.

Got to the bottom of it, and all the responses were good.

Replacing the power supply made the server a bit healthier. The boot disk was bad, so I relied on the mirror. The mirror wasn't as healthy as I'd hoped either, so I took it out, put it in a disk cage I had, and fsck'd the relevant slices. I then altered the vfstab etc. and was able to boot off this disk.

Back up and running.

Thanks again
Patrick

-----Original Message-----
From: sunmanagers-bounces@sunmanagers.org [mailto:sunmanagers-bounces@sunmanagers.org] On Behalf Of Roe, Patrick
Sent: 28 June 2006 09:16
To: sunmanagers@sunmanagers.org
Subject: Help Needed, server down - supply rail 4 FATAL FAULT: failed -shutdown req'd

Hi,

I've tried to get this on the mailing list before but it has not appeared. Can you let me know how to get this on, or even forward it on?

I wonder if anyone can help me with a major server problem. At present the server does not boot up properly at all. I had to reboot it a few times to get any info from ALOM; it is all below.

lom>showlogs
Eventlog:
+29d+0h33m49s supply rail 4 FAULT: failed
+29d+0h44m34s supply rail 4 FATAL FAULT: failed - shutdown req'd
+29d+0h44m34s Fault LED 3Hz
+29d+0h44m39s host power off
+34d+1h51m41s Fault LED OFF
+34d+1h56m12s host power on
+34d+1h56m14s host FAULT: unexpected power off
+34d+1h56m14s Fault LED ON
+0h0m0s LOM booted
+0h0m0s host power on
lom>environment
Fault OFF
Alarm1 OFF
Alarm2 OFF
Alarm3 ON
Fans:
1 fan1 OK speed 92%
2 fan2 OK speed 94%
3 cpu OK speed 100%
4 psu OK speed 100%
PSUs:
1 OK
Temperature sensors:
1 Enclosure 25degC OK
Overheat sensors:
1 CPU OK
Circuit breakers:
1 SCSI-Term OK
2 USB0 OK
3 USB1 OK
4 SCC OK
Supply rails:
1 5V OK
2 3V3 OK
3 +12V OK
4 -12V OK
5 CPU core OK
6 +3VSB OK
lom>
lom>
lom>
lom>break
Drive not ready
lom>
lom>
lom>
lom>break
-------------------------------------------------
lom>
lom>
lom>environment
Fault OFF
Alarm1 OFF
Alarm2 OFF
Alarm3 ON
Fans:
1 fan1 OK speed 96%
2 fan2 OK speed 97%
3 cpu OK speed 100%
4 psu OK speed 100%
PSUs:
1 OK
Temperature sensors:
1 Enclosure 26degC OK
Overheat sensors:
1 CPU OK
Circuit breakers:
1 SCSI-Term OK
2 USB0 OK
3 USB1 OK
4 SCC OK
Supply rails:
1 5V OK
2 3V3 OK
3 +12V OK
4 -12V OK
5 CPU core OK
6 +3VSB OK
lom>poweron
lom>poweroff
lom>
LOM event: +0h8m55s host power off
lom>bootmode reset_nvram
lom>poweron
lom>
LOM event: +0h9m33s host power on
Drive not ready
Setting NVRAM parameters to default values.

Sun Fire V120 (UltraSPARC-IIe 548MHz), No Keyboard
OpenBoot 4.0, 512 MB memory installed, Serial #51813843.
Ethernet address 0:3:ba:16:9d:d3, Host ID: 83169dd3.

bla bla bla - got to initialising memory and stops
------------------------------------------------------
PBM Diag Reg Test
CPU PBM Reg Test
All Basic APB Simba Tests
Init PCI
All Basic RIO Tests (RIO# 1)
RIO Ebus Config Space Reg Test
RIO Network Config Space Reg Test
RIO Firewire Config Space Reg Test
RIO USB Config Space Reg Test
All Basic RIO Tests (RIO# 2)
RIO Ebus Config Space Reg Test
RIO Network Config Space Reg Test
RIO Firewire Config Space Reg Test
RIO USB Config Space Reg Test
All Basic SCSI Controller Tests
Symbios SCSI Controller PCI Config Space Test
All Basic SCSI Controller Tests
Symbios SCSI Controller PCI Config Space Test
Basic SouthBridge Tests
Southbridge ISA Config Space Reg Test
Southbridge PMU Config Space Reg Test
Southbridge IDE Config Space Reg Test
Southbridge Audio Config Space Reg Test
All Memory Stress Tests
Consist Write Data Test
Resetting...
Processor Speed = 548 MHz
Baud rate is 9600
8 Data bits, 1 stop bits, no parity (configured from lom)
Firmware CORE Sun Microsystems, Inc.
@(#) core 1.0.12 2002/01/08 13:00
Software Power ON
Verifying NVRAM...Done
Bootmode is 0
[New I2C DIMM address]
MCR0 = 37b2c206
MCR1 = 80008000
MCR2 = cf10000f
MCR3 = a9000086
Ecache Size = 512 KB
Clearing E$ Tags Done
Clearing I/D TLBs Done
Probing memory Done
MEMBASE=0x0
MEMSIZE=0x20000000
Clearing memory...Done
Turning ON MMUs Done
Copy ROM to RAM (170040 bytes) Done
Orig PC=0x1fff0007e44 New PC=0xf0f07e9c
Processor Speed=548MHz
Looking for Dropin FVM ... found
Decompressing Client Done
Transferring control to Client...

ttya initialized
Reset Control: BXIR:0 BPOR:0 SXIR:0 SPOR:1 POR:0
Probing upa at 1f,0 pci pci pci
Probing upa at 0,0 SUNW,UltraSPARC-IIe SUNW,UltraSPARC-IIe (512 Kb)
Loading Support Packages: kbd-translator
Loading onboard drivers: ebus flashprom eeprom idprom SUNW,lomh
Probing /pci@1,1 Device 3 pmu i2c temperature dimm i2c-nvram idprom motherboard-fru fan-control lomp
Probing Memory Bank #0 512 Megabytes
Probing Memory Bank #1 0 Megabytes
Probing Memory Bank #2 0 Megabytes
Probing Memory Bank #3 0 Megabytes
Probing /pci@1,1 Device 7 isa power serial serial
Probing /pci@1,1 Device c network firewire usb
Probing /pci@1,1 Device 3
Probing /pci@1,1 Device d ide disk cdrom
Probing /pci@1,1 Device 5 pci108e,1100 network firewire usb
Probing /pci@1 Device 8 scsi disk tape scsi disk tape
Probing /pci@1 Device 5 Nothing there
Probing /pci@1 Device 6 Nothing there
Probing /pci@1 Device 7 Nothing there

Sun Fire V120 (UltraSPARC-IIe 548MHz), No Keyboard
OpenBoot 4.0, 512 MB memory installed, Serial #51813843.
Ethernet address 0:3:ba:16:9d:d3, Host ID: 83169dd3.

Environment monitoring: disabled

Boot device: net File and args:
Using Onboard Transceiver - Link Up.
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
ok boot disk
Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0 File and args

+++++++++then just hangs forever+++++++++++
dodgy disk?????????

...............last time

CPU PBM Reg Test
All Basic APB Simba Tests
Init PCI
All Basic RIO Tests (RIO# 1)
RIO Ebus Config Space Reg Test
RIO Network Config Space Reg Test
RIO Firewire Config Space Reg Test
RIO USB Config Space Reg Test
All Basic RIO Tests (RIO# 2)
RIO Ebus Config Space Reg Test
RIO Network Config Space Reg Test
RIO Firewire Config Space Reg Test
RIO USB Config Space Reg Test
All Basic SCSI Controller Tests
Symbios SCSI Controller PCI Config Space Test
All Basic SCSI Controller Tests
Symbios SCSI Controller PCI Config Space Test
Basic SouthBridge Tests
Southbridge ISA Config Space Reg Test
Southbridge PMU Config Space Reg Test
Southbridge IDE Config Space Reg Test
Southbridge Audio Config Space Reg Test
All Memory Stress Tests
Consist Write Data Test
Resetting...
Processor Speed = 548 MHz
Baud rate is 9600
8 Data bits, 1 stop bits, no parity (configured from lom)
Firmware CORE Sun Microsystems, Inc.
@(#) core 1.0.12 2002/01/08 13:00
Software Power ON
Verifying NVRAM...Done
Bootmode is 1
[New I2C DIMM address]
MCR0 = 37b2c206
MCR1 = 80008000
MCR2 = cf10000f
MCR3 = a9000086
Ecache Size = 512 KB
Clearing E$ Tags Done
Clearing I/D TLBs Done
Probing memory Done
MEMBASE=0x0
MEMSIZE=0x20000000
Clearing memory...Done
Turning ON MMUs Done
Copy ROM to RAM (170040 bytes) Done
Orig PC=0x1fff0007e44 New PC=0xf0f07e9c
Processor Speed=548MHz
Looking for Dropin FVM ... found
Decompressing Client Done
Transferring control to Client...

ttya initialized
Reset Control: BXIR:0 BPOR:0 SXIR:0 SPOR:1 POR:0
Probing upa at 1f,0 pci pci pci
Probing upa at 0,0 SUNW,UltraSPARC-IIe SUNW,UltraSPARC-IIe (512 Kb)
Loading Support Packages: kbd-translator
Loading onboard drivers: ebus flashprom eeprom idprom SUNW,lomh
Probing /pci@1,1 Device 3 pmu i2c temperature dimm i2c-nvram idprom motherboard-fru fan-control lomp
Probing Memory Bank #0 512 Megabytes
Probing Memory Bank #1 0 Megabytes
Probing Memory Bank #2 0 Megabytes
Probing Memory Bank #3 0 Megabytes
setting input and output device to ttya because of lom bootmode "-u".
input-device = ttya
output-device = ttya
Probing /pci@1,1 Device 7 isa power serial serial
Probing /pci@1,1 Device c network firewire usb
Probing /pci@1,1 Device 3
Probing /pci@1,1 Device d ide disk cdrom
Probing /pci@1,1 Device 5 pci108e,1100 network firewire usb
Probing /pci@1 Device 8 scsi disk tape scsi disk tape
Probing /pci@1 Device 5 Nothing there
Probing /pci@1 Device 6 Nothing there
Probing /pci@1 Device 7 Nothing there

Sun Fire V120 (UltraSPARC-IIe 548MHz), No Keyboard
OpenBoot 4.0, 512 MB memory installed, Serial #51813843.
Ethernet address 0:3:ba:16:9d:d3, Host ID: 83169dd3.

Environment monitoring: disabled

Boot device: net File and args:
Using Onboard Transceiver - Link Up.
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
ok boot disk
Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0 File and args:
-----------------------------------------------------------
at last got the error from before
...........................
lom>poweroff
lom>
LOM event: +0h50m12s host power off
lom>poweron
lom>
LOM event: +0h50m58s host power on
Power ON
Verifying NVRAM...Done
Bootmode is 0
[New I2C DIMM address]
MCR0 = 37b2c206
MCR1
LOM event: +0h50m59s supply rail 1 FATAL FAULT: failed - shutdown req'd
LOM event: +0h50m59s Fault LED 3Hz
= 512 KB
NVRAM Test
Icache Test
lom>
LOM event: +0h51m4s host power off
lom>
......................................
...............................

This time it reported supply rail 1 fatal fault: failed, shutdown required.

I've seen some views that it may need a new board, disk, NVRAM chip, power supply or all of the above, but I was hoping to narrow that down, or at least get into a position to narrow it down.

Any ideas greatly appreciated.

Patrick
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Sat Jul 1 07:06:12 2006