Dear managers,

Thank you to Ashish, Steve, Glenn (sorry if I missed anyone) for pointing out that it was most likely a faulty CPU.

I did have a faulty CPU, which resulted in the machine not booting at all. I initially thought that I could use the diagnostics given in http://lios.apana.org.au/~cdewick/sunshack/data/sh/2.0/infoserver.central/cgi-bin/doc2html2786-2.html?intsrdb/21220 to check which CPU was faulty, but the machine never got to OBP when switched on, which made me falsely think it was a system board problem (Sun suggested it was most likely that). So really I should have removed the CPUs one by one to find out which one was not letting the E450 boot (some say faulty memory might also cause the RED state exception, but perhaps not as badly as this).

One more thing I was confused about: http://sunsolve.sun.com/handbook_pub/Devices/CPU_Module/UltraSPARC_480MHz_UltraII.html says that an empty CPU slot `requires' a filler. This doesn't seem to matter, as I got the machine running fine with the empty slot.

Thanks heaps :)

Some good advice from people, which I also find useful for general diagnostics:

--

1. Try to run your server in a minimal configuration, i.e. 4 memory modules (1 bank) and 1 CPU. While doing this, check which banks and which CPU slots have to be filled. I think you might have emptied the CPU slot which has to be filled, and that's why there is no display (there is no requirement for the system to be filled with 4 CPUs). You can get all the info on the default locations for memory and CPUs from "docs.sun.com" --> E450 service manual.

2. For the "red state" exception error, try to remove all your non-essential cards and then see whether you are still getting the error. Also, at the ok prompt, run "test-all".

3. Try connecting a console cable to serial port A with "diag-level=max"; "diag-switch?=true"; "output-device=ttya". Also set "auto-boot?=false" if you don't want the system to boot automatically after the tests.

ashish n

--

Sounds like you likely have one of many different problems.
The Red state exception is almost assuredly a CPU problem, not likely a mainboard problem. When you strip the system down to fewer than 4 CPUs, you have to make sure that you put them in the proper locations. The slots are marked on the mainboard, indicating which one must be populated first, second, etc. They MUST go in that order.

You may also have a bad DC-to-DC converter. If you look at the mainboard, you will see 4 small boards with capacitors on them directly underneath the memory slots. (The memory, incidentally, also has to be inserted in a specific order.)

Here is what I would attempt to bring the machine back to life. First, remove all memory, CPUs, and DC converters. Install 1 bank of memory, 1 CPU and, preferably, a different one of the DC converters in the appropriate slots. Attempt to bring the machine up with that configuration. If that does not work, try swapping the CPU with one of the other ones, then try again. If there is still nothing, swap the DC converter. Keep this up, trying to boot the machine after each and every change.

Once you have the machine booting again, slowly begin populating the remainder of the memory, CPUs, and DC converters. Boot the machine between each of the additions. I would probably install all of the memory after the machine booted again, then boot it. This is not likely a memory issue anyway, but you want to be certain that it is not causing issues for some strange reason. You will most likely come across one of the CPUs that hoses the system during boot; this is your culprit.

Once you have the machine up and running again (if, in the unlikely case, you have no additional errors), install SunVTS on the machine and let it walk through the memory and CPUs for errors. It is possible, but unlikely, that it will find anything, but you want to be certain, especially if this is a production machine.
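The swap-and-retry procedure above can be sketched in code. This is only an illustration under stated assumptions: the part labels (`cpu0`, `dc0`, and so on) and the `boots` callback are hypothetical stand-ins for physically installing parts and powering on the machine, and the sketch ignores the slot-order and memory requirements mentioned above.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical stand-in for a real power-on attempt: it receives the labels
# of the installed parts and returns True if the machine came up.
BootCheck = Callable[[Sequence[str]], bool]


def find_minimal_pair(
    cpus: Sequence[str], converters: Sequence[str], boots: BootCheck
) -> Optional[Tuple[str, str]]:
    """Find one CPU plus one DC converter that boots on its own.

    CPUs are swapped first; the converter is swapped only after every CPU
    has failed with the current one.
    """
    for conv in converters:
        for cpu in cpus:
            if boots([cpu, conv]):
                return cpu, conv
    return None


def isolate_suspects(
    cpus: Sequence[str], converters: Sequence[str], boots: BootCheck
) -> Optional[List[str]]:
    """Return the parts that stop the machine booting, or None if no
    minimal configuration boots at all."""
    pair = find_minimal_pair(cpus, converters, boots)
    if pair is None:
        return None
    installed = list(pair)
    suspects: List[str] = []
    # Add the remaining parts one at a time, booting after each addition.
    for part in list(cpus) + list(converters):
        if part in installed:
            continue
        if boots(installed + [part]):
            installed.append(part)
        else:
            suspects.append(part)  # leave the failing part out
    return suspects
```

In practice `boots` would be a person at the console, not a function; the point is only that each boot attempt changes exactly one part, so a failure can be pinned on that part.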
Glenn May

--

Original post:

Dear managers,

When our Enterprise 450 produced RED state exception errors (throughout, even on reboot), we purchased another motherboard (for 4 480 MHz CPUs with 8MB cache) on Sun's advice. Lo and behold, the problem still exists (saw it once, on the first boot) :(

So I'm afraid it's one (or more) of the CPUs causing problems (I really should have guessed that in the first place). However, when I removed a CPU or two, nothing came up on the console or monitor (in fact, the problem is the same now that I have put everything back). I only get a green light on the status LED indicating power, with no POST or general activity at all.

Does this happen with fewer than 4 CPUs when you do not place a filler in the empty CPU slots? Or am I doing something really wrong? Perhaps it's a loose connection somewhere, but I'm not sure where to look.

All help is appreciated, and I will summarize.

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Tue Mar 16 21:41:32 2004