Summary: memory or CPU error on SunFire V280R

From: Gene Beaird <bgbeaird_at_sbcglobal.net>
Date: Fri May 23 2008 - 14:16:54 EDT
Thanks to:

mh1272 (hike)
Eric Voisard
Dean Ross-Smith for their replies.


Hike pointed out that these were indeed kern.info messages, and not errors.
As such, the system was simply informing me that there had been a memory
error and that it had been corrected successfully.  Dean reinforced the point
that this is a lot like the persistent, correctable memory errors that we
occasionally see.  As long as they don't occur in rapid succession, there is
usually not a problem.  Eric stated that he had seen this before on a batch
of V280Rs that he had at his establishment, which apparently had some weak
CPUs.  They passed original QA, but had to be replaced after being placed
into production and running for a while under load.  I've seen that in
E220Rs, too, but that was a long time ago.

I simply let the customer know there was no problem here, but that we will
monitor the situation to see if the messages occur more frequently.  I
haven't seen a repeat yet.
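
For anyone who wants to automate the "watch for rapid succession" part, here
is a rough sketch in Python.  It is only an illustration, not something I ran
on the V280R.  It assumes the log lines start with the classic syslog
timestamp ("May 23 14:16:54 ..."), and the function names, the match
substring, and the thresholds (more than 3 events in 24 hours) are all
placeholders I made up; pick whatever fits your site.

```python
from datetime import datetime, timedelta

def parse_syslog_time(line, year):
    # Assumes the classic syslog prefix "Mon DD HH:MM:SS" (15 characters).
    # The line itself carries no year, so the caller supplies one.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def find_bursts(lines, pattern, year, max_events=3, window=timedelta(hours=24)):
    """Return (first, last) timestamp pairs wherever more than max_events
    lines containing `pattern` fall within `window` of each other."""
    times = sorted(parse_syslog_time(l, year) for l in lines if pattern in l)
    bursts = []
    start = 0
    for end in range(len(times)):
        # Slide the window start forward until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 > max_events:
            bursts.append((times[start], times[end]))
    return bursts
```

Feed it the lines from the messages file plus a substring that identifies the
corrected-error message on your system.  An empty list back means the events
are still isolated; anything else is a cue to look at the hardware.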

Two documents I found to be good reads:

Soft Memory Errors and Their Effect on SunFire Systems (thanks Eric), and
Supplement to Sun Fire 6800/4810/4800/3800 Systems Troubleshooting Manual
(Further description of ecache and WDC events)

Thank you, all.

Gene Beaird,
Unix Support Engineer,
Pearland, Texas


Received on Fri May 23 14:18:04 2008
