Hi, Sun Managers!

The problem has been solved. It turns out that the replacement IB
(Interface Board) was either defective or the wrong revision.

Here's where it gets interesting: it turns out it was the A5200 I was
having problems with, not the A5000. Whoops. :) By the time I'd gotten
to the data center, it was no longer flickering the lightning bolt, but
reporting a failed IB.

The replacement IB, which the vendor said would work with either the
A5000 or the A5200 (but invoiced as "A5000 IB"), is stamped with Sun
Part Number 340-4069-04 and stickered "-06 REV 52". To fix it, I used an
IB from my lab A5200, marked with the same part number but stickered
"-07 REV 50". Ironically enough, the one from the lab is date coded
98/51 while the replacement is date coded 99/47.

I also found that one of the IBM GBICs connected to the hub for that
channel (going to an HBA) had failed. I wonder if the flickering
lightning-bolt state is hard on the equipment, or if there are Gremlins
in the system?

I received some excellent advice when trying to fix this problem:

Octave Orgeron:
 - Double-check firmware revisions in HBA, A5000, IB.
 - Double-check GBIC with a loopback cable.
 - Patch matrix for A5x00, HBAs, etc. here:
   http://sunsolve.sun.com/pub-cgi/retrieve.pl?doc=finfodoc%2F43212&zone_110=43212

Scott Mickey:
 - Try the A5200 IB from your lab. (Good advice!)
 - Note that while the Sun part number for A5000 and A5200 IBs is the
   same, I think the revision levels are different, so IBs from A5000s
   should not be deployed in A5200s (I think we have a winner!)
 - Did you know that many datacenters replace their fibre once a year?
   (No, I didn't, I think mine will.. we only rolled out FCAL in Sep/03)
 - A5000 Configuration Guide:
   http://docs-pdf.sun.com/805-0264-15/805-0264-15.pdf
 - Sun X6732A hub is actually a Vixel 1000 (they even say "Vixel" on the
   bottom). The Vixel manual is here:
   http://www.sms.com/support/Vixel/Rapport%201000/InstallGuide_00041017-001_D.pdf
 - You should power up the Vixel hub before the rest of the equipment
   (I didn't know that, but I had been doing it that way "by luck" -- as
   the hubs have no power switches)
 - Check your logs for messages (wow, it filled up /var/adm..):

Jan 11 09:51:18 zaphod scsi: [ID 243001 kern.info] /pci@1f,4000/SUNW,ifp@2 (ifp0):
Jan 11 09:51:18 zaphod  Loop reconfigure in progress
Jan 11 09:51:18 zaphod scsi: [ID 243001 kern.info] /pci@1f,4000/SUNW,ifp@2 (ifp0):
Jan 11 09:51:18 zaphod  LIP reset occured; cause f801
Jan 11 09:51:18 zaphod scsi: [ID 243001 kern.info] /pci@1f,4000/SUNW,ifp@2 (ifp0):
Jan 11 09:51:18 zaphod  Loop reconfigure done
Jan 11 09:51:18 zaphod scsi: [ID 243001 kern.info] /pci@1f,4000/SUNW,ifp@2 (ifp0):
Jan 11 09:51:18 zaphod  LIP occured; cause f801
Jan 11 09:51:18 zaphod scsi: [ID 243001 kern.info] /pci@1f,4000/SUNW,ifp@2 (ifp0):

Also, I learned one more tidbit from the A5000 troubleshooting PDF:
you're supposed to use the GBICs in a particular order in the Vixel hubs
to prevent signal degradation. I didn't change any of my running hubs
(which are using ports 1, 5, and 6) but I clustered the hub connected to
the broken IB such that it was using ports 3, 4, and 5, just in case.

Thanks a million, guys!

Wes

-- 
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

--
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Wed Jan 14 14:12:08 2004