Many thanks for the replies. It appears that there may be many reasons for these errors. All the replies are included, as all of them contain very valuable information.

I think the bottom line is that SANs are still in their infancy and have a long way to go before they reach maturity. The situation is complicated because so many parties are involved: HBA vendors, storage vendors, disk management software vendors, switch vendors and the computer manufacturers. Contrary to the glossy pictures presented by the SAN vendors, care should be taken whenever work is done on the SAN. If possible, keep separate mini SAN islands for production/live, development and test. I know this defeats the purpose of the SAN world, but we should recognise the limitations of the various components that make up a SAN, and the consequences when things go horribly wrong.

Best Regards,
Ron.

1. From Ken

We don't have the budget for HDS, but we have experienced an identical problem with JNI FCE-1063s, Brocades, VM, and Clariion JBODs. The problem is caused by too many asynchronous I/Os going to a single disk and overflowing the buffer in the disk. In our case, we set sd_max_throttle in /etc/system to 8 (the default is 256 if there is no sd_max_throttle entry) and the problem went away. While researching our problem, I think I read somewhere that you should set sd_max_throttle to 16 for HDS and JNI (though a friend of mine said Hitachi told them to set it to 2).

Let me know if it helps,

2. From Johan

I'm having problems as well, but with slightly different symptoms. The solution is to download the latest JNI drivers, e.g. 2.9.11 for the FC64-1063. Make sure your OBPs are patched. Make sure your kernel is patched.

For Sol 2.6, add this to the /etc/system file:

    set kobj_map_space_len=0x200000

The line above helps reconfiguration to complete properly!

Make sure you have this in /etc/system on all versions of Solaris:

    * Hitachi Disk / JNI HBA settings
    set sd:sd_io_time=0x3c
    set sd:sd_max_throttle=8
    * End of JNI HBA settings

In your sd.conf, put only the entries you need.

ZONE ZONE ZONE. Put every host in its own zone if possible. If there are two HBAs in a host, give it two zones. This has given me stability on my SAN.

And there is good news: Sun and JNI have committed to start working together to resolve our woes.

3. From John

We have the same HBAs in our servers, but don't get the errors when rezoning. We're running version 2.5.9 of the JNI drivers with Brocade Silkworm 2800s and Compaq storage. Absolutely no problems. We do see the pause, but we don't lose the LUN information. Are you running any kind of multipathing software?

4. From Mohamed

Since many servers are losing LUNs, I would focus on the switches and the storage. We are using McDATA switches here with an IBM ESS. I have had a problem showing tran_err, and it was caused by the storage HBA card.

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Fri Nov 29 10:58:58 2002
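
For anyone who wants to check which of the /etc/system values quoted in the replies above are actually present on a host, here is a minimal sketch in Python. It only parses text in the "set [module:]variable=value" form shown in the replies and treats lines starting with "*" as comments, as in Johan's example. The function name parse_system_settings and the SAMPLE text are hypothetical and purely illustrative; the sketch does not read or modify a live /etc/system and says nothing about which values are right for a given array or HBA.

```python
import re

# Matches lines such as "set sd:sd_max_throttle=8" or
# "set kobj_map_space_len=0x200000" (the module prefix is optional).
SET_LINE = re.compile(r"^set\s+(?:(\w+):)?(\w+)\s*=\s*(\S+)")


def parse_system_settings(text):
    """Return {(module, variable): value} for each 'set' line in text.

    Hypothetical helper for illustration only. Lines starting with '*'
    are treated as comments, as in the examples quoted in the replies.
    """
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("*"):
            continue  # skip blank lines and comments
        match = SET_LINE.match(stripped)
        if match is None:
            continue  # ignore anything that is not a 'set' line
        module, name, raw = match.groups()
        try:
            # Base 0 accepts both decimal ("8") and hex ("0x3c") forms.
            value = int(raw, 0)
        except ValueError:
            value = raw  # keep unparseable values as plain text
        settings[(module, name)] = value
    return settings


# Sample text assembled from the settings quoted in the replies.
SAMPLE = """\
* Hitachi Disk / JNI HBA settings
set sd:sd_io_time=0x3c
set sd:sd_max_throttle=8
* End of JNI HBA settings
set kobj_map_space_len=0x200000
"""

if __name__ == "__main__":
    for (module, name), value in parse_system_settings(SAMPLE).items():
        prefix = f"{module}:" if module else ""
        print(f"{prefix}{name} = {value}")
```

To check a real host, the contents of /etc/system could be read into a string and passed to the same function; a missing ("sd", "sd_max_throttle") key would mean the driver default is in effect.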