Thanks to everyone who replied and shared great ideas about what might have caused the problem:

    topher
    Darren Dunham
    Gaziz Nugmanov
    tony bourke
    Vlade
    Steve Mickeler

Almost everyone advised double-checking the 100 Mbit full-duplex settings on all NICs and switch ports; those settings were correct.

topher suggested trying the following steps:

> My first step would be a "showmount -a" - and make sure that there aren't any
> systems in there that have been turned off - I found a BUNCH of packet errors
> while the system was trying to figure out NFS connections - something along the
> same line as what you see now... you can manually edit /etc/rmtab and either
> comment out, or completely remove, the systems that are no longer in existence,
> then stop and start the nfs server (/etc/init.d/nfs.server stop;
> /etc/init.d/nfs.server start) and you should be all good to go (the nfs
> thing won't even drop the existing connections, worst case they get a blip of a
> message on the end users' terminals, but that's it...)
>
> ----
>
> My next step would be to increase the xmit and recv buffers with ndd:
>
> ndd -set /dev/tcp tcp_xmit_hiwat 32768
> ndd -set /dev/tcp tcp_recv_hiwat 32768
>
> these default to 16,384 and 24,576 respectively, and the range for each is
> 4,096 to 1,073,741,824 and 2,048 to 1,073,741,824 respectively
>
> These parameters specify the default value for a connection's receive and
> transmit buffer space; that is, the amount of buffer space allocated for
> received data (and thus the maximum possible advertised receive window) -
> usually they should be set to be the same...
>
> try the 32768 number, and if it makes the errors a little better, you can
> slowly increase it - just remember it's bytes, so if you make it too big,
> you'll clobber your RAM with TCP packets... that'd make things worse...

The box was not an NFS server, so the first step did not apply, and the second
step did not help either. I also tried bringing down several network services
running on that box - nothing changed. Finally I installed the latest Solaris 2.6
Recommended patch cluster and rebooted the box, which solved the problem
(I hope at least for the next ~600 days ;)).

Thanks everyone for your help!

Alex

Initial message was:

On Thu, Jun 06, 2002 at 12:13:59PM -0700, Alex wrote:
> Hi,
>
> I have a Solaris 2.6 box with 2 hme NICs (active and standby) connected
> to different switches. Starting from yesterday, netstat -in started showing
> a lot of input errors on hme0. When I switched to hme1, the input errors
> started to appear on it at the same rate (ratio is ~4.2%):
>
> # netstat -in
> Name  Mtu  Net/Dest   Address      Ipkts      Ierrs     Opkts      Oerrs  Collis  Queue
> lo0   8232 127.0.0.0  127.0.0.1    36838233   0         36838233   0      0       0
> hme0  1500 10.16.0.0  10.16.0.110  3348788030 74996574  2221455560 0      0       0
> hme1  1500 10.16.0.0  10.16.0.110  304529     12973     176126     0      0       0
>
> # ifconfig -a
> lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
>         inet 127.0.0.1 netmask ff000000
> hme0: flags=862<BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
>         inet 10.16.0.110 netmask ffffff00 broadcast 10.16.0.255
>         ether 8:0:20:8f:24:29
> hme1: flags=843<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
>         inet 10.16.0.110 netmask ffffff00 broadcast 10.16.0.255
>         ether 8:0:20:8f:24:29
>
> Both NICs are explicitly set to 100 Mbit full duplex; autonegotiation
> is turned off (the same on the switch ports). snoop is not showing anything
> suspicious. The box's uptime is 649 days.
>
> Does anyone have any idea what could be the cause of such ierrors?
> Will summarize.
>
> Thank you!
>
> Alex
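For the archives, the error ratio mentioned above can be recomputed straight from
the netstat -in counters. A minimal sketch, assuming nawk is available and that the
column layout matches the header shown in the quoted output (Ipkts in field 5,
Ierrs in field 6):

    # Per-interface input error ratio; skip the header line and loopback.
    netstat -in | nawk 'NR > 1 && $1 != "lo0" && $5 > 0 { printf("%-6s %6.2f%%\n", $1, 100 * $6 / $5) }'

With the counters quoted above this works out to roughly 4.3% for hme1, in line
with the ~4.2% figure in the original message.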
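Similarly, the forced 100 Mbit full-duplex setting and the current TCP buffer sizes
can be read back with ndd before changing anything. A minimal sketch; the instance
selection and the 0/1 meanings of link_speed and link_mode reflect my reading of the
hme driver, so double-check them against the ndd(1M) and hme(7D) documentation:

    # Point /dev/hme at the instance to query (0 = hme0, 1 = hme1).
    ndd -set /dev/hme instance 0

    # Link state: link_status 1 = up, link_speed 1 = 100 Mbit, link_mode 1 = full duplex.
    ndd -get /dev/hme link_status
    ndd -get /dev/hme link_speed
    ndd -get /dev/hme link_mode

    # Current TCP buffer defaults (the values topher suggested raising to 32768).
    ndd -get /dev/tcp tcp_xmit_hiwat
    ndd -get /dev/tcp tcp_recv_hiwat

Note that ndd settings do not survive a reboot; had the larger buffers helped, they
would have needed to be reapplied from an rc script at boot time.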