Thanks to everyone for their detailed explanations, especially:

Crist Clark
Eric Voisard
Casper Dik
Gordon Johnston
Hutin Bertrand

Explanation:

Crist Clark
------------
CLOSE_WAIT means that the local end of the connection has received a FIN from the other end, but the OS is waiting for the program at the local end to actually close its connection.

The problem is that your program running on the local machine is not closing the socket. It is not a TCP tuning issue. A connection can (and quite correctly) stay in CLOSE_WAIT forever while the program holds the connection open.

Once the local program closes the socket, the OS can send the FIN to the remote end, which transitions you to LAST_ACK while you wait for the ACK of the FIN. Once that is received, the connection is finished and drops from the connection table (if your end is in CLOSE_WAIT you do _not_ end up in the TIME_WAIT state).

Eric Voisard
-------------
AFAIK, there is no ndd parameter which affects the TCP CLOSE_WAIT duration. There was "tcp_close_wait_interval", but it has been obsoleted and renamed to "tcp_time_wait_interval" because in reality it affects the TIME_WAIT timeout and not CLOSE_WAIT. So you can try to change it, but I doubt it'll have any effect since they're different things...

OTOH, from what I know, it's the responsibility of the application (i.e. not of the OS) to close its socket once the remote computer closes its side of the TCP connection.

RFC 793 says CLOSE_WAIT is the TCP/IP stack waiting for the local application to release the socket. So it hangs because it has received the information that the remote host has initiated a disconnection and is closing its socket, but the local application did not then close its own side.

So maybe the solution consists in finding a bug fix for your application... Or, more dangerously (because they still have the right to send any data remaining in the queue), in killing the processes that hold connections in CLOSE_WAIT state...
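To illustrate the point made in the replies above (the application, not the OS, has to close the socket once the peer has sent its FIN), here is a minimal Python sketch. It is not taken from any of the replies: the function name drain_and_close is hypothetical, and the demo uses socket.socketpair() as a stand-in for a real TCP connection, so no actual CLOSE_WAIT entry shows up in netstat. The end-of-stream handling is the same on a TCP socket, though.

```python
import socket


def drain_and_close(conn: socket.socket, bufsize: int = 4096) -> bytes:
    """Read until the peer signals end-of-stream, then close our side.

    Hypothetical helper for illustration only.
    """
    received = bytearray()
    try:
        while True:
            chunk = conn.recv(bufsize)
            if not chunk:
                # recv() returning b"" means the peer has closed its side.
                # On a TCP socket our end is now in CLOSE_WAIT and stays
                # there until we close the descriptor ourselves.
                break
            received.extend(chunk)
    finally:
        # Closing lets the OS send our FIN (CLOSE_WAIT -> LAST_ACK on TCP).
        conn.close()
    return bytes(received)


if __name__ == "__main__":
    # socketpair() stands in for an established connection (assumption:
    # a real server would get `conn` from accept() instead).
    local, remote = socket.socketpair()
    remote.sendall(b"last bytes")
    remote.close()  # the "remote" side finishes first
    print(drain_and_close(local))
```

A program that sees the empty read but never calls close() (or simply never reads again) is the kind of application that leaves connections sitting in CLOSE_WAIT.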

Casper Dik
-----------
CLOSE_WAIT connections indicate an error in the software. It's a connection which has been torn down, but your side of things still has a file descriptor open.

Gordon Johnston
-------------------
I believe CLOSE_WAIT on the server side of the connection means that the server has received a FIN from the client, will have acknowledged this back to the client, and then informed the application that it can close the connection. It is then up to the application to relinquish the connection once it is satisfied that all the data has been read from the connection. Once it relinquishes the connection, the server will send a final FIN back to the client and the connection will be fully closed.

If you are seeing a large number of connections persisting in CLOSE_WAIT state, it's probably a problem with the app itself. Restarting it will clear the connections temporarily, but obviously further investigation will be required to find the cause of the problem.

Hutin Bertrand
-----------------
Look at: http://docs.sun.com/app/docs/doc/806-7009

My original post was:

>Hi Gurus,
>When I perform netstat -a, I see that hundreds of connections are in
>CLOSE_WAIT state. This causes my named-xfer using these connections to
>sleep (truss -p <process_pid>).
>Is there a timer to set, say after 120 seconds the CLOSE_WAIT
>connections will break so my program can reconnect again? For example
>the "ndd" command?
>Any help is greatly appreciated.
>Best Regards
>shahb
>_______________________________________________
>sunmanagers mailing list
>sunmanagers@sunmanagers.org
>http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Tue Jan 31 12:27:30 2006
This archive was generated by hypermail 2.1.8 : Thu Mar 03 2016 - 06:43:54 EST