Verily, this list doth rock most righteously.  Thanks to everyone who replied
to my question.

The winning Kudo goes to Doug Winter for correctly spotting that the cause is
the following line in /etc/ssh/ssh_prng_cmds:

"arp -a -n" /usr/sbin/arp 0.02

The arp command is just one of many commands OpenSSH runs to gather entropy
for its randomizer, and this particular command was hanging.  The -n option
isn't supported under Solaris, and there were a number of hosts in the arp
table that didn't have a name in reverse DNS.  (In fact, this problem was
already reported on the archived openssh-unix-dev mailing list...  Silly me.)

Commenting out the offending line from /etc/ssh/ssh_prng_cmds did the trick.
My ssh connect time went from 1:30 to 0:08.

Thanks!

Michael

> Hello all,
>
> I have just upgraded ssh on my Solaris 8 system and everything works
> wonderfully except on three systems.  On these three systems, ssh hangs for 37
> seconds when trying to ssh from one of them to anywhere else.
> I believe I have tracked this problem down, but I don't understand the cause.
>
> Using truss (with a rather extreme set of options: -f -a -e -l -d -tall -vall
> -xall -sall -mall -rall -wall -uall) I see the following:
>
> 18785/1: 1.4638 open64(0xFF226358, 0) = 6
> 18785/1: 0xFF226358: "/etc/.name_service_door"
>
> ...
>
> 18750/1: 1.5779 close(6) = 0
> 18785/1: door(6, 0xFFBED430) (sleeping...)
> 18750/1: waitid(0, 18785, 0xFFBED748, 03) (sleeping...)
> 18785/1: 38.0478 door(6, 0xFFBED430) = 0
> 18785/1: 38.0481 door(6, 0xFFBED4C8) = 0
>
> If I am reading this right, the timestamps show that between the close(6)
> (timestamp 1.5779) and the second door(6, 0xFFBED430) (timestamp 38.0478),
> there's a 37 second delay.  According to the manual page for truss, the
> timestamps signify the completion of the command, which means, if I am correct,
> that the cause is that second door(6, 0xFFBED430) on /etc/.name_service_door.
>
> This only happens on these three hosts.
> So my question is:
>
> (1) is my diagnosis correct in thinking that the problem is with
> /etc/.name_service_door?
>
> and
>
> (2) what uses /etc/.name_service_door?
>
> (This is confusing my (l)users into thinking that there's something wrong with
> their account, and they're griping to me about it.)  I'd like to restart
> whatever service is causing the slowdown, but I don't know what it is, and
> there is no mention of .name_service_door in any of the Answerbook2 libraries
> or man pages.  (I thought I would be slick and look at the inode of
> /etc/.name_service_door and then look for that inode in /proc/*/fd/*, but
> there are an awful lot of programs that have something open to that door!)
>
> Any ideas anyone?  Should I just "punt" and reboot them?
>
> Thanks for your input,
>
> Michael Peek
>
>
> Michael Peek                                                 peek@tiem.utk.edu
> ------------------------------------------------------------------------------
> Systems Administrator / C++ Database Programmer               569 Dabney Hall
> Department of Ecology and Evolutionary Biology       Knoxville, TN 37996-1610
> University of Tennessee at Knoxville
> ------------------------------------------------------------------------------
> (865)974-0224 phone, (865)974-3067 fax           http://www.tiem.utk.edu/~peek
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Thu Aug 8 15:15:20 2002
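
For anyone chasing a similar hang, the summary above suggests a quicker route
than truss: run each entropy command from /etc/ssh/ssh_prng_cmds by hand and
see which one stalls.  The following is only a minimal sketch of that idea in
Python.  It assumes each non-comment line has the form shown above (a quoted
command line, the path to the binary, and an entropy estimate) and that a path
of "undef" marks a disabled entry; the function names and the 10 second
timeout are hypothetical choices, not anything taken from OpenSSH.

```python
import shlex
import subprocess
import time


def parse_prng_cmds(text):
    """Parse ssh_prng_cmds-style text into (argv, path, estimate) tuples.

    Assumed line format: "command args" /path/to/binary estimate
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)
        if len(fields) != 3:
            continue  # not in the assumed three-field format
        cmdline, path, estimate = fields
        if path == "undef":
            continue  # assumption: "undef" means the command is disabled
        entries.append((shlex.split(cmdline), path, float(estimate)))
    return entries


def time_command(argv, path, timeout=10.0):
    """Run `path` with argv[1:]; return (status, elapsed_seconds).

    status is "ok", "timeout" (killed after `timeout` seconds) or "error"
    (the binary could not be started).
    """
    start = time.monotonic()
    try:
        subprocess.run(
            [path] + argv[1:],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        status = "ok"
    except subprocess.TimeoutExpired:
        status = "timeout"
    except OSError:
        status = "error"
    return status, time.monotonic() - start


def report(cmds_file="/etc/ssh/ssh_prng_cmds", timeout=10.0):
    """Print one line per command: status, elapsed time, command line."""
    with open(cmds_file) as fh:
        entries = parse_prng_cmds(fh.read())
    for argv, path, _estimate in entries:
        status, elapsed = time_command(argv, path, timeout)
        print(f"{status:8s} {elapsed:7.2f}s  {' '.join(argv)}")
```

Calling report() on an affected host should list every enabled command with
its run time; an entry such as "arp -a -n" that shows up as "timeout", or that
takes tens of seconds, is a candidate for commenting out.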