My question:
>I'd like to hear from experts about what NFS options are recommended
>for machines that must remain impervious to other machines crashing,
>hanging, etc. The ones below have proven not adequate for some reason.
>The retry=3 seemed to cure the things when a machine was down when
>the reboot occurred, but when a machine crashed (Convex), the up machine
>(sun 3/280S) would hang up during new logins, immediately after putting
>out /etc/motd. I hate having to change /etc/fstab on multiple machines
>(removing NFS mounts for down machines) until they are all up again.
>This has required reboots of production machines and simply can't be
>tolerated.
>
>/etc/fstab:
>machine:/u1 /u1 nfs rw,soft,retry=3 0 0
>
>rw is necessary. It seems that /etc/exports options cannot be causing
>the problem.
>
>What is the consensus about using bg mounts? Can they get you into any
>trouble under certain circumstances, or not as much as fg mounts?
>
>Would some combination of retrans=N1,timeo=N2 be helpful?
>
>Thanks for the info. I will summarize the results.
>John
>yates@a.chem.upenn.edu
My solution from suggestions below: (I'm sure more answers will trickle
in, but unless they offer something not covered below I won't post them,
thanks all!).
node:/usr1/mlk /usr1/mlk nfs rw,soft,noquota,retry=3,timeo=10,retrans=10,bg 0 0
node:/usr2/jkb /jkb nfs rw,soft,noquota,retry=3,timeo=10,retrans=10,bg 0 0
This indeed timed out in a couple minutes for each file system, and did
not hang during new logins while the other node was down. Now node is up
and crunching, so one test is all I got, but it looks good! John
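For a rough feel for what timeo and retrans buy you, the Python sketch below estimates how long a single request on a soft mount can block before it fails. It relies on timeo being given in tenths of a second. Two things are assumptions rather than anything stated in the replies: that the client sends retrans transmissions in total, and that it may double the timeout after each miss up to a 60 second cap. The name soft_timeout_seconds is made up for this illustration; check the mount(8) page for your release before trusting the numbers.

```python
def soft_timeout_seconds(timeo, retrans, backoff=True, cap=60.0):
    """Estimate how long one request on a soft NFS mount waits before failing.

    timeo is in tenths of a second, as in the fstab options above.
    Assumptions (not from the thread): the client sends `retrans`
    transmissions in total, and with backoff=True each missed reply
    doubles the next wait, capped at `cap` seconds.
    """
    wait = timeo / 10.0
    total = 0.0
    for _ in range(retrans):
        total += min(wait, cap)
        if backoff:
            wait *= 2
    return total


flat = soft_timeout_seconds(10, 10, backoff=False)
doubling = soft_timeout_seconds(10, 10, backoff=True)
print(f"fixed interval:      {flat:.0f} s")
print(f"doubling, 60 s cap:  {doubling:.0f} s")
```

With a fixed interval the estimate is 10 seconds; with doubling it comes to roughly five minutes. Which model applies depends on the client implementation, so treat both as bounds to test against rather than as predictions.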
The detailed responses:
>From: IN%"mikulska%ece@ucsd.edu" 23-APR-1990 02:44
>To: YATES@a.chem.upenn.edu
>Subj: Re: fstab options for NFS mounts
>
>One thing you may take into account: check in rc.local whether quotaon is
>enabled. If so, it applies to _all_ filesystems, both local and nfs.
>Now if the 'exporter' crashes, the 'importer' will spend an inordinate amount
>of time trying to check quota on the nfs file system(s) imported from the
>dead machine. It may look like your machine hangs completely - but most
>of the time, if you go for a long lunch, you'll find out that eventually
>you are logged in (not always, but often). Therefore, add 'noquota' to the
>nfs options in /etc/fstab for nfs mounted directories.
>
>We also use 'bg' with no adverse effect whatsoever.
>
>Maybe, just maybe, this will help.
>
>Margaret Mikulska
>system administrator
>
>UC San Diego
>Dept. of Electrical and Computer Engineering
>
>mikulska@ece.ucsd.edu, mem@ece.ucsd.edu
>
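If you have many fstab files to touch, adding 'noquota' by hand gets tedious. A minimal Python sketch of the edit follows, assuming the six-field fstab layout shown in the question. The function name add_nfs_option is hypothetical, and rewritten lines come out with single spaces between fields, so review the result before installing it.

```python
def add_nfs_option(fstab_text, option="noquota"):
    """Return fstab_text with `option` appended to every nfs entry lacking it.

    Comment lines and non-nfs entries are passed through unchanged.
    Lines that are modified are re-joined with single spaces.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_nfs = (
            len(fields) >= 4
            and not fields[0].startswith("#")
            and fields[2] == "nfs"
        )
        if is_nfs:
            opts = fields[3].split(",")
            if option not in opts:
                opts.append(option)
                fields[3] = ",".join(opts)
                line = " ".join(fields)
        out.append(line)
    return "".join(l + "\n" for l in out)


sample = (
    "machine:/u1 /u1 nfs rw,soft,retry=3 0 0\n"
    "/dev/sd0a / 4.2 rw 1 1\n"
)
patched = add_nfs_option(sample)
print(patched, end="")
```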
>From: IN%"fed!m1rcd00@uunet.uu.net" 23-APR-1990 08:15
>To: "YATES, JOHN H." <uunet!a.chem.upenn.edu!YATES@uunet.UU.NET>
>Subj: Re: fstab options for NFS mounts
>
>
>> I'd like to hear from experts about what NFS options are recommended
>> for machines that must remain impervious to other machines crashing,
>> ...
>>
>> /etc/fstab:
>> machine:/u1 /u1 nfs rw,soft,retry=3 0 0
>                      ^^^^^^^
>Right here is your problem. We have over 130 Suns here at the Federal
>Reserve, and our average machine has on the order of 60 or more NFS,
>hard, rw mounts. The problem is that the file-system access
>routines (I think it goes something like this, I have only an empirical
>understanding of the problem) will touch *every* subdirectory in a given
>directory when trying to read just *one* subdirectory. And when it tries
>to touch the mount point for a filesystem mounted off a crashed machine,
>the accessing process goes into the "D", or disk wait state.
>
>There are a few things you can do. One is to add the "intr" option to
>your mounts. This will allow you to "break" out of the D state, although
>it can take as much as a couple minutes. Another thing is to not have
>all your NFS mounts in the same directory. We have directories of
>mount points at the root level, organized by administrative work group.
>Then if one group goes down, the whole group goes down, but other groups
>can continue to work.
>
>You can try using the automounter, but that has some serious shortcomings
>that could cause you other problems. (the last time I looked at it,
>there was no way to shut down an automount) There is a "freeware" version
>of the automounter called "AMD" available from various archives such as
>uunet, that I think is quite a bit better, although I haven't tested it
>at all. An automounter will help by making it less likely the offending
>file system is mounted when the crash occurs.
>
>A couple of other points. You should probably *never* mount a rw
>filesystem soft. The weak protocol can result in filesystem corruption
>in a heavily used or flaky network. Backgrounded mounts are, as far
>as I know, totally harmless and a Good Thing. The options that we use
>on 95% of our mounts are: rw,hard,intr,bg. Note that the bg should be
>listed last. According to one Sun engineer I talked to, the parsing
>code ignores all options listed after the bg. Don't know if it ever
>was or still is true, but it don't hurt to list it last.
>
>We only ever jack up retries on filesystems with lots of heavy use.
>Our database filesystem, and the filesystems we ship overnight backups
>to are examples.
>
>Finally, I will point out that people with lots of NFS pretty generally
>wind up coming up with some automated distribution of fstab. We have
>a YP and cron driven scheme here that works out ok.
>
>Good luck,
>Bob Drzyzgula
>Federal Reserve Board
>rcd@fed.frb.gov
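The rules of thumb in the reply above (no soft on read-write mounts, intr on hard mounts, bg present and listed last) are easy to check mechanically. The Python sketch below does that for one option string. The function name check_nfs_options is hypothetical, and the rules encode that one site's advice, not a requirement of NFS itself. It treats rw and hard as the defaults when ro and soft are absent.

```python
def check_nfs_options(options):
    """Return a list of warnings for a comma-separated NFS option string."""
    names = [opt.split("=", 1)[0] for opt in options.split(",")]
    warnings = []
    soft = "soft" in names       # hard is the default when soft is absent
    read_only = "ro" in names    # rw is the default when ro is absent
    if soft and not read_only:
        warnings.append("soft used on a read-write mount")
    if not soft and "intr" not in names:
        warnings.append("hard mount without intr")
    if "bg" not in names:
        warnings.append("bg is missing")
    elif names[-1] != "bg":
        warnings.append("bg is not the last option")
    return warnings


before = check_nfs_options("rw,soft,retry=3")
after = check_nfs_options("rw,hard,intr,bg")
print(before)
print(after)
```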
>From: IN%"jam@philabs.Philips.Com" 23-APR-1990 08:29
>To: "YATES, JOHN H." <uunet!a.chem.upenn.edu!YATES@uunet.UU.NET>
>Subj: Re: fstab options for NFS mounts
>
>You're on the right track. I would try
>
>machine:/u1 /u1 nfs rw,soft,bg,noquota,retry=3,timeo=10,retrans=10 0 0
>
>noquota is a possible gotcha as I believe the default is quotas on and
>they will be checked during login. You might want to combine this with
>changing /usr/ucb/quota to true. This combination of timeo and retrans
>should result in a maximum of a 10 second wait. timeo may need to be
>larger on a busy net or if it is going through a gateway/router to get
>to the destination machine.
>
>Murf
>--
>John A. Murphy jam@philabs.philips.com
>345 Scarborough Road
>Briarcliff Manor, NY 10510 One one-trillionith of a surprise: picaboo
>(914)945-6216
>From: IN%"trinkle@cs.purdue.edu" 23-APR-1990 09:27
>To: "YATES, JOHN H." <YATES@a.chem.upenn.edu>
>Subj: Re: fstab options for NFS mounts
>
>>> What is the consensus about using bg mounts? Can they get you into any
>>> trouble under certain circumstances, or not as much as fg mounts?
>
> Background mounts are only relevant for the initial mount - it
>has no impact on NFS operation after that. We use bg everywhere that
>is not essential so that servers will actually come up if all of them
>go down (power failure, etc). Otherwise, they would be in deadlock at
>boot time.
>
>Daniel Trinkle trinkle@cs.purdue.edu
>Dept. of Computer Sciences {backbone}!purdue!trinkle
>Purdue University 317-494-7844
>West Lafayette, IN 47907
>From: IN%"cbarry@BBN.COM" 23-APR-1990 09:28
>To: YATES@a.chem.upenn.edu
>Subj: fstab options for NFS mounts
>
>
> Sender: sun-managers-relay@eecs.nwu.edu
> Date: Sun, 22 Apr 90 22:53 EST
> From: "YATES, JOHN H." <YATES@a.chem.upenn.edu>
> X-Vms-To: SUN-MAN,YATES
>
> I'd like to hear from experts about what NFS options are recommended
> for machines that must remain impervious to other machines crashing,
> hanging, etc. The ones below have proven not adequate for some reason.
> The retry=3 seemed to cure the things when a machine was down when
> the reboot occurred, but when a machine crashed (Convex), the up machine
> (sun 3/280S) would hang up during new logins, immediately after putting
> out /etc/motd. I hate having to change /etc/fstab on multiple machines
> (removing NFS mounts for down machines) until they are all up again.
> This has required reboots of production machines and simply can't be
> tolerated.
>
> /etc/fstab:
> machine:/u1 /u1 nfs rw,soft,retry=3 0 0
>
> rw is necessary. It seems that /etc/exports options cannot be causing
> the problem.
>
> What is the consensus about using bg mounts? Can they get you into any
> trouble under certain circumstances, or not as much as fg mounts?
>
> Would some combination of retrans=N1,timeo=N2 be helpful?
>
> Thanks for the info. I will summarize the results.
> John
> yates@a.chem.upenn.edu
>
>My understanding of your problem is that the down "machine" is the one
>listed in fstab, rather than the one whose fstab file you supply. If
>"machine" is down and other machines are trying to NFS-mount it, the
>mount should give up eventually with an NFS timeout message, since you
>specify the 'soft' option.
>
>Logins typically hang when remote machines' nfs partitions are being
>queried for quota checks by a local host as it's coming up to
>multi-user, but in my experience this occurs only when you explicitly
>specify 'quota' option. BTW, which version of the sunos are you using?
>
>bg mounts are fine and actually a useful workaround to this hanging
>problem. We used to use them before the advent of the automounter.
>
>-chris
>
>From: IN%"viktor@math.Princeton.EDU" "Viktor Dukhovni" 23-APR-1990 09:46
>To: "YATES, JOHN H." <YATES@a.chem.upenn.edu>
>Subj: Re: fstab options for NFS mounts
>
>> I'd like to hear from experts about what NFS options are recommended
>> for machines that must remain impervious to other machines crashing,
>> hanging, etc. The ones below have proven not adequate for some reason.
>> The retry=3 seemed to cure the things when a machine was down when
>> the reboot occurred, but when a machine crashed (Convex), the up machine
>> (sun 3/280S) would hang up during new logins, immediately after putting
>> out /etc/motd. I hate having to change /etc/fstab on multiple machines
>> (removing NFS mounts for down machines) until they are all up again.
>> This has required reboots of production machines and simply can't be
>> tolerated.
>
> Replace "/usr/ucb/quota" with a symbolic link to "/bin/true"
>The logins are hanging trying to collect quota information from the remote
>server.
>
>--
> Viktor Dukhovni <viktor@math.princeton.edu> : ARPA
> <...!uunet!princeton!math!viktor> : UUCP
> Fine Hall, Washington Rd., Princeton, NJ 08544 : US-Post
> +1-(609)-258-5792 : VOICE
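A cautious way to apply the symbolic-link suggestion above is to move the original binary aside first so it can be restored. The Python sketch below does that; the helper name replace_with_link and the ".orig" backup suffix are inventions for this illustration, not part of the suggestion. The demonstration runs against a scratch file in a temporary directory, not the real /usr/ucb/quota.

```python
import os
import tempfile


def replace_with_link(path, target="/bin/true", backup_suffix=".orig"):
    """Rename `path` to a backup and leave a symbolic link to `target`.

    Refuses to run if the backup name is already taken, so a second
    invocation cannot overwrite the saved original.
    """
    backup = path + backup_suffix
    if os.path.lexists(backup):
        raise FileExistsError(backup)
    os.rename(path, backup)
    os.symlink(target, path)
    return backup


# Demonstration on a stand-in file; nothing outside the scratch dir is touched.
with tempfile.TemporaryDirectory() as scratch:
    fake_quota = os.path.join(scratch, "quota")
    with open(fake_quota, "w") as f:
        f.write("#!/bin/sh\n")
    saved = replace_with_link(fake_quota)
    link_target = os.readlink(fake_quota)
    backup_exists = os.path.exists(saved)
    print(fake_quota, "->", link_target)
```

On a real system this would need root, and users would no longer see quota warnings at login for local file systems either.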
>From: IN%"jb@MCL.Unisys.COM" 23-APR-1990 11:46
>To: YATES@a.chem.upenn.edu
>Subj: Re: fstab options for NFS mounts
>
>john,
>we experienced similar problems and resolved them
>by fixing the users' path to NOT include any directories
>on the NFS mounted file systems. it seems the login
>process was trying to hash the filenames in the path
>directories and getting stuck (even with soft mounts).
>
>p.s. never had any problem with background mounts, we
>always use them so systems don't have to come up in a
>certain order.
>
>
>jb
>--
>John Borders Unisys - Advanced Development Branch
>Internet: jb@mcl.unisys.com 8201 Greensboro Dr. Mclean, VA 22102
> (703) 847-3287 (Voice) 448-1826 (FAX)
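To see which entries of a user's path would have to go under the advice above, the Python sketch below reads the NFS mount points out of fstab text and filters a PATH-style string against them. Both function names are hypothetical. The match is purely textual, so a path directory that is a symbolic link into an NFS file system is not caught.

```python
def nfs_mount_points(fstab_text):
    """Return the mount points of all nfs entries in fstab-format text."""
    points = []
    for line in fstab_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and not fields[0].startswith("#") and fields[2] == "nfs":
            points.append(fields[1].rstrip("/") or "/")
    return points


def strip_nfs_dirs(path_value, mount_points):
    """Drop colon-separated PATH entries that sit on or under an NFS mount point."""
    kept = []
    for d in path_value.split(":"):
        on_nfs = False
        for m in mount_points:
            prefix = m if m.endswith("/") else m + "/"
            if d == m or d.startswith(prefix):
                on_nfs = True
                break
        if not on_nfs:
            kept.append(d)
    return ":".join(kept)


fstab = (
    "node:/usr1/mlk /usr1/mlk nfs rw,soft,noquota,bg 0 0\n"
    "node:/usr2/jkb /jkb nfs rw,soft,noquota,bg 0 0\n"
)
points = nfs_mount_points(fstab)
cleaned = strip_nfs_dirs("/bin:/usr/ucb:/jkb/bin:/usr1/mlk/tools:/usr1/local", points)
print(cleaned)
```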