Thanks to all who responded to this odd one. It turns out that we had a
Fortran program with the following array definitions:
DIMENSION U1(5000), V1(5000), U0(5000), V0(5000)
DIMENSION SIGNAL(100,100)
Then there was a loop that went to 75,000,000 - 6 times and accessed the n'th
element of each array. Speculation has it that a page request occurred between
each array change or maybe each pair. To make things worse, this student had
four copies of slightly modified versions of the same program all running at the
same time.
I regularly keep the disk perfmeters up so that if a number gets extreme I can
catch the situation before someone really starts to yell. But apparently page
requests do not show up even though they do hit the disk. Or is it that there
were page requests and they were being serviced entirely out of the cache,
so the disk was not impacted? I still don't know the answer to that one, or why
the disk activity on the perfmeter stayed so low. What are the other numbers on
the perfmeter (contxt, swap and page), and could they have been used to catch
what was going on?
After the programs were terminated the server went back to being happy and the rest of the world breathed a sigh of relief.
>Questions:
>What is the INODES line of the pstat and what does it mean when it is
>maxed out? Can it be enlarged?
This comment was from Greg Earle at Sun Microsystems, Inc. (JPL on-site
Software Support). Hey, Sun is out there, is listening, and HELPS!
The `Inodes' entry in `pstat -T' shows the size of the in-core inode table
cache (as opposed to `inodes available on the disk partitions'). For the
most part it should be full or close to full. The only way to easily
increase it is to build a new kernel with `maxusers' set higher, which you
seem to have already done.
Wyman chong sent the following:
One thing I found out is that the max number is governed in param.c by the
variable
ninode = (NPROC + 16 + MAXUSERS) + 64
which means it is governed by NPROC and MAXUSERS. So it is easy to
increase. The question is whether increasing it would buy anything, and by
how much. Please let me know if you find out anything.
>Does increasing this number buy anything? Maybe
... For our main servers, though, we ran into lots of
trouble running out of inodes. Knowing that the inodes (as well as
all of the "pstat -T" stuff) are handled in the kernel, I did some
research & found that param.c in /usr/share/sys/sun3/MERCEDES (the
name of one of the servers) contains the variable ninode. Normally,
this is set up to work with the number of processes & the MAXUSERS
constants to figure out some rule-of-thumb value. TFM recommended
that values such as ninode be changed by increasing MAXUSERS, but I
didn't like that idea since it also increases a bunch of other values.
So, I moved the original value into a comment & forced a bigger value
into ninode as a constant. (It was around 600 or 800 some-odd at the
time by virtue of prior increases using MAXUSERS. Enough of that
garbage. I bumped it to 1000.) The requests for inodes jumped at
them, claiming quite a lot, but leveled off at that time to over 900
on one of the servers. The other server needed more, so I bumped it
to over 2000. I seem to recall that its requests leveled off at over
1500. Here is how they look now:
heel:/home/reznick% rsh head pstat -T
366/1235 files
822/1000 inodes
94/330 processes
21/ 64 files
9048/20688 swap
heel:/home/reznick% rsh mercedes pstat -T
174/1017 files
1833/2048 inodes
49/266 processes
7/ 64 files
5568/22992 swap
heel:/home/reznick%
As you can see, we're using a lot of inodes on those servers, but
we're not having any trouble any more. There may be a better solution
than editing param.c & then remaking the kernel, but I couldn't find
one & this worked. (BTW, head is a 3/160 and mercedes is a 3/260.
They're serving 2/50s, 3/50s, & a 3/75.)
Lawrence S. Reznick, ribs!reznick@sacto.West.Sun.COM
John R. Deuel <kink@rice.edu> posed the following question:
How did you get 4 disks on a single 451? You have two 451's, right?
> We do in fact have four disks on the one disk controller. Since most of
> our disk is used to store images that get used off and on, but not off enough
> for tape, a large amount of disk on an existing controller works fine. The
> way that we did it was to go directly to the board and use the connectors
> found there and ignore the ones that Sun brings out to the back of the box.
> After a simple kernel change to tell xy2 and xy3 where to look, everything
> worked great. If you need to dive into this further, drop me a note.
Thanks to all who responded:
Greg Earle Sun Microsystems, Inc. earle@Sun.COM
Martin Fredriksson martin@molndal.ericsson.se
Lawrence S. Reznick ribs!reznick@sacto.West.Sun.COM
Wyman chong wyman@atherton.com
John R. Deuel kink@rice.edu
Lyndon Nerenberg lyndon@cs.AthabascaU.CA
don
--------------------------------------------------------------
Don Baune Internet: don@doug.med.utah.edu
University of Utah
Medical Imaging Research Lab
AC-215 School of Medicine talk: (801) 581-6088
Salt Lake City, Utah 84132 FAX: (801) 581-2414
--------------------------------------------------------------