Special thanks to Nasser Manesh.

Sorry about the delayed summary.

Answer 1 from Nasser:
"System time" in top, uptime or time refers to time spent in system calls in general. There are a few processes that dive into kernel mode and never return, hence commonly known as kernel processes, but I doubt they are the source of the problem for you. The scheduler, the page daemon and the filesystem update daemon are the three main ones (sched, pageout and fsflush on Solaris, PIDs 0, 2 and 3). nfsd (the NFS server process) also needs to run in kernel mode (either as multiple processes or as multiple LWPs, depending on your Solaris version), but you are talking about an NFS client. So check the output of vmstat and vmstat -i for system calls and interrupts: network interrupts because of a lot of connections (e.g. a system with a high volume of short-lived TCP connections, such as a web server or proxy server), serial line interrupts (bad I/O board?), etc. Run prstat to see who is usually on top, then truss those processes to see who is making a lot of system calls. If you do not mind sharing the outputs, I can take a look and tell you if I see something out of whack.

Answer 2 from Nasser:
If you are running 2.6, things could be a bit cloudy because of the way the filesystem buffer cache consumes the whole memory; in that case the output of /usr/ucb/ps axu can be a close replacement for prstat. Truss traces the system calls a specific process issues (optionally including its children), and is basically the SVR4 replacement for the good old "trace" or "ktrace". A good starting point (assuming you get a suspicious PID from ps) is:

  # truss -o /tmp/truss.out -f -p <PID>

This writes the output to a file, follows forks (and reports children) and attaches to the process identified by <PID>. You may want to try it without -o <file> first, just to see on screen how fast your process makes system calls. Constantly making system calls is not good; usually there is no reason for that.
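Nasser's suggestion to check vmstat for system call and interrupt rates can be sketched with a small awk filter. The sample line below is made up (chosen to mirror the ~73% kernel time from the question); on a live Solaris box you would pipe the output of `vmstat 5` through the same filter instead of the canned string.

```shell
# A captured vmstat line stands in here for live output; the numbers are
# hypothetical. Classic Solaris vmstat columns group as:
#   kthr(3) memory(2) page(7) disk(4) faults(3) cpu(3)
# so fields 17-19 are in/sy/cs (interrupts, syscalls, context switches)
# and field 21 is the %sys CPU figure.
sample='0 0 0 1639000 479000 0 0 0 0 0 0 0 0 0 0 0 4200 98000 1500 20 73 7'
echo "$sample" | awk '{
    printf "interrupts/s=%s syscalls/s=%s kernel%%=%s\n", $17, $18, $21
}'
# prints: interrupts/s=4200 syscalls/s=98000 kernel%=73
```

A persistently high syscall rate here is the cue to move on to prstat and truss, as the answer describes.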
Truss will stay attached as long as the process runs, so since you are presumably trussing a daemon, you will have to kill truss a few seconds after starting it. Just hit CTRL-C (or whatever your terminal interrupt is) and truss will die. It will not harm the process and is safe to run.

Answer 3 from Nasser:
It is also a good idea to check your console, dmesg and the /var/adm/messages file, to see whether you are getting excessive interrupts because of a hardware failure (as I said, the serial port is a famous one, for which you may see errors from zs0).

Answer 4 from Wolfgang Kandek:
There is a Veritas vxstat command that can give you more detailed information on a per-logical-volume basis (also RAID-5 statistics, in case you are using that), which might give you some more information. It looks to me as if you have some heavy I/O on these disks; NFS also has a tendency to increase the time spent in kernel (system) mode. There is also nfsstat, which could give you further clues about the types of operations used most frequently: nfsstat -z zeroes the counters, then after some time (one minute?) run it again to get an idea of how frequently reads/writes and directory lookups occur.

Answer 5 from William Hathaway:
The nfsXX devices are NFS mount points, not local disks; you are probably better off using nfsstat -c to troubleshoot them.

Solution from myself:
By using "iostat -xpn", I found the mounted file system which had the problem. After adding two more CPUs, the situation has improved.

Original question:
Hi, we are using Solaris 2.6. I have a Sun system (an NFS client) having a CPU problem: the system/kernel processes have consumed 50% to 70% of the CPUs. Here is the info from top:

  CPU states: 7.0% idle, 20.0% user, 72.9% kernel, 0.1% iowait, 0.0% swap
  Memory: 1024M real, 479M free, 152M swap in use, 1639M swap free

The user processes seem fine in either 'top' or '/usr/ucb/ps -aux', and they do not consume much CPU.
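The kind of check I did with iostat can be sketched as an awk filter that flags any device whose %busy (the last column of `iostat -x` output) crosses a threshold. The 20% threshold is an arbitrary illustration, and the sample lines are taken from the output in my question below; on a live system you would pipe `iostat -xn 5` through the same filter.

```shell
# Flag devices whose %b (last "iostat -x" column) exceeds 20%.
# Columns: device r/s w/s kr/s kw/s wait actv svc_t %w %b.
# The heredoc reuses sample lines from the question; the 20% cutoff
# is an arbitrary choice for illustration.
awk '$NF + 0 > 20 {
    printf "%s: %s%% busy, %s KB/s read, %s KB/s written\n", $1, $NF, $4, $5
}' <<'EOF'
nfs29     0.0    0.0     0.0     0.0  0.0  0.0   0.0   0   0
nfs30   277.9    0.8   199.9     5.8  0.0  0.3   1.0   3  25
nfs31    47.1   35.9  1413.1  1144.1  0.0  1.0  12.4   2  32
nfs32     0.0    0.0     0.0     0.0  0.0  0.0   0.0   0   0
EOF
```

Run against the sample data, this singles out nfs30 and nfs31, which matches what I found by eyeballing "iostat -xpn".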
My new question: From the following output, it seems that disks 'nfs30' and 'nfs31' have some I/O distribution problems. Is that correct? How can I prove it? The system uses Veritas Volume Manager.

  # iostat -xc 5
  .................
  nfs28     0.0    0.0     0.0     0.0  0.0  0.0   0.0   0   0
  nfs29     0.0    0.0     0.0     0.0  0.0  0.0   0.0   0   0
  nfs30   277.9    0.8   199.9     5.8  0.0  0.3   1.0   3  25
  nfs31    47.1   35.9  1413.1  1144.1  0.0  1.0  12.4   2  32
  nfs32     0.0    0.0     0.0     0.0  0.0  0.0   0.0   0   0
  nfs33     0.0    0.0     0.0     0.0  0.0  0.0   0.0   0   0
  nfs34     0.0    0.0     0.0     0.0  0.0  0.0   0.0   0   0
  .................

Thank you all!
John

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Thu Jun 20 11:22:36 2002
This archive was generated by hypermail 2.1.8 : Thu Mar 03 2016 - 06:42:47 EST