Thank you to both Val Popa and Mr. Krenzischek for their help on this issue. I've set up a script to grab the output of the swap -l command at regular intervals to determine whether we are actually seeing "shrinkage" or not. After I look at that data, I'll see if I need to go for some of the tools that Ryan mentions.

Thanks!

Here is Val Popa's reply:

To see the actual swap, the correct command is: swap -l

If this command shows swap=0, then and only then have you run out of swap. Otherwise, read on.

If df -k shows that /tmp is getting full, that does not mean you are running out of swap. Rather, /tmp is being accessed by someone or something else, and perhaps a log or some other file is being created there, which will cause df -k to show /tmp at 100% or something along those lines.

To verify, do this:

    cd /tmp
    du -sk *

Look at the sizes and you will have found where the bottleneck is. Go there and trace it back to what caused it.

V

---------------

And the one from Mr. Krenzischek:

Check out memtool at http://playground.sun.com/pub/memtool

Also try running the BSD ps under /usr/ucb. Pay particular attention to the MEM and RSS columns. The RSS size is the resident size of a process in RAM. You should make sure that these numbers are within a reasonable range.

The other item you might want to look at is which programs access file systems mounted as type tmpfs. It might not be a memory leak. A program that writes to /tmp might be unlinking a file without first releasing an open read/write fd.

Have you considered running sar? You can record events and then play them back to pinpoint the time (e.g. when a certain batch process runs) at which the most swap pages are requested. Eventually, those pages should be returned after a process finishes.

And of course, those pesky developers always have a tendency to forget that they implemented a change. Have you verified with your development/applications group whether anything has recently changed? For example, I manage certain boxes but the DBAs manage Sybase/Oracle. They can install a new version of ASE or Oracle RDBMS without my assistance.

Check your crontabs. I have had instances where I wrote a script to monitor a process and it just kept on re-spawning itself. Unfortunately, it took 6-8 hours to become noticeable, so it was not apparent at first that a small script was not exiting properly and releasing its memory. Over time, that does add up.

I hope this helps. Good luck.

Ryan

Brian D. Smith

-----Original Message-----
From: Smith Brian D CONT CNIN
Sent: Tuesday, January 06, 2004 1:35 PM
To: sunmanagers@sunmanagers.org
Subject: Swap space leak on Clustered E450's with Solaris 8

We have noticed the following problem on nearly every one of our Sun Cluster 2.2 clusters. Each cluster is a three-node cluster, with each node being an E450 running Solaris 8. They have been running in this configuration for several years.

We have recently noticed that the swap space shrinks over time. By this, I mean that when you do a 'df -k', the total space for swap gets smaller and smaller. Even though Sun support doesn't believe us, we ARE NOT seeing a process USING all of the swap space; we are seeing the actual total amount of swap space shrinking. The swap space will eventually shrink to the point that no swapping or writing to /tmp can be done at all.

I've looked through every FAQ, manual and website that I can find on the subject, but have found nothing on shrinking swap space. Thus far Sun support has been of no help.

I will summarize after I have received replies.

Thanks,
Brian D. Smith
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Tue Jan 6 14:36:39 2004
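The interval sampling of swap -l mentioned in the summary could be sketched roughly as follows. This is a minimal illustration, not the script actually used: it assumes the usual Solaris swap -l column layout (swapfile, dev, swaplo, blocks, free, with sizes in 512-byte blocks), and the function names and the 300-second interval are hypothetical. If the summed "blocks" figure stays constant while "free" drops, swap is being consumed rather than shrinking.

```python
import subprocess
import time


def parse_swap_l(text):
    """Sum the 'blocks' and 'free' columns of swap -l output.

    Assumes the layout: swapfile dev swaplo blocks free
    (sizes in 512-byte blocks). Returns (total_blocks, free_blocks).
    """
    total = free = 0
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 5 or fields[0] == "swapfile":
            continue
        total += int(fields[3])
        free += int(fields[4])
    return total, free


def log_swap(interval_seconds=300):
    """Print a timestamped total/free sample every interval (hypothetical helper)."""
    while True:
        out = subprocess.run(["swap", "-l"], capture_output=True,
                             text=True, check=True).stdout
        total, free = parse_swap_l(out)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} total_blocks={total} free_blocks={free}", flush=True)
        time.sleep(interval_seconds)
```

Calling log_swap() and redirecting its output to a file outside /tmp gives a record that can be compared with the df -k figures over the same period.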
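Ryan's point about a program unlinking a file while still holding an open descriptor can be demonstrated with a small sketch. It assumes a POSIX system, and the function and file names are made up for illustration. Once the name is removed, a directory listing (and therefore du -sk *) no longer shows the file, yet its blocks stay allocated until the descriptor is closed.

```python
import os
import shutil
import tempfile


def unlinked_file_demo(size=65536):
    """Write a file, unlink it while still open, and report what remains.

    Returns (names visible in the directory, link count, size in bytes).
    """
    directory = tempfile.mkdtemp()
    try:
        path = os.path.join(directory, "scratch.dat")
        with open(path, "wb+") as f:
            f.write(b"\0" * size)
            f.flush()
            os.unlink(path)                    # name is gone ...
            visible = os.listdir(directory)    # ... so the listing is empty
            st = os.fstat(f.fileno())          # ... but the data is still held
        return visible, st.st_nlink, st.st_size
    finally:
        shutil.rmtree(directory, ignore_errors=True)
```

On a tmpfs-backed /tmp, that held space comes out of virtual memory, which would fit the pattern of df -k showing less space while du finds nothing to blame.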