Sorry for my late summary. The issue is still under investigation, but I have received lots of valuable information.

Thanks to:

Christophe Dupre
Mike Peppard
Yura Pismerov
Kent Perrier
Tim Chipman
Jay Lessert
Kevin Buterbaugh
Itiu
Riddoch John

And special thanks go to Sid Wilroy, who offered to provide dedicated help on his own time. He provided some test scripts, attached here (the "guds" script), along with a "Solaris Tunable Parameters" Reference Manual. Thanks Sid.

Most answers contain valuable information, so I thought it would be better to paste some paragraphs as they are in the original posts.

###########
My original question,

> We have an E420r server with 4GB RAM, 2xCPU running Solaris 8 that will act
> as a new mail server. There is also an Intel machine running RH Linux 7.1
> for testing purposes.
> Both machines are running the same mail server application.
> I ran a process which imports messages of 1000 accounts from an old mail
> server to the mail server installed on the above machines one machine at a
> time.
> The process running on the Linux box took only 1.5 hours to complete
> importing all the messages from the old mail server while on the Solaris box
> it took 3.5 hours with the same set of accounts.
> Network connectivity between the Sun box and the old mail server is fine at
> 100Mbps/full duplex. Also there is nothing special running on the system that
> would have had any impact on the overall performance.
>
> I would like to know if there are any settings or kernel parameters that
> should be reconfigured to get the optimum Read/Write speed and performance
> out of the Solaris box?

#############
Christophe Dupre

You first need to figure out if the process is CPU bound or I/O bound. It will tell you where you need to look to increase performance. Use iostat, vmstat and top. iostat will tell you if the I/O on the disk is not well balanced. vmstat will tell you if you run out of memory. top (or mpstat) will tell you how the CPUs are being used.
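As a rough sketch of the sampling described above, the three tools can be started together so that their output covers the same time window. This is only an illustration: the function name `sample_tools`, the output directory, and the default interval and count are assumptions, not part of the original reply, and tools that are not installed are simply skipped.

```shell
# Hypothetical helper (not from the original reply): sample vmstat,
# iostat -x and mpstat over the same window, one output file per tool.
#   $1 = output directory, $2 = interval in seconds, $3 = sample count
sample_tools () {
    outdir=$1
    interval=${2:-5}
    count=${3:-12}
    mkdir -p "$outdir" || return 1
    for tool in vmstat "iostat -x" mpstat
    do
        name=${tool%% *}        # first word, used for the file name
        if command -v "$name" > /dev/null 2>&1
        then
            # $tool is left unquoted on purpose so "iostat -x" splits
            $tool "$interval" "$count" > "$outdir/$name.out" 2>&1 &
        fi
    done
    wait                        # block until every sampler has exited
}
```

Run it while the import is in progress, for example `sample_tools /var/tmp/import-stats 5 60`, and then compare the CPU, memory and disk columns of the three files for the same period.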
If you see a lot of iowait, that means the disk or network is holding things up. If your processors are running at 100%, then the CPU is not fast enough. Note that you have two CPUs, but you are most probably using only one, since most import programs that I know of are single-threaded. See if you can start two import jobs at the same time using two lists of old accounts. That should balance things on both processors.

----------------------
Mike Peppard

Solaris is not very good at being a mailserver. It has to do with an unoptimized tcp stack and the way it handles disk access. Linux or BSD with softupdates are far better at this type of application, even with marginal equipment. As you discovered.

... use a pentium 200mhz with the fastest disk you can find dedicated to the var partition and run linux or bsd. It will be faster than an E450. I tried it myself.

-----------------------
Jay Lessert

1) Well, the Linux box is what, a 1GHz PIII? A 1.5GHz P4? It would be very difficult to buy an Intel processor as slow as your E420. More likely your Linux box is 2X-3X faster.

2) HOWEVER, if you're running sendmail or postfix or qmail, you're not strictly CPU limited. These guys spend most of their time forking processes and creating/deleting lots of small files. But if you're running some strange MTA that is CPU-bound, there's your answer.

3) On forking processes, bad news for your E420 again. Linux is faster, much faster. But that's still probably not it.

4) Creating/deleting lots of small files, that's probably it. The default file system for RH is ext2, and one of the defaults for ext2 is "async". This means if you create/delete a directory entry, you record that fact in buffer memory, but you don't push it to disk until some time when it's convenient. If you pull the power plug before the "convenient" time comes, you're in big trouble. It is *very* unsafe, but it's fast. You should *never*, *never* run a mail spool this way.
Actually, you should *never* run *any* file system this way that has permanent data on it either, but most people do because they don't know any better. Did I mention it's fast?

If this is the limit, you can test by remounting the mail spool with sync. This will make it look pretty much like the Solaris ufs file system:

    % sudo mount -o remount,sync /var   # or wherever your spool is...

Write a script to create/delete a thousand files, you'll *know* when you're running sync or async!

5) Do all your Solaris ufs file systems have logging turned on? This is safe *and* fast (but not as fast as async). Solaris mount has no async option (it really is not safe), but you can turn it on as an experiment with fastfs (http://www.science.uva.nl/pub/solaris/fastfs.c.gz).

--------------------------
Kevin Buterbaugh

Probably the biggest factor in performance in this scenario is the disk subsystem. How is that configured (what type of disks / arrays, are you using Veritas or Solstice DiskSuite, what type of RAID, etc.)? Software RAID 5, for example, is horrible for write performance.

-------------------------
Itiu

What you've experienced is something that's common. Sun boxes are underperforming compared to intel boxes running either Linux or Windows. Just too bad.

--------------------------
Sid Wilroy

From your problem description, I can't really tell what the bottleneck on your system is. I am going to attach a script called "guds" and instructions for running it.

Performance troubleshooting consists of these steps:

1) Find a baseline for the system. How does it normally work, or how did it normally work? (Performance scripts can be used to obtain this measurement.)
2) What is "slow"? How much slower is it? When did this start? What applications run on the machine? Who perceives the problem?
3) The System Administrator runs performance scripts.
4) The System Administrator analyzes the output of the performance scripts to identify the bottlenecks.
5) The System Administrator fixes the bottlenecks.
6) Rerun the performance scripts to see if the system has improved. If it has not, go back to step 3.

I'm attaching a script called "guds". Ideally, this script should be run once at a time when the machine is running poorly and once at a time when the machine is running well, so that comparison data can be obtained.

Note: The script needs to be run on the same machine both times. You cannot compare one machine to another.

Please put this file on your unix box. Now, run the guds script. The script will ask you two questions that you need to answer.

The first question is the number of iterations. If you haven't been told another number, please enter the number 5.

The second question is the INTERVAL between iterations. If you haven't been told a number, please enter 300.

After the script is done running, it will make a tar file of the information for analysis. The script will tell you the name of the file.

---------------------------

Regards,
Hisham

[demime 0.99c.1 removed an attachment of type application/pdf which had a name of Tunables.pdf]

#!/bin/ksh

VERSION="0.2"   # Last Modifications on 10-15-2001
# vmstat, mpstat, iostat, run continuously
# fixed packaging
#VERSION="0.12 - Modification Date 03-14-2001"
# Fixed typo, freemem not freemen

# Set File Variable Names
PID=$$                          # PID of this process
PDIR="/var"                     # Parent directory for perf data
ADB="adb.out"
NETSTAT="netstat"               # Netstat data will be put in here
IOSTAT="iostat.out"             # iostat data will be put in here
IOSTATXPN="iostatxpn.out"
MPSTAT="mpstat.out"             # mpstat data will be put in here
VMSTAT="vmstat.out"             # vmstat data will be put in here
VMSTATI="vmstat-i.out"
VMSTATP="vmstatp.out"           # vmstat data will be put in here
PS="psberkeley.out"             # process information will be put in here
PSA="psatt.out"                 # process information will be put in here
MOD="modinfo.out"               # modinfo stuff
KMASTAT="kmastat.out"
NFSSTAT="nfsstat.out"           # nfs statistics
NETSTATA="$NETSTAT/netstat-a.out"
NETSTATI="$NETSTAT/netstat-i.out"
NETSTATK="$NETSTAT/netstat-k.out"
NETSTATM="$NETSTAT/netstat-m.out"
NETSTATS="$NETSTAT/netstat-s.out"
NETSTATRN="$NETSTAT/netstat-rn.out"
LOCKSTAT="lockstat.out"
IPCS="ipcs-A.out"
UPTIME="uptime.out"
SARQ="sar-q.out"

Check_for_Root ()
{
    id | grep root 1>&2 > /dev/null
    if [ $? -ne 0 ]
    then
        echo "You must be root to run this!"
        exit
    fi
} # Check_for_Root

adb_calls ()
{
    cd $PDIR/$DIR
    date >> $ADB
    echo "\nLotsfree" >> $ADB
    echo "lotsfree/K" | adb -k >> $ADB
    echo "lotsfree/D" | adb -k >> $ADB
    echo "lotsfree/E" | adb -k >> $ADB
    echo "\nMinfree" >> $ADB
    echo "minfree/D" | adb -k >> $ADB
    echo "minfree/K" | adb -k >> $ADB
    echo "minfree/E" | adb -k >> $ADB
    echo "\nAvefree" >> $ADB
    echo "avefree/K" | adb -k >> $ADB
    echo "avefree/D" | adb -k >> $ADB
    echo "avefree/E" | adb -k >> $ADB
    echo "\nFreemem" >> $ADB
    echo "freemem/D" | adb -k >> $ADB
    echo "freemem/K" | adb -k >> $ADB
    echo "freemem/E" | adb -k >> $ADB
    echo "\nAvefree30" >> $ADB
    echo "avefree30/D" | adb -k >> $ADB
    echo "avefree30/E" | adb -k >> $ADB
    echo "avefree30/K" | adb -k >> $ADB
    echo "\nK_anoninfo" >> $ADB
    echo "k_anoninfo/D" | adb -k >> $ADB
    echo "k_anoninfo/E" | adb -k >> $ADB
    echo "k_anoninfo/K" | adb -k >> $ADB
} # adb_calls

Iterations ()
{
    echo ""
    echo "Please enter the number of ITERATIONS that need to be done."
    echo "If you have not been given instructions for this number,"
    echo "enter 5."
    read COUNT
}

Seconds ()
{
    echo ""
    echo "Please enter the NUMBER of SECONDS to wait between each iteration."
    echo "If you have not been given instructions for this number,"
    echo "enter 300"
    read INTERVAL
}

Case_Number ()
{
    echo ""
    echo "Please enter your case number."
    echo "The case number should only consist of digits."
    read CASE
}

Create_Directories ()
{
    cd $PDIR
    if [ -r $DIR ]
    then
        rm -rf $DIR
    fi
    mkdir $DIR 2> /dev/null
    if [ $? -ne 0 ]
    then
        echo "Please change permissions so that"
        echo "the directory $DIR can be created."
        exit 1
    fi
    cd $DIR
    if [ ! -r "$NETSTAT" ]
    then
        mkdir $NETSTAT
        if [ $? -ne 0 ]
        then
            echo "Please change permissions so that"
            echo "the directory $NETSTAT can be created."
            exit 1
        fi
    fi
    # Work in the case directory, so that the relative output files
    # created here and in the main loop all end up in the tarball.
    cd $PDIR/$DIR
    touch $IOSTAT
    if [ $? -ne 0 ]
    then
        echo "The file $IOSTAT could not be created."
        exit 1
    fi
    echo ""
    echo "\tCreated the $IOSTAT file."
    mkdir $PDIR/$DIR/messages 2> /dev/null
    echo "\tCreated the messages directory."
    touch $UPTIME
    if [ $? -ne 0 ]
    then
        echo "The file $UPTIME could not be created."
        exit 1
    fi
    uptime 1>> $UPTIME 2> /dev/null
    echo "\tCreated the $UPTIME file."
    touch $SARQ
    if [ $? -ne 0 ]
    then
        echo "The file $SARQ could not be created."
        exit 1
    fi
    echo "\tCreated the $SARQ file."
    touch $IPCS
    if [ $? -ne 0 ]
    then
        echo "The file $IPCS could not be created."
        exit 1
    fi
    echo "\tCreated the $IPCS file."
    touch $KMASTAT
    if [ $? -ne 0 ]
    then
        echo "The file $KMASTAT could not be created."
        exit 1
    fi
    echo "\tCreated the $KMASTAT file."
    touch $IOSTATXPN
    if [ $? -ne 0 ]
    then
        echo "The file $IOSTATXPN could not be created."
        exit 1
    fi
    echo "\tCreated the $IOSTATXPN file."
    touch $MPSTAT
    if [ $? -ne 0 ]
    then
        echo "The file $MPSTAT could not be created."
        exit 1
    fi
    echo "\tCreated the $MPSTAT file."
    touch $VMSTATP
    if [ $? -ne 0 ]
    then
        echo "The file $VMSTATP could not be created."
        exit 1
    fi
    echo "\tCreated the $VMSTATP file."
    touch $VMSTAT
    if [ $? -ne 0 ]
    then
        echo "The file $VMSTAT could not be created."
        exit 1
    fi
    echo "\tCreated the $VMSTAT file."
    touch $VMSTATI
    if [ $? -ne 0 ]
    then
        echo "The file $VMSTATI could not be created."
        exit 1
    fi
    echo "\tCreated the $VMSTATI file."
    touch $PS
    if [ $? -ne 0 ]
    then
        echo "The file $PS could not be created."
        exit 1
    fi
    echo "\tCreated the $PS file."
    touch $PSA
    if [ $? -ne 0 ]
    then
        echo "The file $PSA could not be created."
        exit 1
    fi
    echo "\tCreated the $PSA file."
    touch $NFSSTAT
    if [ $? -ne 0 ]
    then
        echo "The file $NFSSTAT could not be created."
        exit 1
    fi
    echo "\tCreated the $NFSSTAT file."
    touch $MOD
    if [ $? -ne 0 ]
    then
        echo "The file $MOD could not be created."
        exit 1
    fi
    echo "\tCreated the $MOD file."
    touch $LOCKSTAT
    if [ $? -ne 0 ]
    then
        echo "The file $LOCKSTAT could not be created."
        exit 1
    fi
    echo "\tCreated the $LOCKSTAT file."
    touch $ADB
    if [ $? -ne 0 ]
    then
        echo "The file $ADB could not be created."
        exit 1
    fi
    echo "\tCreated the $ADB file."
} # Create_Directories

#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33

# Get Required Input from User
echo ""
echo "Performance Script"
echo "------------------"
echo ""
echo "This script will run on your system for an amount of time"
echo "determined by the number of iterations and the number of"
echo "seconds between iterations that you set. -- If you have"
echo "not been given specific values, please enter in 5 for the"
echo "number of iterations and 300 for the number of seconds."
echo "That will cause the script to run for 20 minutes."
echo
echo "For the best analysis, you should run this script twice."
echo "1) Run the script when the system is experiencing the performance"
echo " problem."
echo "2) Run it again when the machine is doing well."
echo
echo "This will allow the kernel engineer working on the call"
echo "to compare bad data to baseline data. Some numbers are"
echo "meaningless without baseline data, such as system calls,"
echo "and interrupts."
echo
echo "This script will gather data from your system and put it"
echo "in a directory called /var/12345678, where 12345678 is"
echo "your case number."
echo ""
echo "The data gathered will be things such as:"
echo "- /etc/system \t\t- /etc/vfstab"
echo "- /etc/release\t\t- vmstat data"
echo "- iostat data\t\t- mpstat data"
echo "- lockstat data\t\t- netstat data"
echo "- ndd data\t\t- listing of raw disks"
echo "\n\n\n"

Check_for_Root

bad=1
while [ $bad -ne 0 ]
do
    Case_Number
    if [ "$CASE" -gt 0 ]
    then
        echo "Case Number is $CASE"
    else
        echo "bad: Invalid Number"
    fi 2>&1 | grep -c bad | read bad
    DIR=$CASE
    FILE="$PDIR/$DIR/$CASE-perfout.tar"     # Output data will be stored here
done

bad=1
while [ $bad -ne 0 ]
do
    Iterations
    if [ "$COUNT" -gt 0 ]
    then
        echo "COUNT is $COUNT"
    else
        echo "bad: Invalid Number"
    fi 2>&1 | grep -c bad | read bad
done

bad=1
while [ $bad -ne 0 ]
do
    Seconds
    if [ "$INTERVAL" -gt 0 ]
    then
        echo "INTERVAL is $INTERVAL"
    else
        echo "bad: Invalid Number"
    fi 2>&1 | grep -c bad | read bad
done

echo
echo "\n\n"
echo ""
Create_Directories
echo ""
echo ""

#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33

echo "Starting performance script"

# Write the user's output to a file, so we know what user typed in.
echo "INTERVAL = $INTERVAL\nCOUNT = $COUNT" > $PDIR/$DIR/user.answers 2> /dev/null
echo "VERSION = $VERSION" >> $PDIR/$DIR/user.answers 2> /dev/null
echo "This script just wrote your input to a file called user.answers"

# Make copies of some important system files
cat /etc/system > $PDIR/$DIR/system 2> /dev/null
cat /etc/vfstab > $PDIR/$DIR/vfstab 2> /dev/null
cat /etc/release > $PDIR/$DIR/release 2> /dev/null
cat /kernel/drv/st.conf > $PDIR/$DIR/st.conf 2> /dev/null
cat /kernel/drv/sd.conf > $PDIR/$DIR/sd.conf 2> /dev/null
cp -prf /var/adm/mess* $PDIR/$DIR/messages 2> /dev/null
cp -prf /etc/path_to_inst $PDIR/$DIR/path_to_inst 2> /dev/null
echo ""
echo "This script just copied your system file, vfstab, release,"
echo "st.conf, sd.conf, messages, and path_to_inst\n"

# Run some commands to see the output, single instance
isainfo -v > $PDIR/$DIR/isainfo-v.out 2> /dev/null
sysdef -i > $PDIR/$DIR/sysdef-i.out 2> /dev/null
showrev -p > $PDIR/$DIR/showrev-p.out 2> /dev/null
prtconf -vp > $PDIR/$DIR/prtconf-vp.out 2> /dev/null
/usr/platform/sun4u/sbin/prtdiag -v > $PDIR/$DIR/prtdiag-v.out 2> /dev/null
uname -a > $PDIR/$DIR/uname-a.out 2> /dev/null
pkginfo -l > $PDIR/$DIR/pkginfo-l.out 2> /dev/null
vxprint -th > $PDIR/$DIR/vxprint-th.out 2> /dev/null
vxdisk list > $PDIR/$DIR/vxdisk-list.out 2> /dev/null
metastat > $PDIR/$DIR/metastat.out 2> /dev/null
ls -lisa /dev/rdsk >> $PDIR/$DIR/ls-lisa_dev_rdsk.out 2> /dev/null
mount > $PDIR/$DIR/mount.out 2> /dev/null
echo ""
echo "Ran isainfo-v, sysdef-i, showrev-p, prtconf-vp,"
echo "prtdiag-v, uname-a, pkginfo-l, vxprint-th, vxdisklist,"
echo "metastat, ls, mount."

# Launch the iostat, mpstat, and vmstat commands that don't need
# to be in a loop
vmstat $INTERVAL $COUNT >> $PDIR/$DIR/$VMSTAT 2> /dev/null &
PID1="$!"
vmstat -i $INTERVAL $COUNT >> $PDIR/$DIR/$VMSTATI 2> /dev/null &
PID2="$!"
vmstat -p $INTERVAL $COUNT >> $PDIR/$DIR/$VMSTATP 2> /dev/null &
PID3="$!"
mpstat $INTERVAL $COUNT >> $PDIR/$DIR/$MPSTAT 2> /dev/null &
PID4="$!"
iostat -xpn $INTERVAL $COUNT >> $PDIR/$DIR/$IOSTATXPN 2> /dev/null &
PID5="$!"
iostat -xtc $INTERVAL $COUNT >> $PDIR/$DIR/$IOSTAT 2> /dev/null &
PID6="$!"
echo ""
echo "Started vmstat, iostat, and mpstat in the background."
echo "\n\n\n"

LOOP=1
LAST=$COUNT+1
while [ ${LOOP} -lt ${LAST} ]
do
    # Put a timestamp in the single instance commands.
    date >> $PDIR/$DIR/$VMSTAT
    date >> $PDIR/$DIR/$VMSTATI
    date >> $PDIR/$DIR/$VMSTATP
    date >> $PDIR/$DIR/$MPSTAT
    date >> $PDIR/$DIR/$IOSTAT
    date >> $PDIR/$DIR/$IOSTATXPN

    # The old Berkeley style ps
    echo "Date is `date`" >> $PS
    /usr/ucb/ps -auxww >> $PS
    sleep 2
    echo "Date is `date`" >> $PS
    /usr/ucb/ps -auxww >> $PS
    echo "\n\n\n" >> $PS

    # The at&t ps for John Kennedy
    echo "\n\n" >> $PSA
    echo "Date is `date`" >> $PSA
    /usr/bin/ps -eo 'user s pri pid ppid pcpu pmem vsz rss stime time args' >> $PSA
    sleep 1
    echo "\n\n" >> $PSA
    echo "Date is `date`" >> $PSA
    /usr/bin/ps -eo 'user s pri pid ppid pcpu pmem vsz rss stime time args' >> $PSA

    # sar -q
    echo "\n\n\n" >> $SARQ
    echo "Date is `date`" >> $SARQ
    sar -q 5 5 1>> $SARQ 2> /dev/null &

    # ipcs -A, to check Inter Process Communications
    echo "\n\n\n" >> $IPCS
    echo "Date is `date`" >> $IPCS
    ipcs -A 1>> $IPCS 2> /dev/null &

    # lockstat, to check for kernel lock contention
    echo "\n\n\n" >> $LOCKSTAT
    echo "Date is `date`" >> $LOCKSTAT
    lockstat -A sleep 2 1>> $LOCKSTAT 2> /dev/null &

    # kmastat
    echo "\n\n\n" >> $KMASTAT
    echo "Date is `date`" >> $KMASTAT
    echo "kmastat" | crash >> $KMASTAT

    adb_calls

    # Network Stuff
    echo "\n\n\n" >> $NETSTATA
    echo "Date is `date`" >> $NETSTATA
    echo "\n\n\n" >> $NETSTATI
    echo "Date is `date`" >> $NETSTATI
    echo "\n\n\n" >> $NETSTATK
    echo "Date is `date`" >> $NETSTATK
    echo "\n\n\n" >> $NETSTATM
    echo "Date is `date`" >> $NETSTATM
    echo "\n\n\n" >> $NETSTATS
    echo "Date is `date`" >> $NETSTATS
    echo "\n\n\n" >> $NETSTATRN
    echo "Date is `date`" >> $NETSTATRN
    netstat -a >> $NETSTATA &
    netstat -i >> $NETSTATI &
    netstat -k >> $NETSTATK &
    netstat -rvn >> $NETSTATRN &
    netstat -m >> $NETSTATM &
    netstat -s >> $NETSTATS &

    echo "Date is `date`" >> $MOD 2> /dev/null
    modinfo | grep "SCSI" >> $MOD 2> /dev/null &
    echo "\n\n\n" >> $MOD 2> /dev/null

    echo "Date is `date`" >> $NFSSTAT
    nfsstat >> $NFSSTAT
    echo "\n\n\n" >> $NFSSTAT

    echo "Just finished Interval # $LOOP"
    echo "$LOOP + 1" | bc | read LOOP
    if [ ${LOOP} -lt ${LAST} ]
    then
        echo ""
        echo "Sleeping for $INTERVAL seconds before running next interval."
        sleep $INTERVAL
        echo ""
    fi
done

#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33

# Wait for all processes to finish
echo
echo
wait $PID1
wait $PID2
wait $PID3
wait $PID4
wait $PID5
wait $PID6

NUM=`ps -ef | grep $PID | egrep -v "defunct|goods" | grep -v "grep $PID" | wc -l | awk '{ print $1 }'`
while [ $NUM -ne 0 ]
do
    echo
    echo "--- Waiting for all sub processes to end before finishing ---"
    echo "--- There are $NUM processes still running. ---"
    ps -ef | grep $PID | grep -v "defunct" | grep -v "goods" | grep -v "grep $PID" | while read line
    do
        echo "\t$line"
    done
    NUM=`ps -ef | grep $PID | grep -v "defunct" | grep -v "goods" | grep -v "grep $PID" | wc -l | awk '{ print $1 }'`
    sleep 5
    echo
done
echo "--- All processes finished. ---"
echo
echo

#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33

echo ""
echo "Making a uuencoded, compressed tarball."
cd $PDIR/${DIR}
tar -cf ${FILE} *
compress $FILE
uuencode $FILE.Z $CASE.perf.tar.Z > $FILE.Z.uu
chmod 777 $FILE.Z.uu

#####################################################################33
#####################################################################33
#####################################################################33
#####################################################################33

echo "\n\n\n"
echo "Please give the data to your support engineer."
echo "The data is stored in the file $FILE.Z.uu"
echo "- You can email the data or ftp it."
echo "- Distinguish the data as from a busy time or idle time by adding"
echo " the words busy or idle to the file name."
echo "\n\n\n"
echo "Instructions for emailing the data"
echo "\tPut the case number, $CASE, in the subject line of the email."
echo "\tSend a .uu version of the data. This is uuencoded and "
echo "\tsurvives email better."
echo "\n\n\n"
echo "Instructions for uploading a file to SunSolve."
echo "\tftp sunsolve.sun.com"
echo "\t(192.18.99.148)"
echo "\tlogin: anonymous"
echo "\tpasswd: your email address"
echo "\tcd /cores"
echo "\tbinary"
echo "\tput $FILE"
echo "\tquit"
echo "\tThen please call your support engineer or send them an email"
echo "\tand notify them that the data is on sunsolve."
echo "\tYou can reach your support engineer by phone by calling"
echo "\t1800usa4sun, hitting 1, hitting 1, then dialing $CASE"

Received on Sat Dec 1 08:16:41 2001
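Jay Lessert's suggestion above (write a script that creates and deletes a thousand files, then compare a sync mount with an async one) could be sketched roughly as follows. This is a minimal illustration, not something any of the respondents posted: the function name `churn_files`, the file naming, and the default count are all assumptions.

```shell
# Hypothetical helper (not from the thread): create COUNT small files
# in DIR, then delete them again, to exercise directory updates.
#   $1 = directory on the file system under test, $2 = file count
churn_files () {
    dir=$1
    count=${2:-1000}
    i=0
    while [ "$i" -lt "$count" ]
    do
        echo "x" > "$dir/churn.$$.$i" || return 1   # one tiny file per pass
        i=$((i + 1))
    done
    i=0
    while [ "$i" -lt "$count" ]
    do
        rm -f "$dir/churn.$$.$i" || return 1
        i=$((i + 1))
    done
}
```

Under ksh or bash, something like `time churn_files /var/spool/churn-test 1000` (the directory is a placeholder and must already exist on the spool file system) can be run once with each mount option. A large difference in elapsed time between the two runs would point at synchronous directory updates as the bottleneck.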