SUMMARY: E3000 - Solaris 2.5.1 - Sybase: Which scales better, lots of cpus or fewer faster cpus?

From: O'Neal,Chris (onealwc@agedwards.com)
Date: Mon Nov 09 1998 - 13:06:26 CST


SYNOPSIS: How well does the combo of E3000, Solaris 2.5.1, and Sybase
11.03 scale? How does it scale best? Lots of cheap CPUs - or - a few
fast and much more expensive CPUs? We have to reduce the time of nightly
Sybase jobs - what is the better solution... lots of cheap CPUs - or - a
few expensive ones?

HARDWARE:
Database engine.....
(1) Sun Ultra Enterprise 3000
- (2) 167mhz 512kb cache CPUs
- (1) bank of 256mb RAM
- (6) Internal 2gb SCSI HD
- Solaris 2.5.1
- Sybase 11.03

Database workers....
(6) Sun SPARCstation 20
- (1) 200mhz 1mb cache ROSS CPU.
- 256mb RAM
- (2) 2gb 7200rpm SCSI HD
- SunOS 4.1.4 w/Y2K
- In-house developed c & csh programs

APPLICATION:
Nightly re-pricing of different database tables.

NOTE:
This is a large SUMMARY because there was no "the answer" but there were
lots of great responses which I think need recording in the Sun-Managers
Archives. Sometimes you can ask a question and read your desired
answer out of the responses. Both I and my co-worker caught ourselves
doing this.

POINT OF CONFUSION:
The board that the CPUs and RAM go into is called "BOARD, CPU/MEMORY"
in Sun's 1998 Spares Reference Guide. I had thought this to be the
"CLOCK BOARD" but it is not. If we were to go from 167mhz to 336mhz
CPUs, not only would I need a new "BOARD, CPU/MEMORY", but I would also
need a new "CLOCK BOARD" costing an additional $2,000. This is an
additional cost that needs to be added to all 250 and 336 cpu studies
unless otherwise noted.

In my original posting and feedback to responses I had received, I
misidentified a part and ignored another needed part; I was wrong. I
was incorrectly calling the "BOARD, CPU/MEMORY" the "CLOCK BOARD" and
not recognizing that a new "CLOCK BOARD" was still needed for the
latest, greatest, faster CPUs. That is... you need one CLOCK BOARD per
system and one BOARD, CPU/MEMORY per two cpus.

I did not correct any of these errors below so as not to somehow change
the meaning of what the original authors were saying. Text in <<< xxxxx
>>>> is a correction note added by me after the fact.

ORIGINAL QUESTIONS:
How well does the combo (E3000, Solaris 2.5.1, Sybase 11.03) scale?
What scales better for database usage: more slower cpus or fewer faster
cpus?
What performance do you lose in overhead for each cpu you add?

SUPPORTING DOCUMENTATION:
We have a Sun 3000 w/ (2) 167mhz cpus running Sybase 11.03. We run
nightly pricing on different product types each night (previous weekly
total of pricing time = 17 hours). We are studying what it will take to
run pricing of all product types every night (get 17 hours worth of work
done in 5+ hours = 240% improvement).

Monitoring shows that some additional memory is needed, but hard disk
drives and network are not that pressed. After consulting with Sybase,
we have broken the jobs from a single nightly SPID into multiple SPIDs
and this has helped a lot (current weekly total of pricing time = 8
hours). Memory gets tight but the 3000 is not thrashing (will be adding
more), and network and hard disk drives are still not pressed, but the
cpus are now at 100% and 80%+ each. To get 8 hours of work done in 5+
hours = 60% improvement.
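The improvement percentages above follow from (old time - new time) / new
time; a quick sketch using the figures from this posting (times in hours):

```shell
# Improvement formula used in this posting: (old - new) / new * 100.
improvement() {
    old=$1; new=$2
    awk -v o="$old" -v n="$new" 'BEGIN { printf "%.0f%%\n", (o - n) / n * 100 }'
}
improvement 17 5   # 17 hours of work in 5 hours -> 240%
improvement 8 5    # 8 hours of work in 5 hours  -> 60%
```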

I would like to reuse the two existing 167mhz (SPECint_rate95 112 +/-)
and add another four 167mhz for a total of six (SPECint_rate95 336+/-)
costing $11,980. In theory, this is a two-fold increase; I believe this
will cover the 60% improvement I need.

One of my co-workers disagrees. He believes we need fewer but faster
processors to meet the goal. He recommends scrapping the existing clock
board <<<board, cpu/memory and clock board>>> and cpus and buying either
two 336mhz w/ 4mb cache (SPECint_rate95 252 +/-) for $22,500 <<< $24,500
including price for new clock board >>> or four 250mhz w/ 4mb cache
(SPECint_rate95 370) for $32,600 <<< $34,600 including price for new
clock board >>>.
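One way to frame the disagreement is raw cost per unit of SPECint_rate95
(prices and ratings as quoted above, including the $2,000 clock board
where it is needed). This ignores single-thread speed, which is the
co-worker's whole point, so it is only one side of the argument:

```shell
# Dollars per SPECint_rate95 point for each upgrade option,
# using the prices and ratings quoted in this posting.
cost_per_spec() {
    awk -v c="$1" -v s="$2" 'BEGIN { printf "%.2f\n", c / s }'
}
cost_per_spec 11980 336   # six 167mhz:                  ~35.65 $/point
cost_per_spec 24500 252   # two 336mhz (incl. clock):    ~97.22 $/point
cost_per_spec 34600 370   # four 250mhz (incl. clock):   ~93.51 $/point
```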

So, what do other 3000 admins think? What are your experiences with
databases utilizing multiple cpus?

ANSWER:
A wide range of responses with great details but no single "the answer".
I should have expected this since I did not explain in detail how the
database is built, how it is used, and what the nightly processes are
doing.

The ultimate decision went through three cycles: First, add (4) more
167mhz cpus to give us a total of (6) - one for the OS, five for Sybase.
Second, {and then Solaris 7 (64bit) was released} remove the existing
(2) 167mhz and replace with (2) 336mhz cpus. Third and final answer: do
nothing with cpus, just add RAM.

Since this E3000 is scheduled for replacement in one to two years, and
as of yesterday we were not planning to go to Solaris 7 on this box, I
think the (6) 167mhz would work great. Now we might go to Solaris 7
(Sybase dependent) and this box may not be replaced after all. So we are
just adding RAM at this time and saving our money.

The option I was favoring was 6x167; the option my co-worker was
favoring was 2x336. I believe that 6x167 would get us there if the E3000
and Solaris 2.5.1 and Sybase 11.03 scaled well. And that was my main
question: "How well does this combo scale?". My general take on the
responses is "it scales quite well".

THANK YOU:
Thank you(s) and response(s) are listed in order received. I would like
to thank everyone that offered their help. Most particularly I would
like to thank Peter Polasek for his time in responding in such technical
detail. It was a great help.

Martin <martin@stavanger.geoquest.stb.com>
Mark Mellman <mellman@surfcity.ne.mediaone.net>
Bill L. Sherman <Bill.Sherman@bridge.bellsouth.com>
Peter L. Wargo <plw@ncgr.org>
Peter Polasek <pete@cobra.brass.com>
Rick Niziak <rniziak@kappys.med.iacnet.com>
Rose Robert <Robert.Rose@ag.gov.au>
Chuck <seeger@cise.ufl.edu>
Marc <MARC.NEWMAN@chase.com>
Carlo Tosti <carlot@interlog.com>
Rogerio Rocha <rogerio_rocha@bvl.pt>

POSTED:
Chris O'Neal <coneal@agedwards.com> - 11/06/1998

RESPONSES:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin <martin@stavanger.geoquest.stb.com> wrote:

Please be aware that the 167MHz CPU comes in two versions:
  - 167mhz w/ 0.5mb cache
  - 167mhz w/ 1.0mb cache
The 250MHz also comes in two versions:
  - 250mhz w/ 1.0mb cache
  - 250mhz w/ 4.0mb cache

In addition you need to upgrade your mainboard(s) from model X2600 to
model X2601 when you use CPUs with 4mb cache (UGSB2600-2601-A).

<<<I could find no part named "mainboard" in spares. Martin may be
talking about the "board, cpu/memory" or the "clock board"; they are two
different things (originally I thought they were the same, I was
wrong)>>>

They claim that databases benefit from large fast CPU cache, so from
that point of view I would choose the 336mhz CPUs, which
will also give you the option of adding more 336mhz CPUs in the future
if needed. Did you include the cost of extra mainboards in your
6*167mhz example?

<<<If here "mainboards" means "boards, cpu/memory" then yes, this cost
was included in the original $ for 167. No new clock-board is needed if
we stay with 167. What was not included was the extra $2,000 needed for
the new "clock-board" required by either the 250mhz or 336mhz cpus.>>>

When programs are not multithreaded then 2*336 will be far better than
4*167 even if they give you fairly similar SPECint_rate95.

<<<I was looking at 6*167, and they give lots more SPECint_rate95 than
2*336. I am told that Sybase 11 is multithreaded and that Sybase 11.05
is the best for performance. Sybase 11.09 has "cell" level locking, and
some near-future version of Sybase will combine 11.05 and 11.09
into one release. We are studying whether we should move from 11.03 to
11.05 now or wait for this near-future release. >>>
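Martin's point about non-threaded programs can be made concrete with
Amdahl's law: only the parallel fraction of the work benefits from more
CPUs, while per-CPU speed helps everything. A rough sketch (the 0.9
parallel fraction is an assumption for illustration, not a measured
value for these jobs):

```shell
# Amdahl's law sketch: speedup = s / ((1 - p) + p / n), where
# p = parallel fraction of the work (assumed), n = CPU count,
# s = per-CPU speed relative to a 167mhz CPU.
amdahl() {
    p=$1; n=$2; s=$3
    awk -v p="$p" -v n="$n" -v s="$s" \
        'BEGIN { printf "%.2f\n", s / ((1 - p) + p / n) }'
}
amdahl 0.9 6 1   # six 167mhz            -> 4.00x
amdahl 0.9 2 2   # two 336mhz (~2x clock) -> 3.64x
```

At a lower parallel fraction the two fast CPUs pull ahead, which is
exactly why the answer depends on how the nightly SPIDs divide up.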

Martin <martin@stavanger.geoquest.stb.com> in his second response wrote:

The E3000 has three main types of boards :

clock-board : thin ~1" card with keyboard and two serial ports
              All systems shipped before ~Q1-97 had the old
              clock-board. You must upgrade the clock-board
              if you want to run CPUs faster than 167MHz.
CPU-memory board : can take up to 2 CPUs and up to 2 memory groups
              The first version of the CPU-memory board, X2600
              (part# 501-2976) can not utilize all fast cache on CPUs
              with more than 1mb fast cache. The newer X2601 card
              will support 4mb fast cache CPUs. There is a CPU-memory
              board upgrade option UGSB2600-2601-A.
I/O boards : SBus, PCI and Graphics I/O boards.

Assuming you don't have spare parts laying around, you need
the following to upgrade from 2 to 6 167MHz CPUs :
   - one power supply module (to go above the two CPU-memory board
     slots) unless you already have 2 or 3 power supply modules
   - two CPU-memory boards
   - four 167MHz CPUs (preferably only with 1.0mb fast cache)
   - memory: if you have 512mb memory today (two 256mb memory groups),
     then I do recommend adding 512mb memory on both extra memory
     boards for a total of 1536mb. If you have 256mb today (one 256mb
     group), then I recommend adding one 256mb memory group on each of
     the new/extra CPU-memory boards for a total of 768mb memory.
       - It's recommended to balance/spread memory as equally as
         possible on CPU-memory boards.
       - In addition, it's not recommended (may not even work) to use
         two memory groups of different size on one CPU-memory board.
       - A system with two equally sized memory groups on each
         CPU-memory board will be faster than a system with one memory
         group on each CPU-memory board due to memory interleaving.
       - On a 6 CPU system it's preferable to have 6 equal size memory
         groups like: 6*64mb, 6*256mb or 6*1024mb (my opinion is
         that 64mb memory groups are a "big" joke and a waste of time
         on an expensive E3000 with the current "low" memory prices).
     On http://www.memoryx.com/sun.htm they have
     new Sun 256mb memory kits for just under $800 (1024mb total <$3200)
     (assuming you have 512mb and are going for 1536mb memory).

You should not need the clock-board upgrade if you continue with
167MHz CPUs only.

An E3000 will scale nearly linearly when you add CPUs, which is
something most/all systems with Intel CPUs will not.

Example from our E3000, which should have had 2048mb memory (instead of
512mb) at least a year ago (please notice the 2-way memory Interleave
Factor) :-(

> System Configuration: Sun Microsystems sun4u 4-slot Ultra Enterprise
> 3000
> System clock frequency: 82 MHz
> Memory size: 512Mb
> CPU Units: Frequency Cache-Size Version
> A: MHz MB Impl. Mask B: MHz MB Impl. Mask
> ---------- ----- ---- ---------- ----- ----
> Board 3: 248 1.0 11 1.1 248 1.0 11 1.1
> Memory Units: Size, Interleave Factor, Interleave With
> 0: MB Factor: With: 1: MB Factor: With:
> ----- ------- ----- ----- ------- -----
> Board 3: 256 2-way A 256 2-way A

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mark Mellman <mellman@surfcity.ne.mediaone.net> wrote:

It totally depends on your application. If the queries you run are
short and not cpu intensive (no multi-table joins or any other complex
queries), then more, slower processors is the answer. If the queries are
complex and long running, then fewer, faster processors is the answer.

In any case, we have had good scaling performance with Sybase 11.X (not
before). A colleague of mine is giving a presentation at USENIX about
right sizing DB servers. It may be an interesting topic for you.

Mark Mellman <mellman@surfcity.ne.mediaone.net> in his second response
wrote:

Your co-worker could be right with some applications that are not tuned
to utilize SMP architectures. I know that Sybase 11.X is. I have
actually done some performance testing to verify. I agree with your
config, 1 OS processor and 5 Sybase engines.

I live in N. Andover and work in Cambridge. Sucky commute, so I work
from home as often as possible. The talk is at USENIX in Boston.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Bill L. Sherman <Bill.Sherman@bridge.bellsouth.com> wrote:

A lot of it deals with how much of the run HAS to be done sequentially
and how much CAN be done in parallel. If all of the SPIDs can run in
parallel, then spreading the load over more CPU's will probably help.

Some of it depends on the business demand driving the requirement. If
the business is pushing for the 5 hour window, your best bet is to spend
the $32,000, get the clock upgrade, etc., because you could still
install 2 more CPUs to get a little better boost down the road. With the
lower-cost option of 4 additional cpus, you throw out everything if it
doesn't work.

Make sure you don't have a bottleneck somewhere. If your CPU is 100%
utilized, there is likely a bottleneck somewhere. Check for wait time
under sar and if there is more than 10%, start looking at the disks, I/O
channels, log files, etc for hold times. You may be overrunning a SCSI
channel and could get some significant boost by spreading the load out
and running more async i/o.
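The 10% wait-time rule of thumb above can be checked from `sar -u`
output; a small sketch that flags intervals where %wio exceeds 10 (the
column layout assumed here is the usual time/%usr/%sys/%wio/%idle of
`sar -u`, and the sample lines are made up for illustration):

```shell
# Flag sar -u intervals where %wio (I/O wait) exceeds 10%.
# Assumes the usual sar -u columns: time %usr %sys %wio %idle.
check_wio() {
    awk '$3 ~ /^[0-9]+$/ && $4 > 10 \
        { print $1 ": %wio=" $4 " - check disks/IO channels" }'
}
# Example with canned output; on a live box: sar -u 60 5 | check_wio
printf '%s\n' \
  "00:00:01  45  20   5  30" \
  "00:01:01  60  25  15   0" | check_wio
```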

You might also look into the logging features. If the updates and
scratch work are TOTALLY and EASILY rebuildable, i.e. redone on each
run, you MIGHT be able to get away with turning off logging during this
run. That will save a bunch of database time.
I'd also think that between Sun and Sybase, they should be able to
provide some evaluation equipment to run your tests on.

Bill L. Sherman <Bill.Sherman@bridge.bellsouth.com> in his second
response wrote:

You basically get the same SMP overhead at 2 processors that you do at
6. There is not a linear decline and certainly no point where removing
a cpu gets better performance. Your Sun rep or the Sun website should
be able to provide numbers to show relative performance as each CPU
module is added to the system.

Sybase will take a lot of advantage of the multiple processors, just
make sure that you don't BIND a SPID to a specific processor, let
Solaris take care of the process scheduling.

I'll take your word that you are not I/O bound and get off of that
soapbox.

You know, even if you are working with a third party vendor, you can
probably write it into the contract that if certain performance goals
are not met, they will provide you full credit for the 167s towards the
purchase of the 336s. After all, it means you will be putting more
money into their pocket! You might even get them to take the existing
167s and clock board as additional trade.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Peter L. Wargo <plw@ncgr.org> wrote:

You don't mention what drives you are using, how many SCSI buses, etc...
<<< Sorry... Hope additional info provided in this summary closes those
holes. >>>

We run sybase on a variety of systems, and the rule of thumb has been
more than four engines (cpu's) is no good under anything below 11.5.
MEMORY helps a great deal.

We're getting reasonably good performance on a 4-CPU (250MHz, 1M cache)
E4000 with two SSA 114's. It has 1G of RAM.

Peter L. Wargo <plw@ncgr.org> in his second response wrote:

>What my co-worker believes is that the overhead of managing six cpus,
>as opposed to just two, will eat up the additional speed of six.
>
>I don't share this belief because SUN sells servers with 20+ 167mhz
>cpus. The natural conclusion of his thinking is that to make a machine
>faster you would remove cpus. I understand that there is "overhead" in
>a multi-cpu machine but I believe its impact on SUN boxes is less than
>the performance gained.

Hardly. We have a 10-CPU (10x250MHz) E5000 kicking around, and it runs
fine - the UE arch. is very good at mp, with pretty low latency. (We
also have a 64x333MHz E10000 w/64G of RAM, that runs *really* well...)

>Both "sar" and sybase monitoring show no bottlenecks in hard disk
>drives or network. I am running short on memory now that I have upped
>it to 10 SPIDs. Ten SS20 do the processing. One for each SPID. They
>are less than 20% pushed while doing so. The E3000 feeding these SPIDs
>does choke on CPU, 100% and 80%.
<<< We now are using six (6) SS20, each with multiple SPID(s), and
things seem to be going well>>>

There are some fun tools to check even deeper - one is proctool, the
other is perfbar. Let me know if you need copies.

>I am buying the four additional cpus from a third party. So can't get
>loaner equipment for evaluation purposes.

Hmmm... Solar Systems?

>Our E3000 has six 2gb 7,200 rpm hard disk drives. I am proposing the
>purchasing of one additional 2gb to increase OS swap and sybase temp
>tables.
<<<On our setup it seems Sybase eats up the same amount of swap space
as the RAM you assign it. That is, if I add another 256mb of RAM and
give that to Sybase, I would first have to increase my OS swap space by
256mb>>>

I assume you're using the internal drives. <<< Yes >>> You might want
to consider getting a SunSwift (FW SCSI/100mbit ethernet) and an empty
drive case, and moving half your drives to another SCSI bus. You'd be
amazed....

>Our E3000 also has 256mb of RAM. I am proposing the purchase of
>another (2) banks of 256mb --- one 256mb bank for each clock-board
><<<board, cpu/memory>>> and give 160mb to OS, the rest to Sybase.

Now, you realize that you'll need CPU boards (which is where the memory
goes, not the clock board) for your CPUs. I would suggest just getting
one bank of 1G (you can get it for $4400 or so these days) - dedicate
the 1G to sybase, leave the rest for OS/etc. It'll boogie.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Peter Polasek <pete@cobra.brass.com> wrote:

I think this is a 'no brainer'. The 167Mhz CPU's are pretty much
obsolete because of the slow processing rate and small cache size.
Although expanding the 167's is the most cost effective approach, it
severely limits your future growth capabilities. I would take the hit
and upgrade to two 336Mhz CPU's. It is impossible to predict whether
this will be enough horsepower for your application; however, if it is
not, you will be much better positioned to increase from two fast CPU's
than you are from a half dozen slow ones. Two other factors to
consider: first, processing scalability decreases with more CPU's;
second, most benchmarks do not reflect the benefit of large cache sizes
because they do not span large areas of memory. For example, a
336Mhz/512Kb CPU (if it existed) would likely show benchmark figures
similar to those of the 336Mhz/4Mb cache model; however, the real-world
performance of the larger cache CPU in memory-intensive applications is
significant (fetches from main memory take forever in 336Mhz clock
ticks). The 336 chips have a cache size that is 8 times larger than the
167's (I believe that the 3000 167 CPU's have a 512Kb cache).

I would only take the 167 approach if the budget is extremely tight and
the machine is slated for obsolescence in the reasonably near future.
<<< At first I was told it was, but when I went back a second time,
explained that I needed a definitive date or event schedule for E3000
replacement and explained why... I could get no commitments>>> If the
latter is true, expanding the 167's may be the way to go. If the budget
is really tight, then you can probably pick up refurbished 167
processors for next to nothing. You may want to talk to Sterling Deeb at
Marco International ((303)449-9616) if you want to take this route
because he's pretty effective at finding items like this (he can also
quote new Sun parts).

Sybase recommends that you run one less data engine than you have CPU's
(to reserve one for the O/S). I think this is true for large numbers of
CPU's, but if the machine is a dedicated Sybase server (not running
other CPU-intensive apps), Sybase performance is definitely better
running 2 data engines on a dual-CPU box.

Peter Polasek <pete@cobra.brass.com> in his second response wrote:

> Currently we have two 167 w/ 512k cache. SUN said I can mix and match
> cpus of same speed but with different cache in the same machine as
> long as they are installed in the correct clock boards <<<boards,
> cpu/memory>>>. So the four new 167 will have 1mb cache and will work
> with the two existing 167 with 512k, no problems.
>
> We went from 1 SPID to 10 SPIDs and dropped a 12 hour nightly process
> to 3 hours (but buried the two existing CPUs). My current thinking is
> that six 167 will be more than enough to get a single week's worth of
> database processing down to one night.
>
> Cost includes cpus and clock-boards.
<<<boards, cpu/memory, not clock-board. The 167 option will not need a
new clock-board but the 336 option will. That's an additional $2,000 not
figured in the original posting>>>
>
> I am looking at:
>
> six 167mhz = 336 SPECint_95 costing $11,980
> two 336 mhz = 252 SPECint_95 costing $22,500
<<< $24,500 including price for new needed clock-board>>>
>
> The E3000 will be replaced in less than two years with something
> bigger as part of another project. My thinking is with 6 cpus I could
> give one to the OS and the other 5 to Sybase. This would give me 5
> clean cpus for Sybase; I feel this is better than 2 cpus which must be
> shared between the OS and database.

If the machine will be tossed in < 2 years it may make sense; otherwise
you're backing yourself into a corner because you will be forever
limited to 167Mhz processors (you may find yourself in that corner very
soon if the additional CPU's don't solve the problem).

> Then there is cost. For two 336 I spend twice as much over six 167
> and get less SPECint_95.
>
> What my co-worker believes is that the overhead of managing six cpus,
> as opposed to just two, will eat up the additional speed of six.
>
> I don't share this belief because SUN sells servers with 20+ 167mhz
> cpus. The natural conclusion of his thinking is that to make a machine
> faster you would remove cpus. I understand that there is "overhead" in
> a multi-cpu machine but I believe its impact on SUN boxes is less than
> the performance gained.

The Sun will scale well, but the effectiveness of additional processors
is 100 percent application dependent. For example, if you run a single
non-threaded process on your machine, it will run in the same time
whether you have 1 or 8 CPU's. A single process must be intelligently
multi-threaded to take advantage of multiple CPU's. Sybase scales
fairly well if you have a large number of active users, but I'm not sure
how effective it is with a single CPU-intensive batch process. One
limitation I know of is that, even if you run multiple dataserver
engines, all communication to the client processes is done through the
primary engine.

> Both "sar" and sybase monitoring show no bottlenecks in hard disk
> drives or network. I am running short on memory now that I have upped
> it to 10 SPIDs. Ten SS20 do the processing. One for each SPID. They
> are less than 20% pushed while doing so. The E3000 feeding these SPIDs
> does choke on CPU, 100% and 80%.

You have to be careful interpreting the traditional UNIX statistics for
the Sybase server machine because the Sybase spinlock processing
distorts the CPU utilization (i.e. makes the vmstat CPU busy times much
higher than they actually are). It's very likely that it is using lots
of CPU (because Sybase has a voracious appetite for clock ticks), but
your best bet is to collect 'sp_sysmon 1' results throughout the process
(you may want to pick a longer interval because of the length of your
process). This will give you a very accurate picture of what's going
on, but it takes a little bit of work to interpret the results if you're
not familiar with the output. The first value that should be of
interest to you is:

Engine Busy Utilization:
  Engine 0               59.2 %
  Engine 1               57.0 %
  -----------            ---------------
  Summary:       Total: 116.2 %    Average: 58.1 %

This shows the real CPU processing load (without the spinlocks
overhead). Sybase has a document somewhere on the website to help
interpret these stats. In theory, you could send the sp_sysmon stats to
a back-end sybase support engineer for interpretation (if you have not
done so already), but rarely will you get insightful responses.

I am including an excerpt from a document I wrote about spinlock
processing if you are not familiar with the overhead:

+++++++++++++++
Sybase uses "spinlocks" to artificially cling to processes waiting on
disk I/O. This is useful in multiprocessor applications where it is not
desirable to toggle processes between CPU modules (because the
associated data may not be resident in the alternate CPU's
cache/registers and because of the overhead associated with process
context switching). Spinlock processing consumes CPU resources at the
UNIX level without getting any useful work done at the Sybase level. As
a result, the spinlock processing exaggerates the CPU usage reporting
for traditional UNIX performance tools (vmstat, iostat) because UNIX can
not differentiate between spinlock processing and useful work.
 
By default, Sybase is tuned for multi-processor environments (despite
the fact that neither Sybase 4.9.2 nor the SunOS operating system
supports multiple processors). The spinlock processing is controlled by
the
"cschedspins" parameter in the master database. The default
"cschedspins" value is 2000; however, the optimal value for a single
process configuration is "1". The effect of modifying this parameter is
statistically significant in that the UNIX statistics accurately report
CPU utilization rather than indicating CPU overload even under moderate
Sybase loading. The performance improvement is less concrete. The
performance improvement should be significant in "shared" Sybase server
environments because the freed CPU time can be allocated to competing
processes. The improvement in dedicated Sybase environments is more
ambiguous (Sybase is non-committal). There is at least a small chance
for reduced performance because of the increased occurrence of context
switching in the new configuration.
+++++++++++++

The sp_configure 'runnable process search count' is the modern
equivalent of the 4.9.2 'cschedspins' parameter. The recommended value
for multi-CPU environments is 2000 (this is also the default). The
value can be set through 'sp_configure'.

I would not be concerned about reserving a CPU for the OS if you choose
the fast processor route because, unless you are using the server for
other applications, the OS does not do much.
Even an intensive SQR program uses very little OS cpu time (most of the
work is done in the Sybase server). With 6 CPU's it probably makes
sense to reserve one, but you should try both - we did this with the two
cpu case and there was no comparison - using two data engines is MUCH
better.

> I am buying the four additional cpus from a third party. So can't get
> loaner equipment for evaluation purposes.
>
> Our E3000 has six 2gb 7,200 rpm hard disk drives. I am proposing the
> purchasing of one additional 2gb to increase OS swap and sybase temp
> tables.

Increasing swap will raise the total memory capacity of the machine,
but it will not help performance and will possibly hamper it. Increasing
the Sybase 'total memory' parameter can increase performance
significantly; however, it is only beneficial if you have enough
physical memory for the memory size ('total memory' of 50000 uses 100Mb
of memory). If you configure Sybase to use 500Mb of memory and you have
256Mb of physical memory, then performance will be SEVERELY degraded
because the system will be endlessly paging. There is a bug in pre-2.6
versions of Solaris that causes the system to use twice as much memory
as defined (it uses only the correct amount, but it takes double the
memory size from swap). The workaround for this bug is to disable
intimate shared memory on the Solaris machine, but this will reduce
performance. Sun fixed this in 2.6 but will not create a patch for
earlier revs.

<<< I was unaware of this bug in Solaris 2.5.1. What is happening now
is that however much RAM we give to Sybase, it eats an equal amount from
/tmp (swap). It does not like it if we give it more RAM than we have
swap. My plan now is to add another 256mb of RAM and another 2gb HD. I
will give either 96mb or 160mb of RAM to the OS and the remainder to
Sybase. The new HD is needed to increase swap. The formula I am
considering using for setting swap is (OS RAM + temp file space +
2*(Sybase RAM)). I will give the remainder of the drive to "tempdb".>>>
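That swap formula can be sketched as follows (all figures in MB; the
96MB OS share is one of the splits being considered in this posting,
while the 200MB of temp file space and 416MB Sybase share are
placeholder values for illustration):

```shell
# Swap sizing from the note above: OS RAM + temp file space + 2 x Sybase RAM.
swap_mb() {
    os=$1; tmp=$2; sybase=$3
    echo $(( os + tmp + 2 * sybase ))
}
swap_mb 96 200 416   # e.g. 96MB to OS, 200MB temp space, 416MB to Sybase
```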

Increasing 'tempdb' size will not improve performance. Expansion is
only needed if you run out of space. If tempdb is running out and is
fairly large, it is usually indicative of poorly structured queries
(with tons of 'order by' or 'group by' clauses), which are very likely a
bottleneck to the system. You may want to look at the offending queries
to see if these can be optimized.

<<< tempdb does fill up from time to time during the production day,
not at night during the nightly processes. Its filling could be related
to bad programming of queries, but I don't know, and this is not an
area of my control. I will pass the comment up the line. Its filling
occurs when many users are doing hard work all at the same time.
Filling causes problems during the work day and the easiest answer
seems to be increasing its size. >>>

There are some tricks to get higher performance out of tempdb. Because
all tempdb is throwaway data, you can take some liberties with the
storage medium. Sybase raw disk I/O is robust (because it immediately
writes updates to disk) but slow. Output to flat files is much faster
because the OS buffers the data and bulk writes it to disk at fixed
intervals. Therefore, you can cheaply and safely improve tempdb
performance by using a flat file rather than the raw partition. You can
take this a step farther by using a 'tmpfs' file system for tempdb (but
you need enough physical memory for the files or performance will be
impaired rather than improved). This is like a free RAM disk (except
that you need physical memory for the file size). The last step is to
use a RAM disk for tempdb. The price for these has dropped
significantly, but is still high (~$20/Mb). Sybase automatically places
the first 2Mb of tempdb on the master device, so you need to either
create a 'brick' table to fill this space (causing all tempdb I/O to
use the high-speed device) or remove it with system level commands.
Sybase has a document that describes the options for this.
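The tmpfs suggestion above amounts to mounting a tmpfs file system and
pointing a tempdb device file at it. A sketch only - the mount point and
size are placeholder values, this needs root, and it assumes you have
physical memory to back the files:

```shell
# Sketch: mount a Solaris tmpfs file system and place a Sybase tempdb
# device file on it. Mount point and size are placeholders.
mkdir -p /sybase_tmpfs
mount -F tmpfs -o size=256m swap /sybase_tmpfs
# Then, in Sybase, add a device backed by a file under /sybase_tmpfs
# (disk init with a physname on the tmpfs mount) and extend tempdb
# onto that device.
```

Remember the caveat from the response above: if the box lacks physical
memory for the files, this will hurt rather than help.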

> Our E3000 also has 256mb of RAM. I am proposing the purchase of
> another (2) banks of 256mb --- one 256mb bank for each clock-board and
> give 160mb to OS, the rest to Sybase.

256Mb sounds light for the processing load - of course, it all depends
on what you're doing. Be very sure that your Sybase 'total memory' value
is less than '100000' (200Mb) with the current memory or you will have
terrible performance due to paging. If you are doing lots of queries
off of a single large table (or lots of tables), the additional memory
will help a lot. You will notice that initial queries from a large
table take a long time and subsequent queries are very fast because the
data is stored in the buffer cache. This cannot occur if you don't
have enough memory defined.
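The 'total memory' figures quoted here are in Sybase's 2KB pages, which
is why 50000 comes out to roughly 100Mb; a quick conversion sketch:

```shell
# Sybase 'total memory' is counted in 2KB pages; convert to MB.
pages_to_mb() {
    awk -v p="$1" 'BEGIN { printf "%.1f\n", p * 2 / 1024 }'
}
pages_to_mb 50000    # ~97.7 MB, the '100Mb' cited above
pages_to_mb 100000   # ~195.3 MB, just under the 200Mb warning level
```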

> I am not aware of any 336 w/ 1mb cache. But SUN does publish the
> difference between 250mhz w/ 1mb cache and 250 w/ 4mb cache. They
> are:
>
> (1) 250 w/1mb = 84 SPECint_rate95
> (6) 250 w/1mb = 505 SPECint_rate95
>
> (1) 250 w/4mb = 93.9 SPECint_rate95 (12% improvement)
> (6) 250 w/4mb = 556 SPECint_rate95 (10% improvement)

This is an interesting spec I haven't seen before. The improvement
depends on the application. The benchmark program I wrote for burning
in our production system runs about the same in a U10-300 (512Kb cache)
and a U2-1300 (2Mb cache) because it does not span large amounts of
memory. This program is a terrible benchmark but is pretty good for
beating up the machine. The SPECint benchmark is a million times more
sophisticated and benefits somewhat - but Sybase is very memory
intensive and I suspect it may benefit more than this spec shows
(unfortunately, Sun does not have a utility to show the cache hit ratio).

> Doing the more expensive upgrade now may derail the scheduled E3000
> replacement. I think that a year from now SUN would have faster hard
> disk drives, more memory and faster back-planes in their servers.
> Better than what six 336 w/ 4mb in an E3000 could do.

Aha, the real agenda appears. This is probably true.

One last, and perhaps the most important, note. Sybase performance is
most often completely dominated by indexes. If an index is not used, it
can easily cause a query to take 100 times (or more) longer because a
table scan reads the entire table. I would modify the program to print
time stamps at every possible step to allow you to focus on the
bottleneck queries, then use showplans to confirm that indexes are being
used. Use 'set statistics io on' so you can see the physical and
logical disk io's for every query. If you see lots of physical disk
io's then you probably do not have enough memory. If you see lots of
logical disk io's or (in addition to physical), then you are not using
an index when you should be. The Sybase 11 optimizer is terrible at
selecting indexes. I have seen several queries that clearly should
select an index and Sybase does a table scan anyway. This occurs for
sparsely populated entities within a large table because the optimizer's
computed selectivity goes to 0.00000 (it does not have enough
precision). This did not exist in 4.x Sybase. I have given up hope of
Sybase fixing the problem; it has persisted through several EBFs. We
worked around the bug by forcing the index in the offending query. This
is not an elegant solution but is really needed when this condition
occurs. All the CPUs and memory in the world will not help a poorly
indexed query.
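The time-stamp suggestion above costs only a few lines. The nightly
jobs here are in-house C and csh, but a minimal sketch of the idea (in
Python; the labels and function name are hypothetical) looks like:

```python
import time


def stamp(label):
    # Print one wall-clock marker per step in the job log; diffing
    # consecutive stamps shows which query dominates the nightly run.
    line = f"{time.strftime('%H:%M:%S')}  {label}"
    print(line)
    return line


stamp("start repricing")
# ... run the first repricing query here ...
stamp("after query 1")
# ... and so on for every step of the job ...
```

With a stamp around every query, the slow step is obvious from the log
before you ever reach for showplan.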

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Rick Niziak <rniziak@kappys.med.iacnet.com> wrote:

I would have to say fewer, faster cpu's, because of the way that Solaris
is written: it will not take advantage of additional CPUs until previous
CPUs have reached a certain threshold (~85-90% utilization).

What I mean is, if you have 5 CPUs and configure Sybase to run on 4 of
them, the OS really won't use the 4th or 5th until the 2nd and 3rd are
pretty much at the max..

<<< I would concur with Rick's point of view for SunOS 4.1.3 thru
Solaris 1.1.2 (from there to 2.5, I don't know), but Solaris 2.5.1 for
us seems to be doing a little better than what Rick reports. I wanted
to have 6 cpus - one for the OS, five for Sybase. The understanding I
am getting from other responses is that this should work well if I have
the RAM and HD I/O to match. >>>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Rose Robert <Robert.Rose@ag.gov.au> wrote:

In my last job, we did some benchmarks with Oracle on an Ultra 6000 with
8 cpu's. I 'turned off' 6 cpus using psradm and we ran a 50 user
simulation creating a large number of records in an Oracle database. We
then re-ran the simulation with 4, 6, and 8 cpus and found that Oracle
scaled very well up to at least 8 cpus. I don't have any figures from
the tests, but we were looking at something like 1hr for 4 cpus, 45
minutes for 6 and 30 minutes for 8. These tests were done with 167MHz
cpus and unfortunately we didn't have the chance to re-run them on the
250MHz cpus in the same machine. The simulation was very cpu and disk
intensive (the same thing ran on a Sparc 10/52 with ODS doing raid5 and
had to be killed, since it didn't finish after 3 days!)

My impression is that if Sybase scales well like Oracle does, go for
more of the slower cpus, if not, go for fewer faster cpus. If you're
really worried, why not give your local Sun rep a call and ask if you
can borrow a machine for a couple of weeks to test with? My experience
with Sun here in Australia is that they are very keen to help out,
particularly if they're going to make a sale out of it.

<<< I had made arrangements with the VAR about getting in test
equipment. They wanted assurance that after the testing was done we
were going to buy either 6 167s or 2 336s. Before I could present this
to my manager he canceled the project. >>>

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Chuck <seeger@cise.ufl.edu> wrote:

Chris, I haven't worked with Sybase since early versions of System 10
over three years ago. At that time Sybase scaled poorly on multiple
CPUs compared to Oracle and basically couldn't make effective use of
more than four processors. With that background, I would tend to agree
with your coworkers of going with fewer but faster processors. However,
you should confirm that with admins experienced with Sybase 11.x.
Things may have changed a lot in that time.

<<< It appears Sybase has gotten better with Sybase 11.05 being the best
for multiple CPU performance >>>

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Marc <MARC.NEWMAN@chase.com> wrote:

I vote for more, faster processors. Do a technology refresh and get the
CPUs up to 300mhz, add 2 more, and you will be a happy camper with room
for 2 more CPUs.

<<< Marc, if I thought I could do that... I would have, and not
bothered to ask the question of the group :). But some data on this
option is:

+++++++++++
After studying the chart below, it is recommended that we upgrade to
(3) cpu/memory boards, each having (2) 167mhz CPUs and (8x32mb) RAM,
for a total of 6 cpus with 768mb RAM, plus (7) 2gb hard disks, for a
total expenditure of $14,780

Performance table:
CPU: 167mhz w/1mb 250mhz w/4mb 336mhz w/4mb
SPECint_rate95: (1) 59 93.9 134
SPECint_rate95: (6) 336 556 767
Loss factor: 0.949 0.986 0.939

Loss Factor = SPECint_rate95 (6 cpu) / (6 * SPECint_rate95 (1 cpu))

Dual CPU - Cost / Performance table:
CPU: 167mhz w/1mb 250mhz w/4mb 336mhz w/4mb
SPECint_rate95: (2) 112 185 252
Cost: $0 $18,300 $24,500
Cost Ratio: $0 $99 per SPEC $97 per SPEC

Quad CPU - Cost / Performance table:
CPU: 167mhz w/1mb 250mhz w/4mb 336mhz w/4mb
SPECint_rate95: (4) 223 370 503
Cost: $5,990 $34,600 $47,000
Cost ratio: $27 per SPEC $93 per SPEC $93 per SPEC

Sextuplet CPU - Cost / Performance table:
CPU: 167mhz w/1mb 250mhz w/4mb 336mhz w/4mb
SPECint_rate95: (6) 336 556 767
Cost: $11,980 $50,900 $69,500
Cost ratio: $37 per SPEC $91 per SPEC $90 per SPEC

Note: Higher value for SPECint_rate95 rating indicates greater
performance level.
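The ratios in the tables above reduce to two one-line formulas; a small
sketch of the arithmetic (sample figures taken from the quad 167mhz and
dual 250mhz entries):

```python
def cost_per_spec(cost_dollars, spec_rate):
    # Dollars per SPECint_rate95 point, rounded to the nearest dollar.
    return round(cost_dollars / spec_rate)


def loss_factor(rate_6cpu, rate_1cpu):
    # How close six CPUs come to six times one CPU (1.0 = perfect scaling).
    return rate_6cpu / (6 * rate_1cpu)


print(cost_per_spec(5990, 223))        # quad 167mhz option: $27 per SPEC
print(cost_per_spec(18300, 185))       # dual 250mhz option: $99 per SPEC
print(round(loss_factor(336, 59), 3))  # 167mhz loss factor: 0.949
```

The loss factors near 1.0 are what make the "lots of cheap CPUs"
argument credible: the SPEC numbers show very little scaling overhead
across six processors.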

Staying with 167mhz cpus allows AGE to configure a 6-cpu server by
purchasing just 4 cpus and 2 cpu/memory boards. This would give us a
SPECint_rate95 of 336 for $11,980, equaling $37 per SPEC. If projection
calculations are correct, this would give us a clear 2-fold increase
over existing performance ((336 - 112) / 112 = 2).

Each cpu/memory board has (2) banks of eight memory slots able to hold
8, 32, or 128mb units each. Giving each of the (2) new cpu/memory
boards 8x32mb would bring total server memory up to 768mb for $2,400.

Increasing RAM for Sybase requires a matching increase in swap space
for /tmp equal to the amount of RAM given to Sybase. A seventh 2gb
7200rpm hard drive would cost $400.
+++++++++++++

The above price study has been updated to reflect the additional cost
of a CLOCK-BOARD for the 250 and 336 cpus.

It now sounds like for sizing swap I need double the RAM given to
Sybase plus whatever I want for the OS and temp files. >>>
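That swap rule of thumb can be written down directly. A sketch with the
proposed 768mb configuration (the 160mb OS split is from the original
posting; the 512mb OS/temp allowance is an assumed figure for
illustration):

```python
def swap_needed_mb(sybase_ram_mb, os_and_temp_mb):
    # Rule of thumb from this thread: double the RAM given to Sybase,
    # then add whatever the OS and temp files need on top.
    return 2 * sybase_ram_mb + os_and_temp_mb


total_ram_mb = 768
os_ram_mb = 160
sybase_ram_mb = total_ram_mb - os_ram_mb      # 608 MB left for Sybase

print(swap_needed_mb(sybase_ram_mb, 512))     # swap to configure, in MB
```

With those assumptions the server would want roughly 1.7gb of swap,
which is why a seventh disk enters the price study.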
 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Carlo Tosti <carlot@interlog.com> wrote:

If the amount of work for your 3000 will not increase much, then I
would go for the most economical solution (by your quotations, ~12K).
Should the work-load increase more than ~40%/year, then it is worth
considering the board upgrade to support the fastest clock rate
available.

Be warned that your disks may feel the "pressure" of the CPU power if
Sybase and Solaris parameters are tuned properly and the appropriate
amount of memory is given to Sybase.

Solaris scales very well using the Exxxx architecture.

<<< Carlo, what I would have liked to do is go with the six 167s with
3gb RAM: give 1/4 gb to the OS, 3/4 gb to Sybase, and put 2gb into a
"tmpfs" for "tempdb" per Peter Polasek. When this machine gets too
slow... send it offsite as our HOT backup system and replace it with
the new-latest-greatest Sun. But costs are an issue >>>

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Rogerio Rocha <rogerio_rocha@bvl.pt> wrote:

Our setup is quite similar, Sybase 11, E3000, 2 x CPU 160Mhz.

Sybase was initially configured for one engine, one CPU. Later we
configured it to use up to 2 CPU's and performance did improve. Allow
me to disagree with your

" add another four .....
....this a 2 fold increase "

because with that you get about 180% (less than 20% overhead).

To get 240% more (aiming for 320%) you have to use either 6x167MHz or
4x336MHz.

<<< The option I was favoring was 6x167; the option my co-worker was
favoring was 2x336. I believe that 6x167 would get us there if the
E3000 and Solaris 2.5.1 and Sybase 11.03 combo scaled well. And that
was my main question: "How well does this combo scale?". I think that
the answer is "it scales well". >>>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:12:52 CDT