Performance issues - recommendations?

Post by Michael Ruch » Tue, 17 Dec 1996 04:00:00



Hi,

   I am in need of some input regarding just what I should consider upgrading
on one of our Alpha systems.

   Our current configuration consists of a cluster of:

2 - 2000 4/233 Alphas   each 1 GB memory
1 - 2100 4/233 Alpha    512 MB memory
2 - 1000 4/200 Alphas used as disk servers   128 MB memory each
Fully redundant HS121 disk controller with 20  4.3 GB disks in an SW800

(The two 2000 Alphas run quite smoothly as they are currently configured.)

   One of the questions is what would show us the biggest improvement: the
purchase of an additional CPU for the 2100? Additional memory for the 2100
and/or the two disk servers?

   Currently, CPU usage is running about 80% on the 2100, with memory
averaging approx. 75%. Will adding another CPU help release memory, or would
an additional CPU need to be installed in conjunction with additional memory?

   The two disk servers are being pelted by heavy CPU usage due to the
defragmentation processes running on them. Memory seems to be sufficient,
but will increasing the amount of memory on these servers help free up the CPUs?

   As you can tell, I am grasping at straws; I am just not sure what I need.
Please excuse my ignorance on this topic. I am not sure how the results would
differ if I added additional CPUs with or without additional memory (or vice
versa). I don't want to waste our money on items that are not needed.

   What is a good rule to follow for knowing what needs to be upgraded, and
when to upgrade CPUs or memory? Perhaps there isn't a rule....

    Any advice given will be much appreciated.....

                                                Thank You in Advance,

                                                Mike Ruch
                                                Systems Programmer
                                                Saint Louis University
                                                St. Louis, MO


                                                Check out our homepage!
                                                  http://www.slu.edu

 
 
 

Performance issues - recommendations?

Post by Richard B. Gilbert » Tue, 17 Dec 1996 04:00:00



Quote:>   I am in need of some input regarding just what I should consider upgrading
>on one of our Alpha systems.

>   Our current configuration consists of a cluster of:

>2 - 2000 4/233 Alpha's  each 1 gig memory
>1 - 2100 4/233 Alpha    512 mb memory
>2 - 1000 4/200 Alpha's used as disk servers   128 mb memory each
>Fully redundant HS121 disk controller with 20  4.3 gig disks in an SW800

>(The 2 - 2000 Alpha's run quite smoothly as they are currently configured.)

>   One of the questions is what would show us the biggest improvement? The
>purchase of an addition CPU for the 2100? Additional memory for the the 2100
>and/or the 2 disk servers?

>   Currently the CPU usage is running about 80% on the 2100 with memory
>averaging approx. 75%. Will adding another CPU help release memory or will
>an additional CPU need to be installed in conjunction of addition memory?

>   The 2 servers are being pelted by heavy CPU usage due to the running of
>defragmentaion processes on them. Memory seems to be sufficent but will
>increasing the amount of memory on these servers help free-up the CPU's?  

>   As you can tell, I am grasping at straws. I am just not sure what I need...
>Please excuse my stupidity on this topic. I am not sure what the difference
>the results would be if I added addition CPU's with or without additional
>memory (or vise-a-versa)? I don't want to waste our money on items that are
>not needed..

>   What is a good rule to follow to know what is needed to be upgraded and/or
>when to upgrade CPU's or Memory? Perhaps there isn't a rule....

>    Any advice given will be much appreciated.....

        First, get and read "Guide to VMS Performance Management" (or
similar title, depending on version of the Doc Set).

        Second,  how much paging and swapping is going on on the 2100?
You seldom run out of memory on a virtual memory system; if your page
files are big enough, you never will.  The system just runs slower and
slower as disk is increasingly substituted for physical memory.  A
certain amount of paging cannot be avoided, because executable images
and shared libraries must be paged in when the image is activated.  A
certain amount above that unavoidable minimum is also reasonable; it's absurdly
expensive to accommodate *everything* in RAM.  If heavy paging is going on,
adding memory will speed things up.
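
        For example, something along these lines will show how much is
really going on (standard MONITOR and SHOW displays; the comments are just
how I read them):

        $ MONITOR PAGE              ! page fault rate; hard (I/O) faults hurt most
        $ MONITOR IO                ! a non-zero inswap rate means real swapping
        $ SHOW MEMORY               ! free and modified list sizes, pool, etc.
        $ SHOW MEMORY/FILES/FULL    ! how full the page and swap files are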

        Third, look at your disk I/O.  If you are seeing queues of
requests, your disks are slowing you down.  It may be possible to buy
faster disks, but a better approach might be to look at a RAID array with
heavy caching.
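
        For example (both are standard MONITOR displays):

        $ MONITOR DISK/ITEM=QUEUE_LENGTH    ! sustained queues well above 1.0 are trouble
        $ MONITOR DISK/ITEM=OPERATION_RATE  ! I/O operations per second, per disk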

        If you have licenses for POLYCENTER Performance Data Collector
and Performance Advisor, install the software and learn to use it.  It
will tell you a lot about a system exhibiting performance problems.
If you are rich, buy the licenses; you need a Data Collector for
each system but only one Performance Advisor.

--
*************************************************************************
*                        Here, there be dragons!                        *
*                                                                       *
*                                                Richard B. Gilbert     *
*                                           Computer Systems Consultant *
*************************************************************************

 
 
 

Performance issues - recommendations?

Post by Jim Becker » Tue, 17 Dec 1996 04:00:00



> Hi,

>    I am in need of some input regarding just what I should consider upgrading
> on one of our Alpha systems.

>    Our current configuration consists of a cluster of:

> 2 - 2000 4/233 Alpha's  each 1 gig memory
> 1 - 2100 4/233 Alpha    512 mb memory
> 2 - 1000 4/200 Alpha's used as disk servers   128 mb memory each
> Fully redundant HS121 disk controller with 20  4.3 gig disks in an SW800

> (The 2 - 2000 Alpha's run quite smoothly as they are currently configured.)

>    One of the questions is what would show us the biggest improvement? The
> purchase of an addition CPU for the 2100? Additional memory for the the 2100
> and/or the 2 disk servers?

>    Currently the CPU usage is running about 80% on the 2100 with memory
> averaging approx. 75%. Will adding another CPU help release memory or will
> an additional CPU need to be installed in conjunction of addition memory?

>    The 2 servers are being pelted by heavy CPU usage due to the running of
> defragmentaion processes on them. Memory seems to be sufficent but will
> increasing the amount of memory on these servers help free-up the CPU's?

>    As you can tell, I am grasping at straws. I am just not sure what I need...
> Please excuse my stupidity on this topic. I am not sure what the difference
> the results would be if I added addition CPU's with or without additional
> memory (or vise-a-versa)? I don't want to waste our money on items that are
> not needed..

>    What is a good rule to follow to know what is needed to be upgraded and/or
> when to upgrade CPU's or Memory? Perhaps there isn't a rule....

Sigh. It wasn't very long ago that this would have been a configuration to
upgrade *to*, not away from. But seriously...

There is no one rule, but there are a few rules of thumb that can be helpful.

The key thing is to find out which resource is the real bottleneck. The
choice is generally cpu, memory, or i/o. Sometimes an apparent bottleneck in
one area is really just a by-product of a problem in another area.

Typically, if the cpu is the bottleneck, you'll sustain long compute queues
(# of processes in COM or COMO state). If that's more than about 3-5 per
processor, you might need more processing power -- unless most of your cpu
time comes from handling problems in other areas. You mention that the cpu is
very busy defragging disks. I'd look at that problem before I added more
processing power. For example, maybe you can change your defragging style or
rearrange your disk layout. Btw, high cpu utilization does not necessarily
mean you have a problem. It's like bank tellers: a busy teller is a good
thing; the real issue is how long the waiting line is.
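
To check, something like the following is usually enough (standard MONITOR
displays, nothing exotic):

  $ MONITOR STATES      ! COM + COMO is the compute queue
  $ MONITOR MODES       ! where the cpu time goes: interrupt, kernel, exec, user, idle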

A given process has a memory bottleneck if it stabilizes at a high page fault
rate with its working set size = wsextent. Try cranking up its wsextent to
wsmax; you might also want to increase wsmax. If the process page faults a
lot without reaching wsextent, either it's suffering from competition with
other processes, or maybe you really don't have enough memory.
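
A rough way to find such a process and loosen its limits (the username, pid,
and numbers below are placeholders, not recommendations):

  $ MONITOR PROCESSES/TOPFAULT            ! who is faulting hardest right now
  $ SHOW PROCESS/CONTINUOUS/ID=2040011C   ! watch its working set size and fault count
  $ SET DEFAULT SYS$SYSTEM
  $ RUN AUTHORIZE
  UAF> MODIFY SMITH /WSEXTENT=32768       ! takes effect at the next login
  UAF> EXIT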

There's more to it than this, but I hope this helps.

Also, one more rule of thumb: Remember that whenever you solve your #1
performance problem, #2 gets a promotion.


System Solutions Incorporated (http://www.syssol.com)
ESILUG Chair (http://www.decus.org/decus/lugs/esilug)

 
 
 

Performance issues - recommendations?

Post by Brendan Welch, W1L » Wed, 18 Dec 1996 04:00:00


Quote:>    Second,  how much paging and swapping is going on on the 2100?
> You seldom run out of memory on a virtual memory system; if your page
> files are big enough, you never will.  The system just runs slower and
> slower as disk is increasingly substituted for physical memory.  A
> certain amount of paging can not be escaped because executable images
> and shared libraries must be paged in when the image is activated.

I have what I think is a related question, which just came up yesterday.

Surprisingly, I have been able to get away for years without running into
problems with large-dimension programs, or even understanding how a virtual
memory system _really_ works.

A physics student brought me a Fortran program which, when he compiled with
dimensions as large as he really wanted, would at the RUN command return
an error message immediately.  This seems to be a function of the
_dimension_ of the array, not of how much of it he actually uses; i.e.,
DIMENSION ARRAY (100,100,100) but NPTS still = 42 instead of 100.

In my ignorance I had been changing the parameters of his account.
By default they were JTquota=4096, WSdef=1024, WSquo=2048, WSextent=4096,
Pgflquo=32768, and by cut-and-try I multiplied them by 4, changed the
dimension of the array to (84,84,84), and it all worked.  I tried
watching the free memory (the only stupid way I knew how was via
SHOW CLUSTER) which said 81% usage.  But multiplying the params by 5,
and compiling with dimension 90 failed; by then I suspected I was doing
unnecessary cut and try, and of course that if I _really_ ever knew
what I was doing, I could have saved a lot of trouble.
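
(For what it is worth, the way I have been checking what the process actually
gets -- logged in under his account -- is just:

  $ SHOW WORKING_SET        ! the WSdef, WSquo, WSextent actually in effect
  $ SHOW PROCESS/QUOTAS     ! remaining pagefile quota and friends

but knowing the current numbers is not the same as knowing what they should be.)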

However, I suspect that a lot of other folks out there are secretly in
the same or worse position.  I for one would appreciate either the
long or short solution, especially about the part which says you
"never" really need enough physical memory.  Was I really overloading
the pagefile?  How can I get to (100,100,100)?

--

 
 
 

Performance issues - recommendations?

Post by Hans Bachner » Thu, 19 Dec 1996 04:00:00



Brendan Welch wrote (at 10:53:12 -0500):

<snip>

Quote:>A physics student brought me a Fortran program which, when he compiled with
>dimensions as large as he really wanted, would at the RUN command return
>an error message immediately.  This seems to be a function of the
>_dimension_ of the array, not with his actually using it; i.e.,
>DIMENSION ARRAY (100,100,100) but NPTS still = 42 instead of 100.

<snip>

Unfortunately you don't quote the error message you get, nor do you
mention the datatype of the array, nor the number of such arrays used
in the program.

As Fortran does not dynamically allocate memory for the array, the
dimensions are responsible for memory consumption, not how much of the
array is actually used by the program.

I have my suspicions, though. Check out the VIRTUALPAGECNT system
parameter, which limits the virtual address space a program can use. As
this is not a dynamic parameter, a reboot is required to see the
effect of a change. Also, how big is the WSMAX system parameter?
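
For example (read-only, so harmless to try; the values you see will of course
differ from system to system):

  $ MCR SYSGEN
  SYSGEN> SHOW VIRTUALPAGECNT
  SYSGEN> SHOW WSMAX
  SYSGEN> EXIT

Any permanent change is best made through MODPARAMS.DAT and AUTOGEN (and, as
noted above, a reboot for VIRTUALPAGECNT).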

I doubt that you are 'overloading' your page file because page file
usage only occurs if the memory pages in question are actually in use.
So if you use only part of the array, just increasing the dimensions
should have (almost) no effect on page file usage.

---------------- just my personal thoughts ----------------
Hans Bachner
Digital Equipment Austria
MCS - Software Support


 
 
 

Performance issues - recommendations?

Post by Richard B. Gilbert » Thu, 19 Dec 1996 04:00:00



Quote:>>        <I wrote>
>>        Second,  how much paging and swapping is going on on the 2100?
>> You seldom run out of memory on a virtual memory system; if your page
>> files are big enough, you never will.  The system just runs slower and
>> slower as disk is increasingly substituted for physical memory.  A
>> certain amount of paging can not be escaped because executable images
>> and shared libraries must be paged in when the image is activated.

>I have what I think is a related question, which just came up yesterday.

>Surprisingly, I have been able to get away for years without problems with
>programs with large dimensions, or even with understanding how a virtual
>system _really_ does work.

>A physics student brought me a Fortran program which, when he compiled with
>dimensions as large as he really wanted, would at the RUN command return
>an error message immediately.  This seems to be a function of the
>_dimension_ of the array, not with his actually using it; i.e.,
>DIMENSION ARRAY (100,100,100) but NPTS still = 42 instead of 100.

>In my ignorance I had been changing the parameters of his account.
>By default they were JTquota=4096, WSdef=1024, WSquo=2048, WSextent=4096,
>Pgflquo=32768, and by cut-and-try I multiplied them by 4, changed the
>dimension of the array to (84,84,84), and it all worked.  I tried
>watching the free memory (the only stupid way I knew how was via
>SHOW CLUSTER) which said 81% usage.  But multiplying the params by 5,
>and compiling with dimension 90 failed; by then I suspected I was doing
>unnecessary cut and try, and of course that if I _really_ ever knew
>what I was doing, I could have saved a lot of trouble.

>However, I suspect that a lot of other folks out there are secretly in
>the same or worse position.  I for one would appreciate either the
>long or short solution, especially about the part which says you
>"never" really need enough physical memory.  Was I really overloading
>the pagefile?  How can I get to (100,100,100)?

        Well, the array in question appears to be REAL*4 so:
4*10^2*10^2*10^2 = 4*10^6 or 4,000,000 bytes!  That's just for that one
array.  You must also have space for the rest of the program code and
data.  It doesn't all have to be in physical memory at once, of course,
but you do need a virtual address space big enough to hold it all.  The
SYSGEN parameter VIRTUALPAGECNT limits the size of the virtual
address space for any process.  My guess is that you would need a
VIRTUALPAGECNT of 10,000 to 20,000 pages (5-10Mb) in order to link and
run this program.  On a small system, VIRTUALPAGECNT might well be less
than that.  The user will need a PGFLQUO of nearly the same size as
VIRTUALPAGECNT; the .EXE file will serve as the backing store for
the user's code, likewise the .EXE file for each shareable library
involved.  The user's variables, all 4Mb+ of them, will have to be
accommodated by his share of the paging file.
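
        To put rough numbers on it in DCL terms (the username and quota below
are placeholders only; the pagelets are the 512-byte units the quotas are
counted in):

        $ ! 100 * 100 * 100 elements * 4 bytes   = 4,000,000 bytes
        $ ! 4,000,000 bytes / 512 bytes/pagelet ~=     7,813 pagelets for that array alone
        $ SET DEFAULT SYS$SYSTEM
        $ RUN AUTHORIZE
        UAF> MODIFY STUDENT /PGFLQUOTA=100000   ! takes effect at next login
        UAF> EXIT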

--
*************************************************************************
*                        Here, there be dragons!                        *
*                                                                       *
*                                                Richard B. Gilbert     *
*                                           Computer Systems Consultant *
*************************************************************************

 
 
 

Performance issues - recommendations?

Post by Paddy O'Brien » Fri, 20 Dec 1996 04:00:00



Sigh -- another thread of which I have only just received a later
posting.  What I saw was Richard Gilbert's reply to Brendan Welch regarding a
student's problem with a Fortran program.

I have included Richard's post below my sig. for the context.

However, I thought I might just throw in 5 (minimum coinage) Australian cents.

We have some very large (virtual memory-wise) programs that are run on our local
clusters.  Assuming that the error message that Brendan saw was something like
"Quota exceeded", my approach is the following when one particular user (who is
too senior for me to question his programs, but also is required to supply
answers to upper management very quickly, so has no time either to do the
rationalising he would like) experiences this problem.

I ask him to re-link with a .MAP so that I can check the virtual memory of his
program, and then change his PGFLQUO accordingly.  This means that only he has
to logout/in.  Currently his PGFLQUO is at 200,000 and I run my Alpha server
with 500,000 VIRTUALPAGECNT (256 Mb of memory). [My own unprivileged account has
a PGFLQUO of 120,000 because a program that I help to maintain requires 85,000
of virtual memory, and using the DECwindows debugger adds overhead -- and I'm
fairly happy about the dimensioning and structure of that program :-) ]
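
Roughly, the procedure is (file names are placeholders; the figure I go by is
the virtual memory total in the image synopsis near the end of the map):

  $ LINK/MAP=PROG.MAP/FULL PROG
  $ SEARCH PROG.MAP "Virtual memory"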

The latter parameter was intentionally set high so that I need to reboot only
rarely.  This machine is running at 3 X 90% CPU for most of the time.  The
pagefiles only get minimally touched even with several "large" programs running
simultaneously.

I have seen no ill effects of these parameters (though I am only a part-time
mangler -- my boss likes to think that our systems survive with 1% of my time so
that I can devote the other 149% to programming). Nor with similar machinations
on lesser machines.

No doubt, one day I shall have time to learn better (and learn how to re-boot
more often).

I was also under the impression that linking a program needed a larger process
PGFLQUO than the running program would. This appears not to be so.

Regards, Paddy

Paddy O'Brien,
System Planning,
TransGrid,
PO Box A1000, Sydney South,
NSW 2000, Australia

Tel:   +61 2 284-3063
Fax:   +61 2 284-3148

** Richard Gilbert's response

Quote:>> <I wrote>
>> Second,  how much paging and swapping is going on on the 2100?
>> You seldom run out of memory on a virtual memory system; if your page
>> files are big enough, you never will.  The system just runs slower and
>> slower as disk is increasingly substituted for physical memory.  A
>> certain amount of paging can not be escaped because executable images
>> and shared libraries must be paged in when the image is activated.

>I have what I think is a related question, which just came up yesterday.

>Surprisingly, I have been able to get away for years without problems with
>programs with large dimensions, or even with understanding how a virtual
>system _really_ does work.

>A physics student brought me a Fortran program which, when he compiled with
>dimensions as large as he really wanted, would at the RUN command return
>an error message immediately.  This seems to be a function of the
>_dimension_ of the array, not with his actually using it; i.e.,
>DIMENSION ARRAY (100,100,100) but NPTS still = 42 instead of 100.

>In my ignorance I had been changing the parameters of his account.
>By default they were JTquota=4096, WSdef=1024, WSquo=2048, WSextent=4096,
>Pgflquo=32768, and by cut-and-try I multiplied them by 4, changed the
>dimension of the array to (84,84,84), and it all worked.  I tried
>watching the free memory (the only stupid way I knew how was via
>SHOW CLUSTER) which said 81% usage.  But multiplying the params by 5,
>and compiling with dimension 90 failed; by then I suspected I was doing
>unnecessary cut and try, and of course that if I _really_ ever knew
>what I was doing, I could have saved a lot of trouble.

>However, I suspect that a lot of other folks out there are secretly in
>the same or worse position.  I for one would appreciate either the
>long or short solution, especially about the part which says you
>"never" really need enough physical memory.  Was I really overloading
>the pagefile?  How can I get to (100,100,100)?

Well, the array in question appears to be REAL*4 so:
4*10^2*10^2*10^2 = 4*10^6 or 4,000,000 bytes!  That's just for that one
array.  You must also have space for the rest of the program code and
data.  It doesn't all have to be in physical memory at once, of course,
but you do need a virtual address space big enough to hold it all.  The
SYSGEN parameter VIRTUALPAGECNT limits the size of the virtual
address space for any process.  My guess is that you would need a
VIRTUALPAGECNT of 10,000 to 20,000 pages (5-10Mb) in order to link and
run this program.  On a small system, VIRTUALPAGECNT might well be less
than that.  The user will need a PGFLQUO of nearly the same size as
VIRTUALPAGECNT; the .EXE file will serve as the backing store for
the user's code, likewise the .EXE file for each shareable library
involved.  The user's variables, all 4Mb+ of them, will have to be
accomodated by his share of the paging file.

--
*************************************************************************
*                        Here, there be dragons!                        *
*                                                                       *
*                                                Richard B. Gilbert     *
*                                           Computer Systems Consultant *
*************************************************************************


 
 
 

Performance issues - recommendations?

Post by Hein RMS van den Heuvel » Sat, 28 Dec 1996 04:00:00



>purchase of an addition CPU for the 2100? Additional memory for the the 2100
:
>   Currently the CPU usage is running about 80% on the 2100 with memory
>averaging approx. 75%. Will adding another CPU help release memory or will

Depending on the free list goal settings, memory at 75% may mean that processes
were simply allowed to get all they wanted without ever being squeezed.
There may well be an opportunity to trim processes down (kick out init
and error handling code and such) without hurting performance.
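
A quick way to see whether that is the case (nothing here changes anything):

  $ SHOW MEMORY             ! free list and modified list sizes at a glance
  $ MCR SYSGEN
  SYSGEN> SHOW FREELIM      ! reclamation starts when free pages drop below this
  SYSGEN> SHOW FREEGOAL     ! ... and continues until this many are free again
  SYSGEN> EXIT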

Adding a CPU will not release memory, but it may give you the power to
handle more page faults should the application's memory needs grow.

I'd go for more and/or faster CPUs. Can you upgrade to 5/300 boards?

Quote:>   The 2 servers are being pelted by heavy CPU usage due to the running of
>defragmentaion processes on them. Memory seems to be sufficent but will

Stop running defraggers? Spend more time thinking about how to prevent
fragmentation? (Larger default file extents? Larger chunk sizes? Separate
small files and big files onto selected drives, or partitions if you know
how to get those?)
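
For example (device name and numbers are only illustrative):

  $ SET VOLUME/EXTENSION=200 DKA100:            ! bigger default file extents on that volume
  $ SET RMS_DEFAULT/EXTEND_QUANTITY=200/SYSTEM  ! or set the RMS default system-wide
  $ ! cluster ("chunk") size can only be chosen with INITIALIZE/CLUSTER_SIZE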

Are those MSCP servers? In that case I suppose more VCC cache isn't going
to help. For a file server, you may bump the VCC from its default of 3 MB
to, say, 20% of the memory.
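
If you do, something along these lines (sizes are in blocks; the 52000 below
is just 20% of 128 MB worked out, not a recommendation, and the change takes
a reboot):

  $ MCR SYSGEN
  SYSGEN> SHOW VCC_MAXSIZE    ! default is 6400 blocks, i.e. about 3 MB
  SYSGEN> EXIT
  $ ! then put  VCC_MAXSIZE = 52000  in MODPARAMS.DAT and run AUTOGEN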

Hope this helps,                +--------------------------------------+
Hein van den Heuvel, Digital.   | All opinions expressed are mine, and |
  "Makers of VMS and other | may not reflect those of my employer |
   fine Operating Systems."        +--------------------------------------+