top output - making meaning of mem

Post by Amita » Sat, 04 May 2002 21:20:01



Hi

I have a problem. The *top* command documentation says that SIZE and
RES give the total memory used and the main (resident) memory used by
a process.

On my machine (3500) with 10 GB RAM and 7 processors, I am getting a
large list of processes that have memory utilization of 1500M and
above. But swap utilization is only 8-9 GB.

The figures don't add up. How can I be using more memory than is
available (considering the figure is in MB)? Further, the processes
should not be using so much memory. I searched the Sun site to no
avail, and man top did not help either.

Another issue: I have very high CPU utilization. Can anybody point me
toward identifying the source of the problem?

Thanks

Amitabh

top output sample:

last pid: 18173;  load averages: 36.09, 29.62, 28.14                17:46:19
912 processes: 872 sleeping, 26 running, 1 zombie, 13 on cpu
CPU states:  0.0% idle, 90.2% user,  9.8% kernel,  0.0% iowait,  0.0% swap
Memory: 10G real, 773M free, 8598M swap in use, 19G swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 12085 oraprod    1  58    0 1580M 1561M sleep    0:07  0.00% oracle
  1492 oraprod   11  54    0 1536M 1515M sleep    3:34  0.61% oracle
 22004 oraprod   11  58    0 1536M 1517M sleep    1:05  0.00% oracle
 29288 oraprod   11  22    0 1535M 1516M sleep    5:26  0.00% oracle
 29127 oraprod   11  58    0 1535M 1515M sleep    1:40  0.00% oracle
 27541 oraprod   11  59    0 1533M 1514M sleep    1:26  0.07% oracle

 
 
 

Post by Antonio Dell'elc » Sat, 04 May 2002 23:00:16



> Hi

> I have a problem. The *top* command documentation says that SIZE and
> RES give the total memory used and the main (resident) memory used by
> a process.

> On my machine (3500) with 10 GB RAM and 7 processors, I am getting a
> large list of processes that have memory utilization of 1500M and
> above. But swap utilization is only 8-9 GB.

> The figures don't add up. How can I be using more memory than is
> available (considering the figure is in MB)? Further, the processes
> should not be using so much memory. I searched the Sun site to no
> avail, and man top did not help either.

> Another issue: I have very high CPU utilization. Can anybody point me
> toward identifying the source of the problem?

> Thanks

> Amitabh

> top output sample:

> last pid: 18173;  load averages: 36.09, 29.62, 28.14                17:46:19
> 912 processes: 872 sleeping, 26 running, 1 zombie, 13 on cpu
> CPU states:  0.0% idle, 90.2% user,  9.8% kernel,  0.0% iowait,  0.0% swap
> Memory: 10G real, 773M free, 8598M swap in use, 19G swap free

>    PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  12085 oraprod    1  58    0 1580M 1561M sleep    0:07  0.00% oracle
>   1492 oraprod   11  54    0 1536M 1515M sleep    3:34  0.61% oracle
>  22004 oraprod   11  58    0 1536M 1517M sleep    1:05  0.00% oracle
>  29288 oraprod   11  22    0 1535M 1516M sleep    5:26  0.00% oracle
>  29127 oraprod   11  58    0 1535M 1515M sleep    1:40  0.00% oracle
>  27541 oraprod   11  59    0 1533M 1514M sleep    1:26  0.07% oracle

The memory of each process includes, of course, all the shared memory
it uses.

Swap usage seems excessive to me for a healthy system.

About CPU usage, check top carefully or ask a sysadmin.

Antonio

 
 
 

Post by Alberto da Silv » Sat, 04 May 2002 23:17:38



> Hi

> I have a problem. The *top* command documentation says that SIZE and
> RES give the total memory used and the main (resident) memory used by
> a process.

SNIP

> last pid: 18173;  load averages: 36.09, 29.62, 28.14                17:46:19
> 912 processes: 872 sleeping, 26 running, 1 zombie, 13 on cpu
> CPU states:  0.0% idle, 90.2% user,  9.8% kernel,  0.0% iowait,  0.0% swap
> Memory: 10G real, 773M free, 8598M swap in use, 19G swap free

>    PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  12085 oraprod    1  58    0 1580M 1561M sleep    0:07  0.00% oracle
>   1492 oraprod   11  54    0 1536M 1515M sleep    3:34  0.61% oracle
>  22004 oraprod   11  58    0 1536M 1517M sleep    1:05  0.00% oracle
>  29288 oraprod   11  22    0 1535M 1516M sleep    5:26  0.00% oracle
>  29127 oraprod   11  58    0 1535M 1515M sleep    1:40  0.00% oracle
>  27541 oraprod   11  59    0 1533M 1514M sleep    1:26  0.07% oracle

The processes are attached to the same shared memory segment (SGA in oracle speak).
Try "pmap" for details of memory used.

With Solaris 8 you can use prstat (similar to top).

Alberto

 
 
 

Post by Darren Dunha » Sun, 05 May 2002 02:26:09



>> last pid: 18173;  load averages: 36.09, 29.62, 28.14                17:46:19
>> 912 processes: 872 sleeping, 26 running, 1 zombie, 13 on cpu
>> CPU states:  0.0% idle, 90.2% user,  9.8% kernel,  0.0% iowait,  0.0% swap
>> Memory: 10G real, 773M free, 8598M swap in use, 19G swap free

>>    PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>>  12085 oraprod    1  58    0 1580M 1561M sleep    0:07  0.00% oracle
>>   1492 oraprod   11  54    0 1536M 1515M sleep    3:34  0.61% oracle
>>  22004 oraprod   11  58    0 1536M 1517M sleep    1:05  0.00% oracle
>>  29288 oraprod   11  22    0 1535M 1516M sleep    5:26  0.00% oracle
>>  29127 oraprod   11  58    0 1535M 1515M sleep    1:40  0.00% oracle
>>  27541 oraprod   11  59    0 1533M 1514M sleep    1:26  0.07% oracle
> The memory of each process includes, of course, all the shared memory
> it uses.
> Swap usage seems excessive to me for a healthy system.

'swap' in this context means 'virtual memory'.  Since the machine has
10G of RAM, that's a perfectly normal amount.

Of course the load averages of 30+ on a production oracle box would
scare me.   The box is either overloaded (most likely), or there's some
resource contention problem.  

--

Unix System Administrator                    Taos - The SysAdmin Company
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

Post by Amita » Sun, 05 May 2002 15:56:11


Hi

Thanks for the reply. I did finally use pmap to identify the
_missing_ memory. 1.5 GB was being used for the SGA, and each oracle
process is actually using 20 MB. I calculated, and the figures finally
add up.
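
As a rough check of that arithmetic, here is a minimal sketch of why summing the per-process figures from top overstates real usage. The 1.5 GB SGA and 20 MB per process come from the post above; the process count of 400 and the function names are illustrative assumptions, not measured values.

```python
# Figures from the post: ~1.5 GB shared SGA, ~20 MB private memory per process.
SGA_MB = 1536        # shared segment, mapped by every oracle process
PRIVATE_MB = 20      # private memory per oracle process
NUM_PROCESSES = 400  # assumption for illustration only


def naive_total_mb(n, shared_mb, private_mb):
    """Sum the per-process figures: the shared segment is counted n times."""
    return n * (shared_mb + private_mb)


def actual_total_mb(n, shared_mb, private_mb):
    """Count the shared segment once, then add each process's private memory."""
    return shared_mb + n * private_mb


naive = naive_total_mb(NUM_PROCESSES, SGA_MB, PRIVATE_MB)
actual = actual_total_mb(NUM_PROCESSES, SGA_MB, PRIVATE_MB)
print(f"naive sum of per-process sizes: {naive} MB")
print(f"shared segment counted once:    {actual} MB")
```

Under these assumptions the naive sum is far larger than the machine's memory, while counting the shared segment once gives a total of the same order as the reported usage.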

However, the load average of 30+ (as Darren pointed out) is what got
me going on the exercise. I am trying to identify the bottleneck. Can
anybody give me some tips or a URL on how to find out whether it's due
to configuration, etc., or whether the system is simply underpowered to
handle the load?

Amitabh

 
 
 

Post by Darren Dunha » Wed, 08 May 2002 01:55:45



> Hi
> Thanks for the reply. I did finally use pmap to identify the
> _missing_ memory. 1.5 GB was being used for the SGA, and each oracle
> process is actually using 20 MB. I calculated, and the figures finally
> add up.
> However, the load average of 30+ (as Darren pointed out) is what got
> me going on the exercise. I am trying to identify the bottleneck. Can
> anybody give me some tips or a URL on how to find out whether it's due
> to configuration, etc., or whether the system is simply underpowered to
> handle the load?

'load' is simply a time average of the number of runnable processes on
the run queue.  

If you have one CPU bound process (like 'while(1){}'), the load on the
machine will tend to be '1'.  If you have 2 running simultaneously, only
one at a time will get CPU time on a single processor machine, but the
load will be '2' (or so).
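
The behaviour described above can be sketched with a small simulation. This assumes the load figure is an exponentially damped average of the run-queue length sampled at a fixed interval, which is how many Unix kernels compute it; the 5-second interval, the 60-second window and the function name are illustrative assumptions, not confirmed Solaris values.

```python
import math


def simulate_load(runnable, seconds, interval=5.0, window=60.0):
    """Exponentially damped average of a constant run-queue length.

    runnable: number of processes that always want the CPU (assumed constant)
    seconds:  how long the simulated machine has been running
    """
    decay = math.exp(-interval / window)
    load = 0.0
    for _ in range(int(seconds / interval)):
        # Blend the previous average with the current run-queue length.
        load = load * decay + runnable * (1.0 - decay)
    return load


# One busy loop settles near 1, two busy loops settle near 2,
# regardless of how many CPUs are available to run them.
print(round(simulate_load(1, 600), 2))
print(round(simulate_load(2, 600), 2))
```

Note that the number of CPUs does not appear anywhere in the calculation: the figure counts runnable processes, not how quickly they are being served.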

Whether a load is good or bad depends on the machine.  If you're running
something that is normally I/O bound, then even a load above 0.5 could
indicate trouble.  Other systems may be expected to run above n (where n
is the number of CPUs) for much of the time.
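
To put the numbers from this thread against that "n CPUs" yardstick, here is a small sketch that divides the posted load averages by the 7 processors mentioned in the original post. The variable names are made up for illustration, and the ratio is only a rough indicator, not a verdict on the machine.

```python
# Load averages (1, 5 and 15 minutes) from the posted top output.
load_averages = [36.09, 29.62, 28.14]
num_cpus = 7  # from the original post

# Runnable processes per CPU; values well above 1 mean processes are queueing.
per_cpu = [load / num_cpus for load in load_averages]

for load, ratio in zip(load_averages, per_cpu):
    print(f"load {load:5.2f} -> {ratio:.2f} runnable per CPU")
```

All three ratios come out well above 1, which is consistent with the 0.0% idle figure in the same top output.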

I wouldn't normally expect an oracle database server to run with a high
load.  

A load of 30 tells me that you must have a huge number of oracle
processes running.  What do you get for 'ps -u oracle | wc -l'?  Are
there lots of queries running?

Run 'prstat' and look for jobs in state 'run' or 'cpux'.  Those are the
running jobs.  

In the end, it's up to you.  You can run hundreds of simultaneous jobs
on the machine, but the performance of each will suffer.  There may be
nothing wrong with the system, but removing some jobs will likely
increase the performance of those that remain.

--

Unix System Administrator                    Taos - The SysAdmin Company
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

Post by Amita » Wed, 08 May 2002 15:43:01


Hi
Thanks for the reply.

> If you have one CPU bound process (like 'while(1){}'), the load on the
> machine will tend to be '1'.

I don't think there is a problem with one process, as CPU usage is
fairly evenly divided across processes. There are a few processes that
put a high load on the CPU, but only for a short duration. Further, we
are using a prepackaged product, so we can fairly rule out infinite loops.

> Whether a load is good or bad depends on the machine.  If you're running
> something that is normally I/O bound, then even a load above 0.5 could
> indicate trouble.  Other systems may be expected to run above n (where n
> is the number of CPUs) for much of the time.
>
> I wouldn't normally expect an oracle database server to run with a high
> load.

The machine is not only a database server, but is also running an
application server (9iAS - Forms Server), so I guess the processing load
is also high, but I am not sure it should be so high.

> A load of 30 tells me that you must have a huge number of oracle
> processes running.  What do you get for 'ps -u oracle | wc -l'?  Are
> there lots of queries running?

The numbers of oracle as well as forms server processes are indeed high
(~230 forms server connections, ~400 DB processes + reports server
processes).

> You can run hundreds of simultaneous jobs on the machine, but the
> performance of each will suffer.  There may be nothing wrong with the
> system, but removing some jobs will likely increase the performance of
> those that remain.

I wish I could remove some jobs, but the system has to support 230-240
concurrent users, and we are running the bare minimum for the
application to run.
However, we have come to the conclusion that an E3500 with 7 CPUs and
10 GB RAM is simply not sufficient(???) to support 240 concurrent users.
I was worried that it might be some setup problem with the DB and/or
kernel parameters. DB stats are healthy; we are currently investigating
kernel parameters.

thanks

Amitabh

 
 
 

1. Relationship of top output and prstat output?

If top shows process X as using 50% of CPU time (say, in a 4 process
machine) and prstat -v -p <pid of X> shows 20% USR and 80% sleep, is
this a direct breakdown of the top output? In other words, is process
X using 50*.20=10% of all system resources in SYS mode and 50*.80=40%
of all system resources sleeping?

Any other advice on the relationship between top output and prstat
output would be appreciated...

2. duplicat entries

3. Permissions on /dev/kmem and /dev/mem for top?

4. Help choosing a minimalist distribution...

5. Top reports ~35000K buf mem--why?

6. meanings of Unix directories

7. memory reported by dmesg and top does not agree with physical mem

8. Thanks and question on bootstrapping

9. Top & Mem Used

10. gcc2.6.3 -fforce-mem makes aha2742 boot fail

11. mem usage according to top out of control

12. Making RH5.1 see more than 64MB, mem=96MB doesn't work!!!!!!!

13. mem usage according to top out of control