Page outs with LOTSA memory: why?

Post by Jim Hutchis » Wed, 18 Aug 1999 04:00:00



I've read everything I can find on Solaris's memory
management, and I can't figure out why, with 768 MB of RAM, it's
still paging out.  The po column averages between 24 and 100 at any given time.

The only app it's running is a web cache/proxy product from Network
Appliance, so it's managing a huge number of object files.
Here's a clip...

# vmstat 3
 procs     memory             page              disk          faults       cpu
 r b w    swap   free  re  mf  pi  po  fr de sr m0 m1 m2 m3   in   sy   cs us sy id
 0 0 0  758192 325120   0  74  32  18  55  0  5  1  1  1  0  909 2218 1060  6 16 78
 0 1 0 2636576  23768   0   1  42  13 330  0 35  0  0  0  0 1298 5598 3318 17 35 48
 0 0 0 2636568  23760   0   0   8  58 208  0 18  0  0  0  0 1166 3485 1796 14 30 57
 0 0 0 2636568  23768   0   0  13  13  82  0  6  0  0  0  0  927 3943 1680 11 30 59
 0 0 0 2636568  23768   0   0   0  58 186  0 11  0  0  0  0 1408 4989 2033 16 35 50
 0 1 0 2638040  25584   2 306  74 149 301  0 18 11 10 11  0 1556 4339 1722 22 30 49
 0 0 0 2639208  26560   0   0   5  45  45  0  0  0  0  0  0  984 3539 1589 10 29 61
 0 0 0 2639208  26312   0   0  24  42  42  0  0  0  0  0  0 1446 5102 2840 12 35 52

The box is an E-450 with 2 CPUs.  You'd think it was almost
over-powered...

Can anyone shed light on this?

Post by Alan Stang » Wed, 18 Aug 1999 04:00:00



> I've read up on everything I can regarding Solaris's memory
> management, and can't figure out why, with 768 MB of RAM, that it's
> still paging out.  It averages between 24 and 100 at any given time.

> The only app it's running is a web cache/proxy product from Network
> Appliance, therefore there's a huge number of object files it's
> managing.  Here's a clip...

> # vmstat 3
>  procs     memory             page              disk          faults       cpu
>  r b w    swap   free  re  mf  pi  po  fr de sr m0 m1 m2 m3   in   sy   cs us sy id
>  0 0 0  758192 325120   0  74  32  18  55  0  5  1  1  1  0  909 2218 1060  6 16 78
>  0 1 0 2636576  23768   0   1  42  13 330  0 35  0  0  0  0 1298 5598 3318 17 35 48
>  0 0 0 2636568  23760   0   0   8  58 208  0 18  0  0  0  0 1166 3485 1796 14 30 57
>  0 0 0 2636568  23768   0   0  13  13  82  0  6  0  0  0  0  927 3943 1680 11 30 59
>  0 0 0 2636568  23768   0   0   0  58 186  0 11  0  0  0  0 1408 4989 2033 16 35 50
>  0 1 0 2638040  25584   2 306  74 149 301  0 18 11 10 11  0 1556 4339 1722 22 30 49
>  0 0 0 2639208  26560   0   0   5  45  45  0  0  0  0  0  0  984 3539 1589 10 29 61
>  0 0 0 2639208  26312   0   0  24  42  42  0  0  0  0  0  0 1446 5102 2840 12 35 52

> The box is an E-450 with 2 CPUs.  You'd think it was almost
> over-powered...

It is overpowered.  The system, in the state shown above, is essentially
idle.  The average idle time is > 50% and the user time is around 15%
on average.  The rest is probably iowait, which is a fancy form of
idle time.

> Can anyone shed light on this?

The po column includes all disk page out traffic.  Simply doing an

open(some file)
write(to that file)
close(some file)

will generate non-zero values in the po column.  This doesn't mean the
system is paging in any way or that the system is short on memory.  It
simply means that data is being written to disk.
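As a minimal sketch of the open/write/close sequence described above (the function name `write_file` and the demo file name are made up for illustration), the following writes 1 MB to a scratch file. On a system where file writes go through the paging path, as described in this thread, watching `vmstat` while something like this runs should show po activity even with plenty of free memory; that is an expectation from the discussion, not something the script measures.

```python
import os
import tempfile


def write_file(path: str, payload: bytes) -> int:
    """Open, write, close: the three calls named in the post.

    Returns the number of bytes written. The name and signature are
    illustrative only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(payload):
            written += os.write(fd, payload[written:])
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "po-demo.dat")  # hypothetical file name
        n = write_file(target, b"x" * (1 << 20))
        print(f"wrote {n} bytes to {target}")
```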

The column of interest in this case is sr, which measures how
aggressively the system is searching for some free memory.

Cockcroft's book has an excellent discussion of this.
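To compare the two columns side by side, here is a small sketch that pulls po and sr out of the vmstat sample posted at the top of the thread. The helper name `column` is an assumption for illustration; it looks columns up by header name, so it only works for names that appear once in the header (po and sr do, sy does not).

```python
# vmstat rows copied from the original post, with wrapped lines rejoined.
SAMPLE = """\
r b w swap free re mf pi po fr de sr m0 m1 m2 m3 in sy cs us sy id
0 0 0 758192 325120 0 74 32 18 55 0 5 1 1 1 0 909 2218 1060 6 16 78
0 1 0 2636576 23768 0 1 42 13 330 0 35 0 0 0 0 1298 5598 3318 17 35 48
0 0 0 2636568 23760 0 0 8 58 208 0 18 0 0 0 0 1166 3485 1796 14 30 57
0 0 0 2636568 23768 0 0 13 13 82 0 6 0 0 0 0 927 3943 1680 11 30 59
0 0 0 2636568 23768 0 0 0 58 186 0 11 0 0 0 0 1408 4989 2033 16 35 50
0 1 0 2638040 25584 2 306 74 149 301 0 18 11 10 11 0 1556 4339 1722 22 30 49
0 0 0 2639208 26560 0 0 5 45 45 0 0 0 0 0 0 984 3539 1589 10 29 61
0 0 0 2639208 26312 0 0 24 42 42 0 0 0 0 0 0 1446 5102 2840 12 35 52
"""


def column(text: str, name: str) -> list[int]:
    """Return one vmstat column as integers, looked up by header name."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    idx = header.index(name)  # first match; fine for unique names like po/sr
    return [int(row[idx]) for row in rows]


if __name__ == "__main__":
    print("po:", column(SAMPLE, "po"))
    print("sr:", column(SAMPLE, "sr"))
```

In this sample po ranges from 13 to 149 while sr never exceeds 35, which is the distinction being drawn here: data is being written, but the scanner is barely running.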

Post by Mike Minamot » Wed, 18 Aug 1999 04:00:00


Jim,

I recall reading this in Cockcroft/Pettit's Sun Performance and Tuning
book.  I believe the two snippets quoted from there will help you.

"Don't panic when you see page-ins and page-outs in vmstat.
These activities are normal since all filesystem I/O is done by means of
the paging process. Hundreds or thousands of kilobytes paged in and paged
out are not a cause for concern, just a sign that the system is working hard."

This may address your specific concern.

"Use page scanner "sr" activity as your RAM shortage indicator.

When you really are short of memory, the scanner will be running
continuously at a high rate (over 200 pages/second averaged over 30 seconds).
If it runs in separated high-level bursts and you are running Solaris 2.5 or
earlier, make sure you have a recent kernel patch installed; an updated paging
algorithm in Solaris 2.5.1 was backported to previous releases."
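The rule of thumb quoted above (sr over 200 pages/second averaged over 30 seconds) can be sketched as a rolling-average check. The function name `ram_shortage` and its handling of short samples are assumptions for illustration, not anything taken from the book. The sr values are the ones from the vmstat clip in the original post, minus the first row, which vmstat reports as an average since boot.

```python
from statistics import mean


def ram_shortage(sr_samples, interval_s, window_s=30, threshold=200):
    """Return True if sr averages above `threshold` over any `window_s` span.

    sr_samples: scan-rate values, one per vmstat interval.
    interval_s: the vmstat interval in seconds (3 for `vmstat 3`).
    If fewer samples than one full window are available, the average of
    what is there is used instead (an assumption made for this sketch).
    """
    if not sr_samples:
        return False
    n = max(1, window_s // interval_s)  # samples per window
    if len(sr_samples) < n:
        return mean(sr_samples) > threshold
    return any(
        mean(sr_samples[i:i + n]) > threshold
        for i in range(len(sr_samples) - n + 1)
    )


if __name__ == "__main__":
    posted_sr = [35, 18, 6, 11, 18, 0, 0]  # from the original post
    print("RAM shortage suspected:", ram_shortage(posted_sr, interval_s=3))
```

A single high burst followed by quiet intervals does not trip the check, which matches the distinction the quote makes between continuous scanning and separated bursts.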

Hope this helps you...



--
Mike Minamoto
Science Applications International Corporation

Post by Edmond van A » Thu, 19 Aug 1999 04:00:00


> > The box is an E-450 with 2 CPUs.  You'd think it was almost
> > over-powered...

> It is overpowered.  The system, in the state shown above, is essentially
> idle.   The average idle time is > 50% and the user time is around 15%
> on average.  The rest is probably iowaits....which is a fancy form of
> idle time.

Alan,

I disagree on this. We have 2 E4500 boxes. One has an A3500 (+800GB, 16
LUNs), vxfs and hardware RAID5. The other has just one A1000. The box
with the A3500 is idle but shows the following...

11:58:46    %usr    %sys    %wio   %idle
11:58:48       0       0      41      59
11:58:50       0       0      31      69
11:58:52       0       0      30      69
11:58:54       0       0      41      59
11:58:56       0       0      30      70

Average        0       0      35      65

The other one with an A1000, which is idle too, shows this...

11:52:45    %usr    %sys    %wio   %idle
11:52:47       0       0       0     100
11:52:49       0       0       0     100
11:52:51       0       2       2      97
11:52:53       0       0       0     100
11:52:55       0       0       0     100

Average        0       0       0      99

So what is the first machine waiting for? If there is no activity on the
machine whatsoever, why does it show %wio when there are no processes in
the queue?

Both machines are 4 CPU, 1GB memory machines.

Any suggestions anyone?

Kind regards

Edmond van As
Sun TSG Tier 3 Technical Support Group
Lucent Technologies

<-- One day I will wake up, and it will all fit together. -->

Post by Alan Stang » Thu, 19 Aug 1999 04:00:00



> > > The box is an E-450 with 2 CPUs.  You'd think it was almost
> > > over-powered...

> > It is overpowered.  The system, in the state shown above, is essentially
> > idle.   The average idle time is > 50% and the user time is around 15%
> > on average.  The rest is probably iowaits....which is a fancy form of
> > idle time.

> Alan,

> I disagree on this. We have 2 E4500 boxes. One with an A3500 (+800GB, 16
> LUN's), vxfs and hardware RAID5. the other has just one A1000. The box
> with the A3500 is idle but shows the following...

> 11:58:46    %usr    %sys    %wio   %idle
> 11:58:48       0       0      41      59
> 11:58:50       0       0      31      69
> 11:58:52       0       0      30      69
> 11:58:54       0       0      41      59
> 11:58:56       0       0      30      70

> Average        0       0      35      65

> The other one with an A1000, which is idle too, shows this...

> 11:52:45    %usr    %sys    %wio   %idle
> 11:52:47       0       0       0     100
> 11:52:49       0       0       0     100
> 11:52:51       0       2       2      97
> 11:52:53       0       0       0     100
> 11:52:55       0       0       0     100

> Average        0       0       0      99

> So what is the first machine waiting for? If there is no activity on the
> machine whatsoever, why does it show %wio when there are no processes in
> the queue?

> Both machines are 4 CPU, 1GB memory machines.

> Any suggestions anyone?

I can't tell you what process or kernel thread is generating the disk I/Os.
Obviously something is generating them, since they are there.  Note that
something like an NFS server will generate disk activity, even though there
is no process in a queue.

But it should be noted that, by definition, both of these systems are idle.
iowaits are really idle time for the CPU.  A simple way of seeing this is
to run a small compute-bound program, one for each CPU.  You'll see the
system will never have an iowait time in this case, even though lots of I/O
may be happening.  iowait simply estimates CPU time that could've been used
to compute something, had there been something to compute (I'm simplifying
the details).
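The experiment described above, one compute-bound program per CPU, can be sketched like this. It launches one busy-looping child interpreter per CPU reported by the OS; the names `SPIN_SNIPPET` and `burn_all_cpus` are made up for illustration. Run it while watching the %wio column and, per the argument above, iowait should drop away because no CPU is left idle.

```python
import os
import subprocess
import sys

# Child program: spin in a tight loop until the deadline passes.
SPIN_SNIPPET = (
    "import sys, time\n"
    "deadline = time.monotonic() + float(sys.argv[1])\n"
    "while time.monotonic() < deadline:\n"
    "    pass\n"
)


def burn_all_cpus(seconds: float) -> list[int]:
    """Start one compute-bound child per CPU; return their exit codes."""
    ncpu = os.cpu_count() or 1
    children = [
        subprocess.Popen([sys.executable, "-c", SPIN_SNIPPET, str(seconds)])
        for _ in range(ncpu)
    ]
    return [child.wait() for child in children]


if __name__ == "__main__":
    # 30 seconds gives a few sampling intervals to observe.
    codes = burn_all_cpus(30.0)
    print(f"{len(codes)} spinners finished, exit codes: {codes}")
```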

The iowait times on Solaris 2.x (not Solaris 7) are incorrectly calculated.
To simplify the discussion, the "wait" time is calculated as if all the CPUs
on a system were waiting, even if only one I/O operation is pending.  In other
words, iowait times are greatly overestimated in the older Solaris releases
on multi-CPU systems.  See
http://www.sunworld.com/sunworldonline/swol-10-1998/swol-10-perf.html for a
discussion.
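A toy model of that simplification, to show the size of the effect. Both functions are illustrative assumptions built only from the sentence above; neither is the actual kernel accounting for any Solaris release. The first charges every idle CPU as waiting whenever any I/O is pending; the second, a hypothetical alternative for comparison, charges at most one idle CPU per pending I/O.

```python
def wio_all_cpus(ncpu: int, busy: int, pending_io: int) -> float:
    """%wio if every idle CPU is counted as waiting when any I/O is pending."""
    idle = ncpu - busy
    return 100.0 * idle / ncpu if pending_io > 0 else 0.0


def wio_per_io(ncpu: int, busy: int, pending_io: int) -> float:
    """%wio if at most one idle CPU is charged per pending I/O (hypothetical)."""
    idle = ncpu - busy
    return 100.0 * min(idle, pending_io) / ncpu


if __name__ == "__main__":
    # A 4-CPU box with nothing running and a single I/O outstanding.
    print("all-CPUs accounting:", wio_all_cpus(4, 0, 1))
    print("per-I/O accounting: ", wio_per_io(4, 0, 1))
    # All CPUs busy computing: no idle time, so no iowait under either model.
    print("all CPUs busy:      ", wio_all_cpus(4, 4, 10))
```

Under the first model a single outstanding I/O on an otherwise idle 4-CPU machine reports 100% wait, against 25% under the second, and both report zero once every CPU is busy.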

In any case, the system for which I made my comments was an idle system.
Its disk subsystem may have been a bottleneck and completely limited, but
the CPUs had plenty of idle cycles.  Whether you agree or not, the CPUs are
idle.

Post by Philip Bro » Wed, 25 Aug 1999 04:00:00



>Alan,

>I disagree on this. We have 2 E4500 boxes. One with an A3500 (+800GB, 16
>LUN's), vxfs and hardware RAID5. the other has just one A1000. The box
>with the A3500 is idle but shows the following...

>11:58:46    %usr    %sys    %wio   %idle
>11:58:48       0       0      41      59
>...

>The other one with an A1000, which is idle too, shows this...

>11:52:45    %usr    %sys    %wio   %idle
>11:52:47       0       0       0     100
>11:52:49       0       0       0     100
>11:52:51       0       2       2      97
>...
>Any suggestions anyone?

suggestion:  RAID5 sucks. ditch it for 0+1.
or get a better hardware controller, or gobs more raid cache, etc. etc.
[or turn ON controller cache ;-)]

--
[Trim the no-bots from my address to reply to me by email!]
[ Do NOT email-CC me on posts. Pick one or the other.]
 --------------------------------------------------
The word of the day is mispergitude