high values for smtx in mpstat (Solaris performance question)

high values for smtx in mpstat (Solaris performance question)

Post by legin1.. » Fri, 10 Feb 2006 13:43:56



hi there,

I'm looking for some performance tuning tips to do with digging deeper
into mpstat and lockstat.

I have a machine performing poorly and I can find no info on the web
about where to go with lockstat output.  Details are as follows:

The machine has 2 cpus & 4 GB of ram

# psrinfo -v
Status of virtual processor 0 as of: 02/07/2006 16:11:30
  on-line since 12/11/2005 10:05:16.
  The sparcv9 processor operates at 1281 MHz,
        and has a sparcv9 floating point processor.
Status of virtual processor 1 as of: 02/07/2006 16:11:30
  on-line since 12/11/2005 10:05:15.
  The sparcv9 processor operates at 1281 MHz,
        and has a sparcv9 floating point processor.
# prtconf -v | grep Memo
Memory size: 4096 Megabytes

The load average is quite high

$ sar -q -f /var/adm/sa/sa20060206
<snip>
08:00:01    29.6      39     0.0       0
08:20:00    34.6      45     0.0       0
08:40:01    34.6      52     0.0       0
09:00:00    39.1      75     0.0       0
09:20:00    40.5      55     0.0       0
09:40:00    38.6      57     0.0       0
10:00:00    43.7      72     0.0       0
10:20:00    28.7      85     0.0       0
10:40:00    26.7      92     0.0       0
11:00:00    41.6      64     0.0       0
11:20:00    45.2      37     0.0       0
11:40:00    57.0      37     0.0       0
12:00:01    70.0      40     0.0       0
12:20:00    58.8      42     0.0       0
12:40:01    51.0      28     0.0       0
13:00:00    75.3      23     0.0       0
13:20:00    52.3      24     0.0       0
13:40:00    41.2      46     0.0       0
14:00:00    54.8      47     0.0       0
14:20:00    34.0      57     0.0       0
14:40:00    53.7      47     0.0       0
15:00:00    43.3      58     0.0       0
15:20:00    41.0      40     0.0       0
15:40:00    53.5      36     0.0       0
16:00:00    64.0      22     0.0       0
16:20:00    44.6      17     0.0       0
16:40:00    76.0       9     0.0       0
17:00:00    72.2      11     0.0       0
17:20:00    52.9       7     0.0       0
<snip>
Average     40.6      24     0.0       0

The cpu utilisation is high, but not 100% all day:

$ sar -u -f /var/adm/sa/sa20060206
<snip>
00:00:00    %usr    %sys    %wio   %idle
<snip>
08:00:01      48      19      26       6
08:20:00      44      21      25      10
08:40:01      50      22      23       6
09:00:00      56      28      16       0
09:20:00      49      24      26       1
09:40:00      54      24      19       3
10:00:00      62      25      11       2
10:20:00      73      22       5       0
10:40:00      76      21       3       0
11:00:00      64      23       8       5
11:20:00      41      20      20      19
11:40:00      38      20      15      26
12:00:01      35      28       9      28
12:20:00      49      21      12      18
12:40:01      39      16      12      34
13:00:00      26      20       7      46
13:20:00      32      16      18      33
13:40:00      53      21      18       8
14:00:00      44      25      11      21
14:20:00      54      24      16       6
14:40:00      44      25      17      14
15:00:00      51      25       8      15
15:20:00      48      21      24       6
15:40:00      50      18      17      16
16:00:00      31      17      15      37
16:20:00      31      14      26      29
16:40:00      16       9       7      67
17:00:00      32      11       5      53
<snip>
Average       32      14      19      35

Normally, my analysis of a high load average combined with non-zero
idle cpu would be that processes waiting on small i/o's are being
included in the run queue.   However, the machine is noticeably slow to
use, so I don't think it's a case of just one metric being out of
whack.

It would seem that there is contention for CPU resources & kernel
mutexes:

# mpstat 2 20
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  193  34  196    93    2   54  162  311   40    0   215   20  10  16  54
  1  203  35  468   503  359  663  189  311  125    0   783   22  11  16  51
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  590 136 1535   131    1 6970  146 1834 1251    4 19560   32  12  47   8
  1  755 108 1381   654  520 8379  130 1926 1233    1 20421   32  18  43   6
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  141 116 1213   111    1 7477  116 2247 1497    0 14384   25  18  46  10
  1  400 144 1593   699  563 8843  177 2167 1498    0 18754   36  14  43   7
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   96 120 1317   114    1 6972  138 1974 1246    0 18495   17  11  56  16
  1  213 113 1445   734  602 7616  124 1945 1291    0 17512   25  13  54   8
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   57 127 1208   134    1 7839  159 2001 1474    0 15872   31  13  50   6
  1    7 148 1430   704  572 8065  131 2016 1412    0 18480   27  15  54   4
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  215 164 1517   169    1 8130  199 2204 1164    0 18284   39  11  48   2
  1  321  88 1139   694  560 7671  135 2193 1480    0 14894   36  22  35   7

The high value for stalled mutexes (smtx) leads me to lockstat, which
shows that one system call is generating almost all the mutex stalls
and the mutex blocks:

# lockstat -i 971 sleep 15

Adaptive mutex spin: 44962 events in 15.148 seconds (2968 events/sec)

Count indv cuml rcnt     spin Lock                   Caller
-------------------------------------------------------------------------------
43097  96%  96% 1.00        2 0x30000056568          cv_wait_sig+0x1a4
<snip>
Adaptive mutex block: 239 events in 15.148 seconds (16 events/sec)

Count indv cuml rcnt     nsec Lock                   Caller
-------------------------------------------------------------------------------
  232  97%  97% 1.00   163433 0x30000056568          cv_wait_sig+0x1a4
<snip>

The high value for xcal (in mpstat) might be reduced by increasing
segmap_percent, but I find this unlikely as the memory situation is
generally healthy, ie the file cache has free memory available to it:

$ /opt/RMCmem/bin/prtmem

Total memory:            3916 Megabytes
Kernel Memory:            544 Megabytes
Application:             1394 Megabytes
Executable & libs:         21 Megabytes
File Cache:               400 Megabytes
Free, file cache:        1551 Megabytes
Free, free:                 6 Megabytes

$
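
For reference, the current value of the segmap_percent tunable can be read
from the running kernel; a one-liner sketch, using the mdb -k that ships
with Solaris 9:

        # print segmap_percent from the live kernel as a decimal integer
        echo "segmap_percent/D" | mdb -k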

Virtual Adrian has nothing to say except for cpu busy:

# /opt/RICHPse/bin/se virtual_adrian.se
<snip>
Adrian detected CPU overload (red): Tue Feb  7 16:14:12 2006
CPU overload
  procs        memory         page             faults              cpu
 r  b  w    swap    free    pi  po  sr  rt  in    sy    cs  smtx us sy id
17  1  0 10602132 1457662  3756  28   0 600 785 31207 16015  2789 33 15 53

Adrian detected CPU overload (amber): Tue Feb  7 16:14:42 2006
CPU busy
  procs        memory         page             faults              cpu
 r  b  w    swap    free    pi  po  sr  rt  in    sy    cs  smtx us sy id
 6  1  0 10598704 1458007  2911  24   0 600 793 37778 16319  2923 28 13 59

Adrian detected CPU overload (red): Tue Feb  7 16:15:12 2006
CPU overload
  procs        memory         page             faults              cpu
 r  b  w    swap    free    pi  po  sr  rt  in    sy    cs  smtx us sy id
29  1  0 10597442 1456310  3462  50   0 600 827 35931 16184  2740 37 17 46

The load on network, memory and disk seems negligible.  I have another
12 machines with identical hardware, identical o/s configuration (built
by Flash), and very similar database configuration and user usage.
None of them are experiencing these problems.

Any ideas on how I can progress the investigation into cv_wait_sig?
Truss doesn't show kernel calls, and I don't have DTrace as this is
Solaris 9, not Solaris 10.

Any help appreciated,     (especially from legends like Casper Dik,
Brendan Gregg or Adrian!!!)  :)

cheers
nf

 
 
 

high values for smtx in mpstat (Solaris performance question)

Post by Hans » Sat, 11 Feb 2006 21:23:11


snip...
Quote:

> The load on network, memory and disk seems negligible.  I have another
> 12 machines with identical hardware, identical o/s configuration (built
> by Flash), and very similar database configuration and user usage.
> None of them are experiencing these problems.

> Any ideas on how I can progress the investigation into cv_wait_sig?
> Truss doesn't show kernel calls, and I don't have DTrace as this is
> Solaris 9, not Solaris 10.

> Any help appreciated,     (especially from legends like Casper Dik,
> Brendan Gregg or Adrian!!!)  :)

> cheers
> nf

-------------------------------------------------------------------

Since there is an ongoing discussion on the performance effects of memory
interleave, why don't you compare the memory configurations on your machines
(prtdiag).  Maybe some difference will turn up.

 
 
 

high values for smtx in mpstat (Solaris performance question)

Post by Brendan Gregg » Thu, 16 Feb 2006 12:46:15


G'Day,

On 8 Feb 2006 legin1...@yahoo.com wrote:

> hi there,

> I'm looking for some performance tuning tips to do with digging deeper
> into mpstat and lockstat.

> I have a machine performing poorly and I can find no info on the web
> about where to go with lockstat output.  Details are as follows:

> The machine has 2 cpus & 4 GB of ram

> # psrinfo -v
> Status of virtual processor 0 as of: 02/07/2006 16:11:30
>   on-line since 12/11/2005 10:05:16.
>   The sparcv9 processor operates at 1281 MHz,
>         and has a sparcv9 floating point processor.
> Status of virtual processor 1 as of: 02/07/2006 16:11:30
>   on-line since 12/11/2005 10:05:15.
>   The sparcv9 processor operates at 1281 MHz,
>         and has a sparcv9 floating point processor.
> # prtconf -v | grep Memo
> Memory size: 4096 Megabytes

> The load average is quite high

> $ sar -q -f /var/adm/sa/sa20060206
> <snip>
> 08:00:01    29.6      39     0.0       0
> 08:20:00    34.6      45     0.0       0
[...]
> <snip>
> Average     40.6      24     0.0       0

Wow, that's a huge runq-sz (40), and with a %runocc of 24 it sounds
like the workload is in bursts. Maybe a daemon is creating a ton of
worker processes or threads for a request? (I've seen this sort of thing
for poorly written CGI on web servers, where they are creating a ton of
perl processes but not using modperl).

Since it's Solaris 9, we can't run a ton of DTrace scripts that would
help. Hmm. If it's worker processes, try running "sar -c", to see if many
exec()s are happening. The other system call types will also be
interesting. It may also help to initialise process accounting for a
short period, to capture data on short lived processes,

        /usr/lib/acct/startup      # turn process accounting on
        sleep 15                   # let it capture a short window of activity
        /usr/lib/acct/shutacct     # turn accounting back off
        lastcomm                   # list the commands recorded in that window

process accounting's details are sketchy, but it's usually too much effort
to configure BSM auditing to do this properly.
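
On the sar -c suggestion above, something along these lines shows the
per-second fork and exec rates (just a sketch; adjust the interval to taste):

        # live system call rates every 5 seconds; watch fork/s and exec/s
        sar -c 5 6

        # or pull the same counters out of an existing sar data file
        sar -c -f /var/adm/sa/sa20060206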

... if it's worker threads, it is more difficult to measure. as it's
Solaris 9, we can reliably run "prstat -m" and look at the NLWP count to
see how many threads a process has. In fact, prstat -m is interesting
regardless, and I think you've tried everywhere else!
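
To put numbers on that, a rough sketch (the nlwp keyword for ps is from
memory, so check ps(1) on your box first):

        # per-process microstate accounting; LAT is time spent waiting for
        # a CPU, VCX/ICX are voluntary/involuntary context switches
        prstat -m 5 1

        # quick hunt for processes carrying an unusually large thread count
        ps -eo nlwp,pid,args | sort -rn | head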


> The cpu utilisation is high, but not 100% all day:

> $ sar -u -f /var/adm/sa/sa20060206
> <snip>
> 00:00:00    %usr    %sys    %wio   %idle
> <snip>
> 08:00:01      48      19      26       6
> 08:20:00      44      21      25      10
[...]
> <snip>
> Average       32      14      19      35

> Normally, my analysis of a high load average combined with non-zero
> idle cpu would be that processes waiting on small i/o's are being
> included in the run queue.   However, the machine is noticeably slow to
> use, so I don't think it's a case of just one metric being out of
> whack.

ok, so a ton of processes leap onto the run queues when i/o is available.
Gee, I'd love DTrace right now.....


> It would seem that there is contention for CPU resources & kernel
> mutexes:

> # mpstat 2 20
> CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
>   0  193  34  196    93    2   54  162  311   40    0   215   20  10  16  54
>   1  203  35  468   503  359  663  189  311  125    0   783   22  11  16  51
> CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
>   0  590 136 1535   131    1 6970  146 1834 1251    4 19560   32  12  47   8
>   1  755 108 1381   654  520 8379  130 1926 1233    1 20421   32  18  43   6
[...]

> The high value for stalled mutexes (smtx) leads me to lockstat, which

there are a fair few xcals as well, which could be a number of things -
including CPUs receiving completed I/Os for waiting threads that are
sitting on another CPU's dispatcher queue (which may well explain
the cv_wait_sig() lock contention below). Or, the xcals are for cache
coherency due to terminating worker processes or threads (explaining the
high bursty runq-sz).

(needless to say, DTrace can easily examine xcalls!)
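
Without DTrace you can still watch the raw cross-call counters; a sketch
using kstat(1M) (the statistic path is from memory, so treat it as an
assumption):

        # cumulative per-CPU cross-call counts; sample twice and diff,
        # or just keep an eye on mpstat's xcal column
        kstat -p 'cpu_stat:::xcal'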


> shows that one system call is generating almost all the mutex stalls
> and the mutex blocks:

> # lockstat -i 971 sleep 15

> Adaptive mutex spin: 44962 events in 15.148 seconds (2968 events/sec)

> Count indv cuml rcnt     spin Lock                   Caller
> -------------------------------------------------------------------------------
> 43097  96%  96% 1.00        2 0x30000056568          cv_wait_sig+0x1a4
> <snip>
> Adaptive mutex block: 239 events in 15.148 seconds (16 events/sec)

> Count indv cuml rcnt     nsec Lock                   Caller
> -------------------------------------------------------------------------------
>   232  97%  97% 1.00   163433 0x30000056568          cv_wait_sig+0x1a4
> <snip>

Good work. Try "lockstat -i 971 -s 20 -D 3 sleep 15", to get a larger
stack backtrace for the caller.
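
That is, something like:

        # -s 20 : record stack traces up to 20 frames deep, not just the caller
        # -D 3  : only print the top 3 entries for each event type
        lockstat -i 971 -s 20 -D 3 sleep 15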

> The high value for xcal (in mpstat) might be reduced by increasing
> segmap_percent, but I find this unlikely as the memory situation is
> generally healthy, ie the file cache has free memory available to it:

> $ /opt/RMCmem/bin/prtmem

> Total memory:            3916 Megabytes
> Kernel Memory:            544 Megabytes
> Application:             1394 Megabytes
> Executable & libs:         21 Megabytes
> File Cache:               400 Megabytes
> Free, file cache:        1551 Megabytes
> Free, free:                 6 Megabytes

Good stuff, no problems there. (oh - in Sol 9 this can now be run as
"::memstat" from mdb -k).


> Virtual Adrian has nothing to say except for cpu busy:

> # /opt/RICHPse/bin/se virtual_adrian.se
> <snip>
> Adrian detected CPU overload (red): Tue Feb  7 16:14:12 2006
> CPU overload
>   procs        memory         page             faults              cpu
>  r  b  w    swap    free    pi  po  sr  rt  in    sy    cs  smtx us sy id
> 17  1  0 10602132 1457662  3756  28   0 600 785 31207 16015  2789 33 15 53
[...]

> The load on network, memory and disk seems negligible.  I have another
> 12 machines with identical hardware, identical o/s configuration (built
> by Flash), and very similar database configuration and user usage.
> None of them are experiencing these problems.

oh - this is a database! hmmm. worker threads? ...

> Any ideas on how I can progress the investigation into cv_wait_sig?
> Truss doesn't show kernel calls, and I don't have DTrace as this is
> Solaris 9, not Solaris 10.

more stack backtrace as above. DTrace would make this easy, but... :)

no worries,

Brendan

[Sydney, Australia]


> Any help appreciated,     (especially from legends like Casper Dik,
> Brendan Gregg or Adrian!!!)  :)

> cheers
> nf

 
 
 

high values for smtx in mpstat (Solaris performance question)

Post by legin1.. » Thu, 23 Feb 2006 09:36:02


Brendan,

thanks for the great tips!!   By the way, I'm still digesting all the
good info I learnt from you at a Solaris performance tuning course in
Brisbane a few years ago.

cheers
Nigel

 
 
 

high values for smtx in mpstat (Solaris performance question)

Post by legin1.. » Thu, 23 Feb 2006 09:49:38


A couple of issues that have turned up:

1.   one of the 2 channels to my storage array (se3310) slipped from
80M to ASYNC mode thanks to bug 5007911.   Have used vxdmpadm iostat,
and vxdmpadm setattr ... iopolicy to prove and work around the speed
slowdown until we can schedule a reboot.

2.   segmap_percent is set to 12%, and ::memstat indicates that the
cache never grows beyond 12%.     The document
http://docs.sun.com/source/817-1836-10/perfplat.html says that:

"default value is 12; that is, the kernel will reserve enough space to
map at most 12% of memory for the file system cache"

key phrase being "at most".  This wasn't my understanding previously.
Does this gel with anyone else's experience?
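
For anyone playing along at home, the tunable itself lives in /etc/system;
a sketch (12 is the documented default, so the line below just makes the
default explicit, and it only takes effect after a reboot):

        * /etc/system: size of the kernel's segmap (file cache mapping)
        * segment, as a percentage of physical memory
        set segmap_percent=12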

cheers
Nigel

 
 
 

high values for smtx in mpstat (Solaris performance question)

Post by Richard L. Hamilton » Sat, 25 Feb 2006 11:08:18




Quote:> A couple of issues that have turned up:

> 1.   one of the 2 channels to my storage array (se3310) slipped from
> 80M to ASYNC mode thanks to bug 5007911.   Have used vxdmpadm iostat,
> and vxdmpadm setattr ... iopolicy to prove and work around the speed
> slowdown until we can schedule a reboot.

> 2.   segmap_percent is set to 12%, and ::memstat indicates that the
> cache never grows beyond 12%.     The document
> http://docs.sun.com/source/817-1836-10/perfplat.html says that:

> "default value is 12; that is, the kernel will reserve enough space to
> map at most 12% of memory for the file system cache"

> key phrase being "at most".  This wasn't my understanding previously.
> Does this gel with anyone else's experience?

Perhaps that's only part of the key; the other might be "reserve". I
gather that controls a special kernel data segment that provides preferred
performance for caching read(), write(), directory contents, etc., which
is created fairly early in system startup and remains a constant size
while the OS is up.

I think you can mmap() a large regular file, and unmap it, and the
pages will stay in memory as long as they aren't needed for something
else, even if they don't use the segmap driver and the segkmap segment
at all.

As to what _always_ goes through segmap, what never goes through it,
and what if anything may optionally go through it, I'm not so clear;
if there's a short, plain English (i.e. not C) answer to that, I
haven't seen it; so just what will and won't benefit, and what the
tradeoff is, I'm not totally sure.

--

Lasik/PRK theme music:
    "In the Hall of the Mountain King", from "Peer Gynt"

 
 
 

mpstat shows very high smtx values on a nearly idle machine, why?

Greetings,

I have an SS20/72 here (two CPUs) running Solaris 2.4 and two SPARC
Storage Arrays under Veritas Volume Manager 2.1 on it (newest
firmware and all).
When only a few users are working on the machine, all performance monitoring
values look idle (vmstat, iostat, vxstat for the arrays, netstat).

Only mpstat shows very high smtx (spins on mutexes) values, between
1000 and 1500 per second. This should be under 200 per second to
be OK, and under 400 per second to be only in the warning region, as
A. Cockcroft's book and tools say.

Can anybody explain why a nearly idle machine shows smtx values this
high?

Thanks, Erik
