IO Time wait issues and Tivoli.

Post by Dan Cav » Tue, 27 Jun 2006 18:21:20



Hi folks,

We have a collection of servers (cranky old 280Rs) with 2 GB of RAM each.
Sometimes the IO/CPU wait figures go over 20 or so, and our Tivoli
monitoring element, which looks at these figures, is set to raise a
warning when they go over 10.

See:
http://www-1.ibm.com/support/docview.wss?rs=2077&context=SW730&contex...

In your experiences, at what point/figure should one be worried/alerted
when the IO/CPU wait figure gets high?

Thanks in advance.

Dan.

 
 
 

Post by Casper H.S. Di » Tue, 27 Jun 2006 18:31:26



>In your experiences, at what point/figure should one be worried/alerted
>when the IO/CPU wait figure gets high?

I/O wait, you mean?  We've dropped the metric in S10 (Solaris 10) because we felt
it was useless.

So perhaps don't warn on any I/O wait figure at all?

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

Post by Dan Cav » Tue, 27 Jun 2006 19:14:02


Hi Casper,

I should have qualified my last post a little more: we're still in the
dark ages here and are running Solaris 8 (bless its cotton socks) ;)

Since the Apps team upgraded Tivoli to FixPack6, we've had
occasions where some of our servers report "HighCPUWaitTime"
figures over 20. This kicks off Tivoli's warning alerts (which I
usually take with a pinch of salt) for anything over 10, which is the
default; we never saw these messages prior to FP4.

The IBM infodoc associates HighCPUWaitTime with high I/O, which
I understand, but I don't really know what an ideal figure should be.

Anyhow, what I would like to know is at what point you would start
getting concerned about a server's performance, CPUWait vs I/O wait?

I.e., can I change the threshold from 10 (default) to something like 30?

tia.

d.



 
 
 

Post by Casper H.S. Di » Tue, 27 Jun 2006 21:26:19



>In the IBM infodoc, it associates HighCPUWaitTime with High I/O, which
>I understand but don't really know what an idea figure should be.

Except that it does not really mean that. It basically means that,
yes, the system is doing I/O, but also that it has nothing else to do
at the same time, because the CPU is idle.

>Anyhow, what I would like to know is  at what point would you start
>getting concerned with a servers performance CPUWait vs IO wait?

What the hell is "CPU wait"?  There's no such metric in Solaris.

We'd never get concerned about I/O wait; only concerned when I/O
latency is high and some of the devices are at or near max capacity.
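
As a rough illustration of watching per-device latency and utilisation instead of I/O wait, here is a minimal Python sketch. It assumes text in the column layout of Solaris `iostat -xn` (r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t, %w, %b, device). The function name `busy_devices`, both thresholds, and the sample figures are invented for illustration and are not recommendations from this thread.

```python
# Sketch: flag devices whose service time AND utilisation are both high.
# Assumed column layout (Solaris `iostat -xn`):
#   r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
# Thresholds are illustrative assumptions only.

ASVC_T_LIMIT_MS = 30.0   # assumed "high latency" cut-off
BUSY_LIMIT_PCT = 90.0    # assumed "near max capacity" cut-off


def busy_devices(iostat_text, asvc_limit=ASVC_T_LIMIT_MS, busy_limit=BUSY_LIMIT_PCT):
    """Return (device, asvc_t, %b) tuples for devices over both limits."""
    flagged = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # banner or blank line, not a data row
        try:
            values = [float(f) for f in fields[:10]]
        except ValueError:
            continue  # column-header row
        asvc_t, busy = values[7], values[9]
        if asvc_t >= asvc_limit and busy >= busy_limit:
            flagged.append((fields[10], asvc_t, busy))
    return flagged


# Made-up sample output, for illustration only.
SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.2    3.4    9.6   27.2  0.0  0.0    0.0    6.1   0   2 c1t0d0
  210.0   95.0 1680.0  760.0  0.0 12.4    0.0   40.7   0  98 c1t1d0
"""

if __name__ == "__main__":
    for device, asvc_t, busy in busy_devices(SAMPLE):
        print(f"{device}: asvc_t={asvc_t} ms, busy={busy}%")
```

Requiring both conditions is a design choice in this sketch: a device that is busy but still answering quickly, or slow but almost unused, is not flagged.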

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

Post by Dan Cav » Tue, 27 Jun 2006 22:07:59


Casper

thanks for replying.

> What the hell is "CPU wait"?  There's no such metric in Solaris.

apparently, "the percentage of CPU in wait: the maximum percentage of
time that the processor should be waiting on I/O to maintain
satisfactory system performance. A frequently high value indicates that
there could be an I/O bottleneck in the system."

> We'd never get concerned about I/O wait; only concerned when I/O
> latency is high and some of the devices are at or near max capacity.

I think IBM have made this up... a case of blowing smoke up your
trousers and telling you that your butt's on fire... perhaps?
 
 
 

Post by Dextho » Wed, 28 Jun 2006 06:11:18




> We'd never get concerned about I/O wait; only concerned when I/O
> latency is high and some of the devices are at or near max capacity.

In my experience, a box with the default "sd_max_throttle" value and a
decent I/O workload will show high I/O wait. Lowering sd_max_throttle
to 20, or whatever the storage vendor specifies, keeps that percentage
low too.

Is my perception well founded?

-Dexthor.

 
 
 

Post by Casper H.S. Di » Wed, 28 Jun 2006 17:40:45



>In my experience, I've seen that a default value for "sd_max_throttle"
>and a box with decent I/O workload will show high I/O wait. Lowering
>the sd_max_throttle to 20 or something that vendor specifies keeps that
>% low too.
>Was my perception on right grounds ?

But that change may actually have lowered performance and throughput.

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

Post by Dextho » Wed, 28 Jun 2006 23:03:48




> But that change may actually have lowered performance and throughput.

Shouldn't we go by the vendor's recommended specs for the disk queue size?
In my case, working with EMC storage frames, lowering sd_max_throttle
gave me better overall performance and kept my boxes from smoking. The
default, if I remember correctly, is 127/128, which is rather high for a
single disk. That in turn shows up as a single disk with a high %busy
in iostat.

Unless one has a very well laid out stripe set with a large number of
spindles, the default sd_max_throttle will cause issues under heavy I/O
workload.

I got into this during an Oracle DWH I/O performance tuning
exercise. Without direct I/O or raw I/O access to the disk, Solaris was
performing very poorly when simulating asynchronous I/O (with threads).
Under sustained I/O load, we would see a high load build up along with
heavy context switching.

EMC's recommended kernel settings did improve things, and additional
volume relayouts helped us too. Lowering sd_max_throttle was one of
those settings.

This may apply only to my case, but I had good results.

-Dexthor.

 
 
 

Post by Darren Dunha » Thu, 29 Jun 2006 04:55:19



> After having  the Apps team upgrade Tivoli to FixPack6, We've had
> occasions where some of our servers reach "HighCPUWaitTime" wait
> figures over 20..  This kicks off Tivoli's warning alerts (which i
> usually take with a pinch of salt), anything over 10 -which is the
> default as we never saw these messages prior to FP4..

iowait in a vacuum doesn't mean much.  Other things besides I/O can
affect it.  A well-running system may show a large iowait figure, and a
poorly running one can show zero.

> In the IBM infodoc, it associates HighCPUWaitTime with High I/O, which
> I understand but don't really know what an idea figure should be.

There is no ideal figure.  Potentially, you could examine a system under
normal circumstances and use a rise in that figure as an indication that
something changed, but you won't know if that change was good or bad in
all cases.
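
One way to sketch the "compare against what is normal for this machine" idea is shown below in Python. It is only an illustration under stated assumptions: the function name `rose_above_baseline`, the three-sigma rule, the one-point floor on the spread, and all sample figures are invented here. As noted above, a rise only says that something changed, not whether the change is bad.

```python
from statistics import mean, stdev


def rose_above_baseline(baseline, current, sigmas=3.0, min_spread=1.0):
    """Return True when `current` exceeds the baseline mean by more than
    `sigmas` standard deviations.

    `baseline` is a list of iowait percentages sampled while the machine
    was behaving normally. `min_spread` (an assumed floor, in percentage
    points) stops a nearly flat baseline from triggering on tiny wobbles.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    spread = max(stdev(baseline), min_spread)
    return current - mean(baseline) > sigmas * spread


# Invented sample histories, for illustration only.
database_baseline = [18, 22, 25, 20, 19, 24, 21]   # routinely shows iowait
renderfarm_baseline = [0, 0, 1, 0, 0, 0, 1]        # CPU-bound, almost none

if __name__ == "__main__":
    print(rose_above_baseline(database_baseline, 23))    # within its normal range
    print(rose_above_baseline(database_baseline, 45))    # well above its normal range
    print(rose_above_baseline(renderfarm_baseline, 12))  # unusual for this machine
```

With these made-up numbers, a reading of 23 is unremarkable on the database host, while 12 on the render farm is flagged, which a single fixed threshold cannot express.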

Systems where you have few CPU jobs, lots of available CPU, and the CPU
is faster than the I/O will usually show iowait.  Where all those
situations are normal (think a beefy database), then having a raised
iowait would also be normal.
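
To illustrate why this happens, here is a toy accounting model in Python. It assumes, as a simplification, that each clock tick is counted as busy when the CPU is running something, as wait when the CPU is idle while at least one I/O is outstanding, and as idle otherwise. The function name `account` and both workloads are invented for illustration; real kernel accounting is more involved.

```python
def account(ticks, cpu_busy, io_outstanding):
    """Classify each tick as busy, wait, or idle and return percentages.

    `cpu_busy(t)` and `io_outstanding(t)` are callables that say whether
    the CPU is running a thread, and whether any I/O is pending, at tick t.
    """
    counts = {"busy": 0, "wait": 0, "idle": 0}
    for t in range(ticks):
        if cpu_busy(t):
            counts["busy"] += 1      # CPU has work: never counted as wait
        elif io_outstanding(t):
            counts["wait"] += 1      # idle CPU with I/O pending
        else:
            counts["idle"] += 1      # idle CPU, nothing pending
    return {name: 100.0 * n / ticks for name, n in counts.items()}


def io_load(t):
    """Identical I/O pattern for both scenarios: pending for ticks 0..59."""
    return t < 60


if __name__ == "__main__":
    # Few CPU jobs: the CPU is busy one tick in ten.
    print(account(100, lambda t: t % 10 == 0, io_load))
    # Same I/O, but the CPU always has something to run.
    print(account(100, lambda t: True, io_load))
```

In this model the I/O pattern is identical in both runs, yet the reported wait share drops to zero once the CPU always has work, so the figure reflects spare CPU as much as I/O activity.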

> Anyhow, what I would like to know is  at what point would you start
> getting concerned with a servers performance CPUWait vs IO wait?

When it's abnormal for what that machine does.  If my CPU-intensive
renderfarm ever showed i/o wait, I'd be concerned.  If my database
server showed it, I wouldn't be.

> I,e, can I change the threshold from 10 (default) to something like 30?

Yup.

--

Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 
