Striped But Lopsided Disk Utilization.....

Post by Richard E Sgrignoli » Thu, 25 Apr 2002 04:37:51



I have a question about what may be causing some disks to be used more
heavily than others even though striping is in effect.  To be more
specific, let me provide the following config/facts:

        Solaris 2.8
        Veritas Volume Manager 3.2
        Veritas File System 3.4
        SAN Environment (Hitachi 7700E)
        Volume of 200Gb Size Consisting of Six (6) Subdisks of 33.6Gb Each

We have taken six (6) subdisks of 33.6Gb each named as follows:

        datadg01        =       c0t1d0
        datadg02        =       c0t2d0
        datadg03        =       c0t3d0
        datadg04        =       c0t5d0
        datadg05        =       c0t6d0
        datadg06        =       c0t7d0

Three (3) of the subdisks are on ONE controller (c0) while the
remaining three (3) are on a SECOND controller (c1).....

These are consolidated into a STRIPED volume of 200Gb.....with 6
columns of 64k units.....bsize=8192,logsize=128,largefiles.....
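
For reference, a volume with that layout would typically be built with
something on the order of the following (a sketch; the disk group name
"datadg" and volume name "vol01" are assumptions, not the real names):

        # Six-column stripe, 64k stripe unit, across the six named disks.
        vxassist -g datadg make vol01 200g layout=stripe ncol=6 stripeunit=64k \
            datadg01 datadg02 datadg03 datadg04 datadg05 datadg06
        # VxFS with the parameters quoted above.
        mkfs -F vxfs -o bsize=8192,logsize=128,largefiles /dev/vx/rdsk/datadg/vol01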

When driving activity (e.g., find/cpio) against this volume/filesystem
and monitoring it with "iostat":

        iostat -xnp 5 | grep "c[01]t[123567]d0$"
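
A variant that keeps iostat's header row, so the columns stay labeled
(a sketch; this uses the corrected c0/c1 device names that appear later
in the thread):

        iostat -xn 5 | egrep 'device$|c0t[123]d0$|c1t[567]d0$'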

I expect ALL six disks to be EQUALLY utilized.....

However.....what we are ACTUALLY seeing is datadg01 having an average
access rate of "80"....datadg02 & 03 having an average access rate of
"60".....and the disks on Controller 1 staying between 5 and 20
percent.

Shouldn't these all be somewhat consistent percentage-wise?

What are some things to look at to determine why they differ?

Thanks.

Richard

 
 
 

Striped But Lopsided Disk Utilization.....

Post by Darren Dunham » Thu, 25 Apr 2002 05:09:20



Quote:> I have a question about what may be causing some disks to be used
> more heavily than others even though striping is in effect.  To be more
> specific, let me provide the following config/facts:
>    Solaris 2.8
>    Veritas Volume Manager 3.2
>    Veritas File System 3.4
>    SAN Environment (Hitachi 7700E)
>    Volume of 200Gb Size Consisting of Six (6) Subdisks of 33.6Gb Each
> We have taken six (6) subdisks of 33.6Gb each named as follows:
>    datadg01        =       c0t1d0
>    datadg02        =       c0t2d0
>    datadg03        =       c0t3d0
>    datadg04        =       c0t5d0
>    datadg05        =       c0t6d0
>    datadg06        =       c0t7d0
> Three (3) of the subdisks are on ONE controller (c0) while the
> remaining three (3) are on a SECOND controller (c1).....

But they all appear in your list as c0.   Why does that happen?  Are you
using DMP or anything like that?  
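
One quick way to check for multiple paths (a sketch; assumes the stock
VxVM 3.x DMP utilities are installed):

        vxdmpadm listctlr all
        vxdmpadm getsubpaths ctlr=c0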

Quote:> These are consolidated into a STRIPED volume of 200Gb.....with 6
> columns of 64k units.....bsize=8192,logsize=128,largefiles.....
> When driving activity (e.g., find/cpio) against this volume/filesystem
> and monitoring it with "iostat":
>    iostat -xnp 5 | grep "c[01]t[123567]d0$"
> I expect ALL six disks to be EQUALLY utilized.....
> However.....what we are ACTUALLY seeing is datadg01 having an average
> access rate of "80"....datadg02 & 03 having an average access rate of
> "60".....and the disks on Controller 1 staying between 5 and 20
> percent.

'rate'?  Which field are you talking about?  Are you writing or reading?

Quote:> Shouldn't these all be somewhat consistent percentage-wise?

Maybe.  It depends on what you're looking at and what you're doing.

How does 'vxstat -p -i 5 -g <dg> volume' appear during the same tests?
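
A few related vxstat views can help localize it (a sketch; "datadg" and
"vol01" are assumed names, so substitute your disk group and volume):

        vxstat -g datadg -i 5 -p vol01    # per-plex
        vxstat -g datadg -i 5 -s vol01    # per-subdisk (one line per stripe column)
        vxstat -g datadg -i 5 -d          # per-disk-media record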

--

Unix System Administrator                    Taos - The SysAdmin Company
Got some Dr Pepper?                           San Francisco, CA bay area
          < How are you gentlemen!! Take off every '.SIG'!! >

 
 
 

Striped But Lopsided Disk Utilization.....

Post by Richard E Sgrignoli » Thu, 25 Apr 2002 21:50:12



> > We have taken six (6) subdisks of 33.6Gb each named as follows:

> >       datadg01        =       c0t1d0
> >       datadg02        =       c0t2d0
> >       datadg03        =       c0t3d0
> >       datadg04        =       c0t5d0
> >       datadg05        =       c0t6d0
> >       datadg06        =       c0t7d0
> But they all appear in your list as c0.   Why does that happen?  Are you
> using DMP or anything like that?  

My error.....I just did a copy/paste from the first line.....yes, you
are right.....they should read as follows.....
Quote:> >       datadg01        =       c0t1d0
> >       datadg02        =       c0t2d0
> >       datadg03        =       c0t3d0
> >       datadg04        =       c1t5d0
> >       datadg05        =       c1t6d0
> >       datadg06        =       c1t7d0

 
 
 

Striped But Lopsided Disk Utilization.....

Post by Darren Dunham » Fri, 26 Apr 2002 00:18:29



Quote:> My error.....I just did a copy/paste from the first line.....yes, you
> are right.....they should read as follows.....
>> >   datadg01        =       c0t1d0
>> >   datadg02        =       c0t2d0
>> >   datadg03        =       c0t3d0
>> >   datadg04        =       c1t5d0
>> >   datadg05        =       c1t6d0
>> >   datadg06        =       c1t7d0

Okay, now post a section of the iostat output.

--

Unix System Administrator                    Taos - The SysAdmin Company
Got some Dr Pepper?                           San Francisco, CA bay area
          < How are you gentlemen!! Take off every '.SIG'!! >

 
 
 

Striped But Lopsided Disk Utilization.....

Post by Richard E Sgrignoli » Sat, 27 Apr 2002 04:23:40


I did an iostat with the following options:

     iostat -xnptc 5 | grep "c[01]t[123567]d0$"

which resulted in the following (one group per 5-second interval).  I
started with every device at all zeroes.....then I ran a job which
copied data from one filesystem (/export/home/test) into this one
(/volsas/test).  As the data was being copied, there is a clear
indication that the "c0" disks are being used much more heavily than
the "c1" disks, even though the volume is striped across all six.....
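
For reference, the header row that the grep strips out of the iostat
-xn output reads:

    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device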

    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t5d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t7d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t6d0

    0.0    1.0    0.0    8.3  0.0  0.0    0.0   15.9   0   2 c0t3d0
    0.4    1.0    3.2   32.1  0.0  0.0    0.0   24.4   0   3 c0t2d0
    0.0    2.4    0.0   49.0  0.0  0.1    0.0   43.2   0   7 c0t1d0
    0.0    7.0    0.0   22.5  0.0  0.0    0.0    4.9   0   3 c1t5d0
    0.4    3.8    3.2   56.2  0.0  0.0    0.0    4.4   0   2 c1t7d0
    0.2    2.0    1.6   45.0  0.0  0.0    0.0    5.4   0   1 c1t6d0

    0.0   10.6    0.0  119.9  0.0  0.2    0.0   19.7   0  13 c0t3d0
    0.0    8.0    0.0   94.4  0.0  0.1    0.0   13.4   0   8 c0t2d0
    0.0    7.4    0.0   79.2  0.0  0.3    0.0   37.6   0  20 c0t1d0
    0.8   11.0    6.4  110.4  0.0  0.1    0.0    6.3   0   5 c1t5d0
    1.0    2.8    8.0   49.2  0.0  0.0    0.0    7.0   0   2 c1t7d0
    0.0    3.2    0.0   46.4  0.0  0.0    0.0    4.1   0   1 c1t6d0

    1.0   13.0    8.0  150.5  0.0  0.2    0.0   16.7   0  15 c0t3d0
    0.0   10.4    0.0  161.7  0.0  0.2    0.0   20.7   0  13 c0t2d0
    0.0   13.6    0.0  178.2  0.0  0.7    0.0   52.7   0  33 c0t1d0
    0.8   12.0    6.4  158.5  0.0  0.1    0.0    5.5   0   5 c1t5d0
    0.0   19.0    0.0  221.3  0.0  0.1    0.0    5.3   0   7 c1t7d0
    0.0   15.4    0.0  203.3  0.0  0.1    0.0    5.9   0   7 c1t6d0

    0.6   29.8    4.8  982.1  0.5  4.6   16.7  150.4  10  58 c0t3d0
    0.0   29.6    0.0  988.4  0.5  4.5   16.4  153.2  10  58 c0t2d0
    0.0   26.6    0.0  924.0  0.6  5.9   23.8  220.6  13  73 c0t1d0
    0.0   35.6    0.0 1126.0  0.0  0.2    0.0    6.9   0  16 c1t5d0
    0.0   34.2    0.0 1110.0  0.0  0.2    0.0    6.7   0  15 c1t7d0
    0.0   35.4    0.0 1115.6  0.0  0.2    0.0    7.0   0  16 c1t6d0

    0.0   40.8    0.0 1306.6  3.5 10.9   85.7  267.0  52  95 c0t3d0
    0.0   39.6    0.0 1307.5  2.5 10.5   63.2  264.6  41  94 c0t2d0
    0.0   37.4    0.0 1233.8 17.9 15.0  478.0  401.0 100 100 c0t1d0
    0.0   36.8    0.0 1213.0  0.0  0.3    0.0    7.7   0  14 c1t5d0
    0.6   39.4    4.8 1285.1  0.0  0.3    0.0    8.6   0  15 c1t7d0
    0.0   37.8    0.0 1239.5  0.0  0.3    0.0    7.6   0  15 c1t6d0

    0.0   31.2    0.0  996.2  3.9  7.6  124.2  243.5  33  69 c0t3d0
    0.0   32.2    0.0  987.3  4.0  7.7  124.6  238.8  34  72 c0t2d0
    0.2   35.0    1.6 1081.7  8.5 10.8  240.4  307.3  51  89 c0t1d0
    0.0   35.2    0.0 1123.3  0.0  0.3    0.0    7.7   0  14 c1t5d0
    0.4   33.4    3.2 1056.8  0.0  0.3    0.0    7.9   0  14 c1t7d0
    0.0   36.0    0.0 1107.3  0.0  0.4    0.0   10.0   0  15 c1t6d0

    0.0   39.2    0.0 1412.8 15.8 11.8  404.3  301.1  69  84 c0t3d0
    0.0   39.8    0.0 1417.6 17.6 12.1  443.1  303.6  74  86 c0t2d0
    0.0   41.2    0.0 1447.1 33.5 14.4  814.1  350.3  95 100 c0t1d0
    0.0   32.2    0.0 1238.4  0.3  0.6    9.0   18.4   2  14 c1t5d0
    0.0   32.8    0.0 1232.9  0.3  0.6    8.9   19.2   3  14 c1t7d0
    0.0   33.2    0.0 1238.5  0.4  0.7   10.6   19.6   3  15 c1t6d0

    0.0    6.8    0.0  169.6  0.0  0.5    0.0   69.8   0   9 c0t3d0
    0.0    5.4    0.0  113.6  0.0  0.3    0.0   64.2   0   9 c0t2d0
    0.0    4.8    0.0  120.9  0.0  0.4    0.0   88.4   0  13 c0t1d0
    0.0    5.0    0.0  128.1  0.0  0.0    0.0    6.5   0   2 c1t5d0
    0.6    6.8    4.8  153.6  0.0  0.1    0.0    6.9   0   3 c1t7d0
    1.2    5.6    9.6  126.4  0.0  0.0    0.0    4.8   0   2 c1t6d0

    0.0   17.2    0.0  283.2  0.0  0.5    0.0   28.4   0  23 c0t3d0
    0.8   15.8    6.4  288.0  0.0  0.5    0.0   31.9   0  25 c0t2d0
    0.0   15.4    0.0  278.4  0.0  1.1    0.0   71.5   0  34 c0t1d0
    0.0   15.6    0.0  289.5  0.0  0.1    0.0    5.3   0   5 c1t5d0
    0.0   15.0    0.0  278.4  0.0  0.1    0.0    5.9   0   5 c1t7d0
    0.4   18.2    3.2  302.0  0.0  0.1    0.0    5.7   0   6 c1t6d0

    0.0   23.4    0.0  611.2  0.5  2.4   19.5  101.9   6  34 c0t3d0
    0.8   24.0    6.4  609.6  0.3  2.2   11.2   89.3   5  36 c0t2d0
    0.0   21.6    0.0  585.6  0.2  3.5   11.1  160.6   6  48 c0t1d0
    0.8   23.0    6.4  583.7  0.0  0.2    0.0    6.3   0   9 c1t5d0
    0.0   23.8    0.0  616.0  0.0  0.2    0.0    6.5   0   9 c1t7d0
    0.0   25.0    0.0  595.6  0.0  0.2    0.0    6.7   0  10 c1t6d0

    0.0   29.2    0.0  738.9  0.2  2.7    6.7   94.2   6  47 c0t3d0
    0.0   25.6    0.0  713.3  0.3  2.7   12.2  105.0   7  43 c0t2d0
    0.0   27.4    0.0  687.8  0.7  4.1   27.3  151.5  11  63 c0t1d0
    0.8   33.8    6.4  841.6  0.0  0.3    0.0    8.9   0  14 c1t5d0
    0.0   32.6    0.0  823.7  0.0  0.3    0.0    8.6   0  13 c1t7d0
    0.4   36.4    3.2  824.6  0.0  0.3    0.0    9.0   0  16 c1t6d0

    0.2   21.9    1.6  567.5  0.1  3.3    4.5  150.3   3  40 c0t3d0
    0.0   20.7    0.0  572.3  0.0  2.7    2.2  128.4   4  35 c0t2d0
    0.0   19.5    0.0  573.9  0.1  3.5    6.1  179.0   8  45 c0t1d0
    0.0   19.7    0.0  516.5  0.0  0.2    0.0    9.1   0   7 c1t5d0
    0.0   17.5    0.0  467.1  0.0  0.2    0.0   10.0   0   7 c1t7d0
    1.2   17.7    9.6  498.0  0.0  0.2    0.0    9.0   0   8 c1t6d0

    1.2   24.9    9.6  636.2  2.3  3.6   86.6  136.7  16  48 c0t3d0
    0.0   25.5    0.0  645.8  1.6  3.0   61.3  118.1  10  45 c0t2d0
    0.0   24.1    0.0  666.7  2.1  4.6   85.7  188.9  18  63 c0t1d0
    0.0   23.9    0.0  639.4  0.0  0.4    0.7   18.2   1   9 c1t5d0
    0.0   23.1    0.0  645.8  0.0  0.3    0.0   12.7   0   9 c1t7d0
    0.0   25.1    0.0  668.0  0.0  0.4    0.0   16.7   0  10 c1t6d0

Darren Dunham <ddun...@redwood.taos.com> wrote in message <news:91Ax8.887$T71.356074644@newssvr21.news.prodigy.com>...
> Richard E Sgrignoli <Richard.Sgrign...@highmark.com> wrote:
> > My error.....I just did a copy/paste from the first line.....yes, you
> > are right.....they should read as follows.....

> >> >      datadg01        =       c0t1d0
> >> >      datadg02        =       c0t2d0
> >> >      datadg03        =       c0t3d0
> >> >      datadg04        =       c1t5d0
> >> >      datadg05        =       c1t6d0
> >> >      datadg06        =       c1t7d0

> Okay, now post a section of the iostat output.

 
 
 

Striped But Lopsided Disk Utilization.....

Post by Darren Dunham » Sat, 27 Apr 2002 04:59:11



Quote:> I did an iostat with the following options:
>      iostat -xnptc 5 | grep "c[01]t[123567]d0$"
>     0.0   17.2    0.0  283.2  0.0  0.5    0.0   28.4   0  23 c0t3d0
>     0.8   15.8    6.4  288.0  0.0  0.5    0.0   31.9   0  25 c0t2d0
>     0.0   15.4    0.0  278.4  0.0  1.1    0.0   71.5   0  34 c0t1d0
>     0.0   15.6    0.0  289.5  0.0  0.1    0.0    5.3   0   5 c1t5d0
>     0.0   15.0    0.0  278.4  0.0  0.1    0.0    5.9   0   5 c1t7d0
>     0.4   18.2    3.2  302.0  0.0  0.1    0.0    5.7   0   6 c1t6d0

Okay, I've seen this sort of thing in several different ways.  Normally
it's two halves of a mirror, but you're saying this is just one big
stripe.

When I've seen this in mirrors, the "busy" half will often flip-flop.
One time it's one side, the next time it's the other...

My theory is that Veritas is scheduling lots of writes faster than these
devices (either the disks or the buses) can keep up.  So eventually
either the cache or the controller pauses until the scheduled writes
complete.

This will never happen *exactly* at the same time everywhere in the
stripe.  Some device has to be the first to fill.  Probably it's
something to do with the c0 controller, the bus it's on, or any array
attached.  That one fills first, so Veritas has to wait for it to
finish.

As soon as it's done, Veritas schedules writes to all the disks again.
Since the other side had a fraction of a second of non-busy time, it's
able to accept the request immediately, while the "busy" side must again
queue it up and wait.  The I/O system sees this as one side taking an
egregiously long time to complete the request, while the other side
looks instantaneous.

I think the devices are pretty close, but writing to them like this
drives one component to be the bottleneck.  The other one could have
*almost* the same performance, but since it's not the bottleneck, it
looks much better.
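
One way to watch whether the busy side ever flip-flops here would be to
total %b per controller each interval (a sketch; it assumes %b is the
10th field of iostat -xn output and that c1t6d0 prints last in each
interval, as it does in the captures above):

    iostat -xn 5 | awk '/c0t[123]d0$/ { c0 += $10 }
                        /c1t[57]d0$/  { c1 += $10 }
                        /c1t6d0$/     { c1 += $10;
                                        print "c0 %b:", c0, "  c1 %b:", c1;
                                        c0 = c1 = 0 }'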

Quote:>     0.0   23.4    0.0  611.2  0.5  2.4   19.5  101.9   6  34 c0t3d0
>     0.8   24.0    6.4  609.6  0.3  2.2   11.2   89.3   5  36 c0t2d0
>     0.0   21.6    0.0  585.6  0.2  3.5   11.1  160.6   6  48 c0t1d0
>     0.8   23.0    6.4  583.7  0.0  0.2    0.0    6.3   0   9 c1t5d0
>     0.0   23.8    0.0  616.0  0.0  0.2    0.0    6.5   0   9 c1t7d0
>     0.0   25.0    0.0  595.6  0.0  0.2    0.0    6.7   0  10 c1t6d0

If possible, I'd do a test on the disks by themselves.  You'd need to
have some blocks available for testing outside the filesystem, but then
just do 'dd' writes to the disk.  You should be able to completely
saturate it with something like this:

dd if=/dev/zero of=/dev/rdsk/c0t3d0sx bs=64k

See if the max throughput on all the disks is similar, or if there is
something going on.  When you're scheduling all the writes to them in
sequence (as happens in the stripe), it's difficult to tell whether
there's a problem with one of them or not.
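
A loop form of that test (a sketch; slice s4 is an assumption, and dd
will overwrite whatever is there, so point it only at scratch space):

    # Write 1 GB to each spindle in turn and time it.
    # Slice s4 is assumed free -- pick one that really is, this destroys it.
    for d in c0t1d0 c0t2d0 c0t3d0 c1t5d0 c1t6d0 c1t7d0; do
        echo "=== $d ==="
        timex dd if=/dev/zero of=/dev/rdsk/${d}s4 bs=64k count=16384
    done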

--

Unix System Administrator                    Taos - The SysAdmin Company
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

1. Moving a Solstice Disk Suite-striped disk from one news server to another

I've got a news server that is running Solaris 2.5.1 and Solstice Disk
Suite 4.0 with various patches to both.  Most of the spool is on a
3-disk striped metadevice.  (Yes, I have heard that Veritas is better,
yada, yada.)

I would like to move this spool disk to another machine without having
to save to tape, metainit, newfs, and restore from tape.  How can I do
this?  I can't guarantee that on the new machine, the block devices
will have the same names, major and minor device numbers, and so
forth.

I'm guessing that I can move the 'set md:mddb_bootlist1' line in
/etc/system to the new system, and modify various files (including "do
not hand edit" files) in /etc/opt/SUNWmd, but then again mddb.cf has a
per-line checksum that I'm not sure how to compute.

Is there an easy way to do this that I have missed?

Also, is there a way to do this if I upgrade to Solaris 2.6 and SDS
4.1?

--

NASA/MSFC Flight Data Systems Branch

2. kickstart with 2 identical network cards? or kickstart hacking?

3. Solaris - stripe at O/S or stripe at EMC?

4. Extended ASCII codes with UNIX

5. Veritas striped-mirror (Striped Pro) question

6. RH 7.0 install problem - hang

7. MBus disk-read utilization and raw driver

8. Linux Compressed FileSystem (again)

9. aix disk utilization of oracle databases

10. How to get CPU, Memory, Disk Utilization on HP-UX 11.x

11. 4.3.2 upgrade to 4.3.3 iostat shows high disk utilization

12. disk utilization

13. Disk utilization