Can Linux do this? (Track I/O per process)

Post by Morten Lang » Thu, 26 Sep 2002 01:07:23



Hi,

I am a fairly devoted Linux user, but I have found that some things,
even in the server realm, are possible in Windows but not in Linux.
At least, I have not found out how to perform one particular
piece of monitoring on my Linux servers that is very easy to do on
NT 5 / w2k servers:

Using the w2k "Task Manager" and choosing to display the columns
for I/O read and I/O write, counted in bytes, I can see
exactly which process is causing an I/O bottleneck.
I have seen quite a few times that I/O is the bottleneck on certain
Linux servers, but I am not able to pinpoint the culprit.

I have tried sar (which comes with sysstat) and looked at the
/proc/$PID/status info, but to no avail.

I am sorry, but Redmond/M$ scores a point here.

(As far as I know, I need the glance tool, at extra cost, on HP-UX
to be able to see this.)

Regards,
Morten

 
 
 

Post by rapska » Thu, 26 Sep 2002 02:56:55


I don't think this is so much a lack on Linux's part as a difference in
how Windows and Linux do things.

The 'top' utility will give you almost realtime stats on every running
process.  Graphical frontends like gtop provide easy-to-read, concise, and
detailed information about memory and vmem usage in a GUI format.  There
is also 'gkrellm', a highly configurable, handy little monitoring
applet that docks itself on your screen.  It can provide an "at-a-glance"
summary of the state of many aspects of your system, including disk I/O and
CPU usage.

For determining whether there is truly a disk I/O bottleneck on your 'nix
box, the 'vmstat' and 'procinfo' tools are very handy.  To get near
realtime values, you can run:

    $ procinfo -fdn1

This will display up-to-the-second info on reads/writes to various devices
so you can determine whether there is a bottleneck somewhere.

Also, you can run vmstat so it will print a one line total at specified
intervals like so:

   $ vmstat <num. of seconds>

This will give you a scrolling columnized report that you can use to
determine whether there are I/O and/or other issues.
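
A minimal sketch of how that scrolling vmstat report could be scanned for
signs of I/O trouble. It assumes the column names that vmstat prints in
its header row ("b" for processes blocked in uninterruptible sleep, "bi"
and "bo" for blocks read in and written out); the function names and the
threshold are hypothetical, chosen only for illustration.

```python
def parse_vmstat(text):
    """Parse `vmstat <seconds>` output into a list of dicts, one per sample row."""
    columns = None
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name header row; vmstat repeats it periodically.
            columns = fields
            continue
        # Keep only rows that are purely numeric and match the header width.
        if columns and len(fields) == len(columns) and all(f.isdigit() for f in fields):
            samples.append(dict(zip(columns, map(int, fields))))
    return samples


def io_suspects(samples, min_blocked=1):
    """Return the samples in which at least `min_blocked` processes were blocked.

    `min_blocked` is an illustrative threshold, not a recommended value.
    """
    return [s for s in samples if s.get("b", 0) >= min_blocked]
```

Feeding it the captured output of something like "vmstat 5" shows which
intervals had processes stuck waiting, but it still says nothing about
which process was responsible.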

Once you have determined that there is indeed a bottleneck somewhere, you
can use the top utility I mentioned earlier to attempt to locate the culprit.
Alternatively, you could use "ps -eo <format options>" to display a
custom-formatted listing of process info to help you determine the source
of the problem more easily.  See 'man ps' for more info on how this is done.
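
As a rough sketch of the custom ps listing, assuming a procps-style ps
that accepts "ps -eo" with the format specifiers pid, stat, pcpu and
comm. The helper names are hypothetical. LC_ALL=C is forced so that
percentages use a decimal point rather than a locale-specific comma.

```python
import os
import subprocess

PS_FORMAT = "pid,stat,pcpu,comm"  # assumed set of columns, adjust to taste


def parse_ps(text):
    """Parse `ps -eo pid,stat,pcpu,comm` output into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)  # comm may contain spaces, so stop after 3 splits
        if len(parts) < 4:
            continue
        pid, stat, pcpu, comm = parts
        rows.append(
            {"pid": int(pid), "stat": stat, "pcpu": float(pcpu), "comm": comm.strip()}
        )
    return rows


def busiest(rows, n=5):
    """Return the n rows with the highest CPU percentage."""
    return sorted(rows, key=lambda r: r["pcpu"], reverse=True)[:n]


def run_ps():
    """Run ps with the custom format and return the parsed rows."""
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["ps", "-eo", PS_FORMAT],
        capture_output=True, text=True, check=True, env=env,
    )
    return parse_ps(result.stdout)
```

Calling busiest(run_ps()) gives a short list to start from; note that it
ranks by CPU, so a process stalled on disk may well sit near the bottom.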

Finally, and this is something that just has M$ beat completely, you can
throw all of these together in a looped script, formatted exactly how you
want, and set it to run in an xterm window.  This way you always have easy
access to all of the information that you want, exactly how you want it.
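
One possible shape for such a looped script, sketched under the
assumption that each tool's output has already been reduced to a
dictionary of cumulative counters. The sampler callable and the function
name are hypothetical stand-ins, not part of any of the tools above.

```python
import time


def poll(sampler, interval=1.0, count=3, sleep=time.sleep):
    """Call sampler() repeatedly and yield the change in each counter per interval.

    `sampler` is any callable returning a dict of cumulative counters.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    previous = sampler()
    for _ in range(count):
        sleep(interval)
        current = sampler()
        # Report how much each counter moved since the previous snapshot.
        yield {key: current[key] - previous.get(key, 0) for key in current}
        previous = current
```

Printing each yielded dictionary in whatever layout you prefer, inside
an xterm, gives the always-visible display described above.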

Your post demonstrates just how Windows makes its users so dependent on
the specific tools and utilities that are available for it.  So when
coming over to 'nix, they waste their time looking for the exact replica
of app X instead of reproducing the *functionality* of app X with the
myriad tools and utilities already provided.

I know this, because I still do it on occasion.

;-)

--
rapskat -   8:15pm  up 1 day,  3:48,  1 user,  load average: 0.09, 0.18, 0.09
69 processes: 66 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:  3.1% user,  1.1% system,  0.0% nice,  1.2% idle

You mean you didn't *know* she was off making lots of little phone companies?

 
 
 

Post by kosh » Thu, 26 Sep 2002 19:43:51



Run top and look at the process status column. A process showing D is in
uninterruptible sleep, which usually means it is waiting on I/O at that
point in time. That is what I have often done, at least, and I know there
are other ways to do it as well, using vmstat.
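
The same check can be sketched without top by reading the state field
from /proc/<pid>/stat, assuming a Linux /proc filesystem. The function
names are hypothetical. Like top, this only catches processes that
happen to be in the D state at the instant of the scan.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so use the first '(' and the last ')' as its boundaries.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen])
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible():
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited mid-scan or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Running uninterruptible() in a loop and counting how often each PID
shows up gives a crude picture of which processes spend the most time
blocked.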
 
 
 

Post by Nico Coetze » Thu, 26 Sep 2002 19:44:32


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Then use RRDTOOL to build a nice little database to store 2 years of
data (5-minute intervals). Total size required: less than 1 MB, web
pages and images included.


- --
Nico Coetzee
GPG Public Key   : http://www.itfirms.co.za/gpg.abuse.key
GPG Finger Print : F56C 4672 4340 3361 A10F  299A 6549 1771 9340 352B
</end>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9kfYAZUkXcZNANSsRAklkAKDkDJIxmeuvYZwQZLIeEam1eqqtJQCeNpvr
o3plqFxdFx88rPW7AByiBvg=
=xNl+
-----END PGP SIGNATURE-----

 
 
 

Post by Tuomo Takkul » Sat, 28 Sep 2002 21:21:35



> I am sorry but Redmond/M$ scores a point here.

Because you didn't do your research very well.

The interface used to be the rusage (1) system call, which gives you
all information that you need for a process, and then some. One of its
uses is for recording usage of user processor cycles in order to send
proper bills for computing time.

IIRC rusage is now superseded by the proc (1) interface, and things
might still look a bit different on different unices, but not as much
as they used to.
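
For reference, a minimal sketch of querying resource usage from Python,
whose standard resource module wraps the getrusage call. Two caveats are
assumptions worth stating: the call reports only on the calling process
(or its waited-for children), not on an arbitrary PID, and the block
counters count block operations rather than bytes and may read as zero
on older kernels. The function name and dictionary keys are
illustrative.

```python
import resource


def io_usage(who=resource.RUSAGE_SELF):
    """Return CPU time and block I/O counters for the calling process."""
    usage = resource.getrusage(who)
    return {
        "user_cpu_s": usage.ru_utime,     # seconds spent in user mode
        "system_cpu_s": usage.ru_stime,   # seconds spent in kernel mode
        "block_reads": usage.ru_inblock,  # block input operations
        "block_writes": usage.ru_oublock, # block output operations
    }
```

Passing resource.RUSAGE_CHILDREN instead reports the totals for child
processes that have terminated and been waited for.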

If you don't want to write something at that level yourself (it's
not that difficult), the straightforward interface is ps and
top. The former especially should be able to give you reports about
anything and everything. How much of that is visible with the GUI frontends
is not so clear.

> ( As far as I know I need the glance tool at an extra cost in HP-UX
> to be able  to see this )

... or code something with Tcl/Tk around ps in an afternoon...

        Cheers
        Tuomo

--
___
   "Microsoft OS's are good because they encourage Intel to produce
    faster CPUs for the rest of us to run Unix on."
                                                         George Dau

 
 
 

Post by Mike » Sat, 28 Sep 2002 23:45:19




   Date: 27 Sep 2002 21:21:35 +0200
Partial reproduction follows:

> The interface used to be the rusage (1) system call, [...]

Since when were system calls in section 1? ;P
--
Mike.   Remove "-spam" to mail me.  Better yet, don't mail me. ;-)
 
 
 

Post by Morten Lang » Tue, 01 Oct 2002 18:34:31


Hi, and thanks for your replies.

I have tried procinfo and top (looking for processes in the D state),
but I can't find any info about where the bottleneck is.
It thus seems that this is not an I/O problem after all.

One symptom of the presence of a bottleneck on my server
is that it almost invariably takes 7 seconds before top displays its first page.
Could it be that the kernel must be tuned to use
4 GB of memory efficiently (as in Red Hat Advanced Server)?

I guess this thread should be moved to another newsgroup (?).
I set followup to comp.os.linux.misc.

Below you see some "screenshots" from procinfo and top.

- Morten





#1 SMP Tue Feb 26 06:25:36 EST 2002                   2 CPU
Memory:      Total        Used        Free      Shared     Buffers      Cached
Mem:             0           0           0           0           0           4
Swap:            0           0           0

Bootup: Tue Sep 10 22:26:18 2002    Load average: 0.18 0.22 0.21 2/274 4691

user  :       0:00:00.64  31.7%  page in :        0  disk 1:        0r       0w
nice  :       0:00:00.00   0.0%  page out:        0  disk 2:        0r       0w
system:       0:00:00.11   5.4%  swap in :        0  disk 3:        0r       0w
idle  :       0:00:01.27  62.8%  swap out:        0
uptime:  19d 16:36:01.09         context :     68926

irq  0:       101 timer                 irq 14:         0 ide0                
irq  1:         0 keyboard              irq 22:      1700 eth0                
irq  2:         0 cascade [4]           irq 24:         0 sym53c8xx            
irq  6:         0                       irq 25:         0 sym53c8xx            
irq  8:         0 rtc                   irq 33:         0 0            none  u
irq 12:         0 PS/2 Mouse          

===============================
#### procinfo (totals )



#1 SMP Tue Feb 26 06:25:36 EST 2002        2 CPU

Memory:      Total        Used        Free      Shared     Buffers      Cached
Mem:       4040612     3508896      531716          64      757688     1867616
Swap:      2097128           0     2097128

Bootup: Tue Sep 10 22:26:19 2002    Load average: 0.02 0.09 0.14 1/272 4999

user  :   1d  9:54:05.39   3.6%  page in : 11956985  disk 1:        1r       0w
nice  :       0:01:08.92   0.0%  page out: 15519310  disk 2:   914081r 2043672w
system:      23:04:46.96   2.4%  swap in :        1  disk 3:     3066r       0w
idle  :  37d  1:46:56.75  94.0%  swap out:        0
uptime:  19d 17:23:28.98         context :1507659201

irq  0: 170420901 timer                 irq 14:        28 ide0                
irq  1:        83 keyboard              irq 22: 383546902 eth0                
irq  2:         0 cascade [4]           irq 24:   2945991 sym53c8xx            
irq  6:         4                       irq 25:        30 sym53c8xx            
irq  8:         1 rtc                   irq 33:         0 0            none  u
irq 12:        42 PS/2 Mouse          

===============================

top -b -n 1 |   head -20  

  2:59pm  up 19 days, 16:33,  5 users,  load average: 0,24, 0,28, 0,23
271 processes: 270 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:  5,0% user, 43,2% system,  0,0% nice, 51,1% idle
CPU1 states:  4,1% user, 29,1% system,  0,0% nice, 66,0% idle
Mem:  4040612K av, 3501620K used,  538992K free,      64K shrd,  756868K buff
Swap: 2097128K av,       0K used, 2097128K free                 1862208K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 4649 root      14   0  1252 1248   820 R    72,5  0,0   0:05 top
 1405 k4590k     9   0  704M 704M 40000 S     2,0 17,8  34:41 java
 2217 k4590k     9   0  704M 704M 40000 S     1,7 17,8   0:44 java
 1762 k4590k     9   0  704M 704M 40000 S     1,2 17,8   0:40 java
 2110 k4590k     9   0  704M 704M 40000 S     1,2 17,8   0:49 java
 1653 k4590k     9   0  704M 704M 40000 S     0,5 17,8   0:50 java
 2191 k4590k     9   0  704M 704M 40000 S     0,5 17,8   0:47 java
 1407 k4590k     9   0  704M 704M 40000 S     0,2 17,8   0:01 java
 1411 k4590k     9   0  704M 704M 40000 S     0,2 17,8   2:33 java
 1484 apache     9   0  7104 7104  6148 S     0,2  0,1   0:01 httpd

Well, I actually tried to generate I/O to see if it would appear with a "D"
in top, and it worked (the dd process):


[1] 28496

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
28496 root      16   0   104   40    24 D     6,1  0,0   0:03 dd

- Morten

 
 
 
