RS6000 almost stops with heavy disk access.


Post by Joao Luis Silva Dam » Thu, 19 Nov 1992 08:58:47



Hi,
we have two RS/6000s (a 320H and a 340) used by different kinds of users.
Some of them run large molecular calculations that create HUGE (600-700 MB)
temporary files for the molecular integrals. When their program starts
writing these huge files and then reading them back as needed, our machines
almost stop and other users are nearly unable to use the system. The
slowness means 30-60 sec to log on, 1 minute for X Windows to create a new
window, a long time for even an ls, etc. Most other users do a lot of
editing, so they really need better response time from the machine.
Is there any way to correct this situation (besides killing those huge
calculations, of course)?
Thanks for your answers,
                      Joao Damas
 
 
 


Post by Konrad Haeden » Thu, 19 Nov 1992 16:38:06




Run those processes at a fixed priority of 126 or slightly smaller;
with dynamic priorities and your situation, the niceness scale is
inadequate.
--
Konrad Haedener                             Phone: +41 31 65 42 25
Institute for Physical Chemistry            FAX:   +41 31 65 39 94
University of Berne


 
 
 


Post by Rudy E. Chukr » Fri, 20 Nov 1992 16:01:06




Lower the priority of the cruncher programs with nice or renice.

Look into using the I/O write queue length limitation feature. There is
a panel in SMIT to do this. However, describing how to do it here is quite
difficult. I'd suggest trying some nonzero values to see if that helps.
I can't give you more info since I'm not in front of my 6000 right now.

Add more memory.

Add faster disks or more disk arms and rearrange the files on those drives.
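
The first suggestion above (starting the number crunchers at a lower CPU
priority) can be sketched in Python on any POSIX system. This is only an
illustration: run_niced is a hypothetical helper name, not an AIX tool, and
a higher nice value only lowers CPU scheduling priority. It does not by
itself throttle disk I/O.

```python
import os
import subprocess
import sys


def run_niced(cmd, increment=10):
    """Start cmd as a child whose nice value is raised by `increment`.

    run_niced is a hypothetical helper for illustration. os.nice() runs in
    the child just before exec, so the parent's priority is unchanged.
    """
    return subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        text=True,
        preexec_fn=lambda: os.nice(increment),
    )


if __name__ == "__main__":
    # The child simply reports its own nice value.
    probe = [sys.executable, "-c",
             "import os; print(os.getpriority(os.PRIO_PROCESS, 0))"]
    out, _ = run_niced(probe).communicate()
    print("child niceness:", out.strip())
```

An already running job can be adjusted the same way with renice from the
shell, or with os.setpriority() from Python.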

 
 
 


Post by Martin Schue » Fri, 20 Nov 1992 18:02:37




1) Consider installing NQS with the fixed-priority patch
   (obtainable e.g. from this site).
2) If these large molecular calculations are Gaussian 9x Hartree-Fock
   ab initio calculations, use the direct SCF method. This reduces
   disk access considerably and is faster on the rs6k.

Hope that helps...
--
Martin Schuetz                              Phone: +41 31 65 42 40
Institute for Physical Chemistry            FAX:   +41 31 65 44 99
University of Berne

 
 
 


Post by J.Ro » Sat, 21 Nov 1992 03:44:12



>   Lower the priority of the cruncher programs with nice or renice.

Unlikely to help very much. I/O won't be affected, and the crunchers will
just slurp up any spare CPU time. You're competing for I/O, not CPU.

>   Look into using the I/O write queue length limitation feature. There is
>   a panel in SMIT to do this. However, describing how to do it here is quite
>   difficult.  I'd suggest playing with some nonzero values and see if that
>   helps.

This sounds very interesting. Roughly how do I get to this panel?
I can't see it on my 3.1.5 system and I certainly would like to!
Alternatively, what's the command name?

John Rowe
Dept Physics
Exeter University
UK

 
 
 


Post by L. Scott Emmo » Sat, 21 Nov 1992 04:20:57




Try turning on I/O pacing, which is accessed via the "smit system"
fastpath; choose the "characteristics of the operating system" option.
If you ask for help in the fields, it will give some information. We
use 17 for the high-water mark and 5 for the low-water mark (IBM's
suggested "starting-point" values). According to IBM, values for the
high-water mark should follow the formula '4*(num pages)+1'. The
"AIX Tuning Guide" (IBM Publication SC23-2365-01) contains further
information about I/O pacing, which is basically used to prevent large
I/O operations from monopolizing the system's I/O facilities. Overall
system throughput goes down when using this feature; it is used (in
general) where you want to favor interactive users over long I/O-bound
jobs.
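
To make the numbers above concrete, here is a rough Python sketch. It
assumes the formula means 4*n+1 for a positive integer n, so n=4 gives 17.
It also uses a deliberately simplified model of pacing: a writer is put to
sleep once its pending writes reach the high-water mark, and woken again
when they have drained to the low-water mark. The names high_water_mark and
simulate_pacing are made up for illustration; this is not AIX code.

```python
def high_water_mark(n):
    """Return a high-water mark of the form 4*n + 1 (n a positive integer)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 4 * n + 1


def simulate_pacing(events, high, low):
    """Toy model of I/O pacing for a single writer.

    events: iterable of "w" (writer tries to queue one write) or
            "c" (one pending write completes).
    Returns (pending_writes, writer_suspended, blocked_attempts).
    """
    pending = 0
    suspended = False
    blocked = 0
    for ev in events:
        if ev == "w":
            if suspended:
                blocked += 1          # writer is asleep; nothing is queued
            else:
                pending += 1
                suspended = pending >= high
        elif ev == "c":
            if pending > 0:
                pending -= 1
            if suspended and pending <= low:
                suspended = False     # drained to the low-water mark
        else:
            raise ValueError("unknown event: %r" % (ev,))
    return pending, suspended, blocked


if __name__ == "__main__":
    print(high_water_mark(4))                   # 17
    print(simulate_pacing("w" * 20, 17, 5))     # (17, True, 3)
```

In this model the heavy writer can never have more than 17 writes queued,
which is what leaves room in the disk queue for the interactive users.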

                           L. Scott Emmons
                      CableData  Research Center
                     csusac.csus.edu!cdsac!scotte
                                KC6NFP

 
 
 


Post by Adrian Wo » Thu, 26 Nov 1992 10:23:47


I was under the impression that the kernel *raises* the priority of
processes with I/O pending because they are waiting for hardware
interrupts, so the "nice" value has little or no effect on I/O-bound
processes.

Our remedy for heavily I/O-bound quantum chemical jobs is to have dedicated
workstations (320s) that run just that one process (with no logins or
competing processes). This gives much better throughput than running
several processes simultaneously. All interactive work is done on the
faster machines, which also run "lightweight" simulations etc.

--
-----------------------------------------------------------------------------

University of Sydney NSW 2006 Australia        061-2-692 4137

 
 
 

1. heavy Disk I/O and system stops reacting for seconds

Hi there,

I think someone else has noticed this too, but anyway, here are my
experiences.

I've tested 2.4.19rc[1|2|3], the AC tree, AA tree, jam tree and mjc tree.
All of them show the same behaviour. If I do some disk I/O, e.g.:

tar xzpf linux-2.4.18.tar.gz; rm -rf linux-2.4.18

the system stops reacting for more than 5 seconds while the file is being
untarred/gunzipped. Nothing but the mouse reacts. This does NOT occur with
2.4.18 and early 2.4.19-pre's ...
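
One rough way to put a number on a stall like this is to time small
synchronous writes while the tar command runs in another terminal. The
Python sketch below is only an illustration of that idea.
worst_write_latency is a hypothetical helper name, and the sample count and
pause are arbitrary.

```python
import os
import tempfile
import time


def worst_write_latency(samples=20, pause=0.05, directory=None):
    """Return the slowest observed time (seconds) for a 4 KB write + fsync.

    Hypothetical probe for illustration. `directory` selects where the
    temporary file is created (default: the system temp directory).
    """
    worst = 0.0
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(samples):
            start = time.perf_counter()
            f.write(b"x" * 4096)
            f.flush()
            os.fsync(f.fileno())      # force the data out to disk
            worst = max(worst, time.perf_counter() - start)
            time.sleep(pause)
    return worst


if __name__ == "__main__":
    print("worst write+fsync latency: %.3f s" % worst_write_latency())
```

Comparing the figure on an idle system with the figure during the untar
gives a repeatable measurement to go with the bug report.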

System is a Celeron 800MHz, 256 MB RAM, EIDE UDMA100 Intel BX440 running ext3
filesystem.

If you need more information, tell me what and I'll provide it.
If this has already been fixed by someone, please tell me :-)

Please CC me, I am not subscribed!

--
Kind regards
        Marc-Christian Petersen

http://sourceforge.net/projects/wolk

PGP/GnuPG Key: 1024D/408B2D54947750EC
Fingerprint: 8602 69E0 A9C2 A509 8661 2B0B 408B 2D54 9477 50EC
Key available at www.keyserver.net. Encrypted e-mail preferred.
