batch jobs and shells on KSR1

Post by Simon Gibs » Fri, 21 Mar 1997 04:00:00



I am running batch jobs under NQS on a KSR1 by launching a shell and
running my program:

myscript:

#! /usr/bin/csh
cd ~myhomedirectory
myprog

The job runs okay, with 'ps' reporting the following:

       F STAT   UID   PID  PPID PRI  NI  RSS WCHAN  .. COMMAND
80208001 S     8275  6633  6629 -13   0   0K sigsus .. csh myscript
80008001 R     8275  6635  6633  -7   0   0K -      .. myprog

Sometimes, I find that the shell hangs as my program exits, meaning that
the batch queue cannot continue to run any more jobs:

       F STAT   UID   PID  PPID PRI  NI  RSS WCHAN  .. COMMAND
80208001 I     8275  6633  6629 -13   0   0K sigsus .. csh myscript
80008101   <   8275  6635  6633 -25  -1   0K -      .. <defunct>

Since I am not very experienced with unix programming, I need some help to
understand what's going on. As I understand it at the moment, the shell is
failing to catch the SIGCHLD signal that is generated when my program
exits. Is this correct? And if so, is there anything I can do to ensure
that the shell exits and allows the next batch job to run?
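
One workaround I could try, assuming nothing needs to run after myprog, is
to have the shell exec the program. The program then replaces the shell
rather than running as its child, so no parent shell is left waiting on it.
In the csh script above that would mean changing the last line to
'exec myprog'. I have not confirmed that this cures the hang under NQS.
Below is a minimal POSIX sh sketch of the same idea; run_job is a
hypothetical helper name, and the subshell only stands in for the batch
shell so the sketch can be tried interactively:

```shell
#!/bin/sh
# Sketch only: run a command via exec so that it replaces the shell
# instead of becoming a child the shell must wait for.
# run_job is a hypothetical helper, not part of NQS or of the original script.

run_job() {
    # The parentheses start a subshell that plays the role of the batch
    # shell; exec replaces only that subshell, not the calling shell.
    (
        cd "$1" || exit 1   # first argument: directory to run in
        shift               # remaining arguments: the program and its args
        exec "$@"           # the program takes over the subshell's process
    )
}
```

For example, 'run_job "$HOME" myprog' would correspond to the script in
this post, with the exit status of myprog passed straight back to the
caller.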

Any help much appreciated.

Simon

--
------------------------------------------------
Simon Gibson

tel +161 44 275 6141
fax +161 44 275 6236
http://www.cs.man.ac.uk/aig/students/gibson.html
------------------------------------------------

 
 
 
