SIGIOT/waitpid()/wait3() Question

Post by S.A. Rush SYS » Sat, 20 Mar 1993 00:10:01



  I'm afraid this is a little long, but here goes ...

  I have been writing a farmer program that reads job submissions from
users and spreads them over a number of DECstations for execution.
Each DECstation may execute up to 2 jobs, other jobs being stored in a
queue waiting for a machine to become free. This is all handled by a
queue manager process which runs on a central machine while a farmer
process runs on each of the other machines.

  The farmer process receives jobs from the central manager, forks a
child to execute the job and performs a wait() to wait for the child
process to terminate.
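
  A minimal sketch of that fork-and-wait step, for illustration only.
The helper name run_job() and the use of /bin/sh to start the job are
assumptions; the post does not show how the farmer actually launches
its children.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the farmer program itself:
 * run one job through /bin/sh and block until it has finished.
 * Returns the job's exit code, or -1 on failure or death by signal. */
int run_job(const char *command)
{
    int status;
    pid_t done;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0) {
        /* Child: become the job. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    /* Parent: wait() reports any child, so loop until ours shows up. */
    while ((done = wait(&status)) != pid) {
        if (done < 0 && errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

  Calling run_job("exit 3") from a test program should hand back the
child's exit code once the child has terminated.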

  One extra facility is that user jobs may be sent to sleep (using
killpg(pid, SIGSTOP)) if higher-priority jobs are submitted; they are
awakened (killpg(pid, SIGCONT)) when the high-priority job completes.
Executing jobs may also be killed (killpg(pid, SIGKILL)) if necessary.
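
  The three killpg() calls can be sketched as below. Passing the
child's pid to killpg() only works if the child leads its own process
group, so the sketch assumes the job calls setpgid(0, 0) after the
fork. The post does not say how this is arranged, and all function
names here are hypothetical. The idle pause() loop stands in for a
real job.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes killpg() */
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start an idle child in its own process group.
 * Returns the child's pid, which is then also its process group id. */
pid_t start_job_group(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        setpgid(0, 0);          /* child becomes a process group leader */
        for (;;)
            pause();            /* stand-in for the real job */
    }
    if (pid > 0)
        setpgid(pid, pid);      /* set it from the parent too, so a
                                   signal sent right away cannot race
                                   with the child's own setpgid() */
    return pid;
}

/* Thin wrappers around the calls named in the post. */
int suspend_job(pid_t pgid) { return killpg(pgid, SIGSTOP); }
int resume_job(pid_t pgid)  { return killpg(pgid, SIGCONT); }
int kill_job(pid_t pgid)    { return killpg(pgid, SIGKILL); }
```

  Signalling the whole group rather than a single pid means that any
processes the job itself forks are stopped, continued and killed along
with it, as long as they stay in the same group.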

  Now for the problem ...

  Originally, the program used a waitpid(pid, &status, WUNTRACED) call
to wait for the child process to terminate, as this allows detection of
processes being sent to sleep. This seemed to work correctly: user
processes could start, stop, be sent to sleep and be killed, and
everything seemed hunky-dory.
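
  For reference, a sketch of how a status returned under WUNTRACED can
be told apart. The enum and function names are hypothetical. The point
is only that a stopped child is tested for with WIFSTOPPED() before the
status is treated as a termination.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical classification of what waitpid() reported. */
typedef enum { JOB_EXITED, JOB_KILLED, JOB_STOPPED, JOB_ERROR } job_event;

/* Wait for one state change of `pid`. On return *detail holds the exit
 * code (JOB_EXITED) or the signal number (JOB_KILLED, JOB_STOPPED). */
job_event wait_job_event(pid_t pid, int *detail)
{
    int status;
    pid_t done;

    do {
        done = waitpid(pid, &status, WUNTRACED);
    } while (done < 0 && errno == EINTR);   /* retry if interrupted */

    if (done != pid)
        return JOB_ERROR;
    if (WIFSTOPPED(status)) {               /* asleep, not finished */
        *detail = WSTOPSIG(status);
        return JOB_STOPPED;
    }
    if (WIFSIGNALED(status)) {              /* terminated by a signal */
        *detail = WTERMSIG(status);
        return JOB_KILLED;
    }
    if (WIFEXITED(status)) {                /* normal termination */
        *detail = WEXITSTATUS(status);
        return JOB_EXITED;
    }
    return JOB_ERROR;
}
```

  After a JOB_STOPPED result the caller would normally call the
function again later, since the child still exists and will produce a
further status when it finally terminates.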

  I am now adding some accounting to the program so that it records
the usage of each user/group.

  This seemed a simple change: replace the waitpid() call with a
wait3() call and add a quick check that the pid returned is indeed the
expected one. This appears to work correctly for programs which
terminate normally or which are killed, but not for those which are
sent to sleep. Such jobs happily go to sleep, but when they are
awakened again they terminate with a SIGIOT signal.
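
  A sketch of what the wait3() variant might look like, under stated
assumptions: the post does not show the real call or its options, so
passing WUNTRACED, the name account_job() and the /bin/sh launch are
all guesses made for illustration. It is not offered as a diagnosis of
the SIGIOT, only as one way to keep "stopped" and "terminated" apart
while collecting a struct rusage.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes wait3() */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical accounting wrapper: run `command`, store the CPU time
 * it used in *cpu_seconds, and return its exit code (-1 on failure or
 * death by signal). */
int account_job(const char *command, double *cpu_seconds)
{
    int status;
    struct rusage usage;
    pid_t done;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                     /* exec failed */
    }

    for (;;) {
        done = wait3(&status, WUNTRACED, &usage);
        if (done < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: try again */
            return -1;
        }
        if (done != pid)
            continue;                   /* not the expected child */
        if (WIFSTOPPED(status))
            continue;                   /* asleep, not finished yet */
        break;                          /* exited or killed */
    }

    /* User plus system CPU time of the terminated child. */
    *cpu_seconds = (double)(usage.ru_utime.tv_sec + usage.ru_stime.tv_sec)
                 + (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / 1e6;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

  Unlike waitpid(), wait3() cannot be told which child to wait for, so
the pid check has to be done by the caller, as the loop above does.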

  Can anyone explain what this is and why it is generated? I assume
that I could always catch this signal and ignore it, but I am curious
as to why it is generated and what it means! I have tried looking in
the manual pages/reference books but they simply say "I/O Trap" which
does not seem very helpful.
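
  One thing that can be checked locally: on many Unix systems SIGIOT
is signal 6 and shares that number with SIGABRT, the signal raised by
abort(). Whether that holds on the DECstations in question is an
assumption, and the function names below are hypothetical. The sketch
reports what the local <signal.h> defines and which signal abort()
delivers to a child.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if this system gives SIGIOT the same number as SIGABRT,
 * 0 if the numbers differ, -1 if SIGIOT is not defined at all. */
int sigiot_matches_sigabrt(void)
{
#ifdef SIGIOT
    return SIGIOT == SIGABRT;
#else
    return -1;
#endif
}

/* Fork a child that calls abort() and return the number of the signal
 * that terminated it, or -1 if that could not be determined. */
int signal_from_abort(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        abort();                        /* child dies here */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

  If both functions agree with the description above on the machine
concerned, a job reported as killed by SIGIOT may simply have called
abort(), for example through a failed assertion.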

   Does anybody have any suggestions?

        Thanks very much

              Steve R
--
---------------------------------------------------------------------
S.A.Rush                           School of Information Systems,

Tel: (0603) 56161 ext 2308         Norwich NR4 7TJ, England.

 
 
 

1. workaround? no wait3(2) or waitpid(2) -- argh!

I have an ancient Unix machine I'm trying to port modern software
to.  The system does not have either wait3(2) or waitpid(2) functions.
These appear difficult to steal from other systems' C library
source, especially since the system I am working with is System V
Release 1 -- no ANSI, no POSIX.

Hopeless?  Kludgeable?

All I have to work with is wait(2):

     NAME
          wait - wait for child process to stop or terminate

     SYNOPSIS
          int wait (stat_loc)
          int *stat_loc;

          int wait ((int *)0)

     DESCRIPTION
          Wait suspends the calling process until it receives a signal
          that is to be caught (see signal(2)), or until any one of
          the calling process's child processes stops in a trace mode
          (see ptrace(2)) or terminates.  If a child process stopped
          or terminated prior to the call on wait, return is
          immediate.
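
One possible kludge, sketched under assumptions: a blocking
waitpid(pid, &status, 0) for a single positive pid can be approximated
with plain wait() by stashing the statuses of any other children that
happen to be reaped first. The name waitpid_compat() and the fixed-size
table are hypothetical. The sketch is written in ANSI C for
readability, so on a pre-ANSI compiler the definitions would need
K&R-style parameter lists, and pid_t would probably have to become int.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_PENDING 32          /* arbitrary limit for this sketch */

/* Children reaped by wait() while we were looking for someone else. */
static struct { pid_t pid; int status; } pending[MAX_PENDING];
static int npending = 0;

/* Block until child `pid` has terminated; store its status in *status
 * (if non-null). Returns pid, or -1 if wait() fails. */
pid_t waitpid_compat(pid_t pid, int *status)
{
    int i, st;
    pid_t done;

    /* Was this child already reaped on an earlier call? */
    for (i = 0; i < npending; i++) {
        if (pending[i].pid == pid) {
            if (status)
                *status = pending[i].status;
            pending[i] = pending[--npending];   /* remove the entry */
            return pid;
        }
    }

    for (;;) {
        done = wait(&st);
        if (done == -1)
            return -1;          /* no children left, or interrupted */
        if (done == pid) {
            if (status)
                *status = st;
            return pid;
        }
        /* Some other child: remember it for a later call. If the
         * table is full its status is lost. */
        if (npending < MAX_PENDING) {
            pending[npending].pid = done;
            pending[npending].status = st;
            npending++;
        }
    }
}
```

This covers neither WNOHANG nor WUNTRACED, and it gives no resource
usage, so it is a partial stand-in at best.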

--
Paul Southworth

2. Trident 9440

3. question about waitpid

4. Shell Help

5. What is SIGIOT (signal 6)?

6. Where is alphalinux (aka linuxalpha)

7. waitpid question

8. Q: 1152x900x16bpp with Diamond Stealth 64 DRAM (2Mb)?

9. SIGIOT - What is it exactly?

10. wait(), waitpid question?

11. UNP Book Question - waitpid in SIGCHLD signal handler

12. configure: error: I give up -- neither wait nor wait3 works properly

13. wait3 on Solaris2.4