Calling semop() after signal interrupts blocking semop() call

Post by Ken Buck » Sat, 06 Feb 1999 04:00:00



I'm experiencing a problem with semaphores when a blocking semop()
call is interrupted by a signal, and subsequent semop()
reserve/release calls are issued before re-issuing the original
(interrupted) semop() call.
It seems that the original semaphore status gets mangled by the
subsequent semop() calls after the waiting semop() call is interrupted
(see below for details).

I wrote a test program to explore variations on this problem;
it behaves the same way on various versions of AIX (including
AIX 4.3.1) and also on a SPARCstation running SunOS 5.5.1.

I'm not sure if I've overlooked some well-known rule of semaphore
programming, or if there's a flaw in the o/s.
Any ideas would be welcomed - further details (including test
program source code) available on request.

-----------------------

scenario descriptions:

general scenario that works Just Fine:
   process A is waiting on semaphore X, inside a semop() call.
   process B sends a signal to process A.
   process A diverts to its signal handler and processes the signal,
             then returns to its mainline code.  the interrupted
             semop() call returns indicating errno=EINTR, so process A
             reissues the semop() call and resumes waiting on the
             semaphore.
   eventually, process B releases the semaphore and process A returns
             successfully from the blocking semop() call.

modified scenario that Doesn't Work Fine:
   if process A is inside its signal handler and issues semop() calls
             to reserve and then release semaphore Y (a different
             semaphore than the one it was waiting on when it got
             interrupted), there is a problem.  after process A
             returns from its signal handler, notes the EINTR status
             from the interrupted semop() call, and reissues the
             semop() call, this time it gets the semaphore
             immediately, even though nobody has released it yet.
             The semaphore values indicated by semctl(GETVAL, etc.)
             at various points in the code don't indicate an obvious
             problem.

sigsetjmp/siglongjmp variation - doesn't work either:
   POSIX standards seem to suggest it may be unwise to issue
             semop() calls from within a signal handler (although it's
             not clear to me if the restriction matters if the semop()
             calls only involve _some other semaphore_ than the
             one we were waiting on).  in any case, a modified version
             of the test program was created using
             sigsetjmp/siglongjmp calls so that the additional semop()
             calls to be issued as a result of the signal are not
             performed inside the signal handler, but rather are
             performed after the signal handler returns to the
             mainline code (via siglongjmp), before re-issuing the
             interrupted semop() call.
             this does NOT change the behavior - process A still gets
             the semaphore immediately after re-issuing the
             interrupted semop() call.
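
The scenarios above can be sketched with a few helper routines. This is
only an illustrative sketch, not the original test program: the names
sem_set, sem_reserve, sem_release, sem_snapshot and union semarg are
hypothetical, and it assumes that "reserve" means a semop() with
sem_op = -1 and "release" means a semop() with sem_op = +1 on a SysV
semaphore set.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical name. Many systems expect the caller to declare the
   argument union for semctl(); a private name avoids clashing with
   systems whose headers already declare "union semun". */
union semarg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Set one semaphore of the set to a known value. */
int sem_set(int semid, int semnum, int value)
{
    union semarg arg;
    arg.val = value;
    return semctl(semid, semnum, SETVAL, arg);
}

/* Apply a single operation to one semaphore.
   Returns 0 on success, -1 with errno set (EINTR if a signal arrived). */
static int sem_change(int semid, unsigned short semnum, short delta)
{
    struct sembuf op;
    op.sem_num = semnum;
    op.sem_op = delta;
    op.sem_flg = 0;            /* blocking, no SEM_UNDO */
    return semop(semid, &op, 1);
}

/* Reserve: decrement by 1, blocking while the value is 0. */
int sem_reserve(int semid, unsigned short semnum)
{
    return sem_change(semid, semnum, -1);
}

/* Release: increment by 1, never blocks. */
int sem_release(int semid, unsigned short semnum)
{
    return sem_change(semid, semnum, 1);
}

/* Log the value, the number of waiters and the pid of the last
   semop() caller for one semaphore; returns the value. */
int sem_snapshot(const char *label, int semid, int semnum)
{
    int val = semctl(semid, semnum, GETVAL);
    int ncnt = semctl(semid, semnum, GETNCNT);
    int pid = semctl(semid, semnum, GETPID);
    fprintf(stderr, "%s: sem %d val=%d waiters=%d lastpid=%d\n",
            label, semnum, val, ncnt, pid);
    return val;
}
```

Calling sem_snapshot() before and after each semop() call, in both
processes, is one way to check whether the value of semaphore X really
changes between the EINTR return and the re-issued call.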

--
Ken Buck

 
 
 

Post by Raimo Kangassal » Sat, 06 Feb 1999 04:00:00


I observed similar behavior on AIX in a program that had a signal
handler for SIGCHLD; I had to do away with the signal handler and
meet a deadline with a quick and dirty fix.

I would very much like to have your source code in order to continue
troubleshooting the original solution.

My real e-mail address is it1 dot raimok at memo dot volvo dot se





 
 
 

Post by Derek Viljoen » Sat, 06 Feb 1999 04:00:00



> > I'm experiencing a problem with semaphores when a blocking semop()
> > call is interrupted by a signal, and subsequent semop()
> > reserve/release calls are issued before re-issuing the original
> > (interrupted) semop() call.

You'll always get this kind of behavior when you mix signals with
lightweight IPCs (i.e., condition variables, semaphores, etc.).  I
believe that the Solaris (2.5) man pages on condition variables explain
spurious wakeups due to signals at great length.  They (at least in the
cond_wait case) always recommend retesting the condition after waking
up.  I don't know what to say about semaphores specifically.  Have you
checked the man pages?
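
The "retest the condition after waking up" advice can be sketched as
follows. This is a hedged illustration using POSIX pthread_cond_wait()
rather than the Solaris cond_wait() call; the names lock, ready_cv,
ready, wait_until_ready and mark_ready are hypothetical, and it says
nothing about how SysV semaphores behave.

```c
#include <pthread.h>

/* Hypothetical shared state: a flag protected by a mutex. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;

/* Block until the flag is set. The while loop retests the flag after
   every wakeup, so a spurious wakeup simply goes back to waiting. */
void wait_until_ready(void)
{
    pthread_mutex_lock(&lock);
    while (!ready)
        pthread_cond_wait(&ready_cv, &lock);
    pthread_mutex_unlock(&lock);
}

/* Set the flag under the mutex and wake every waiter. */
void mark_ready(void)
{
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_broadcast(&ready_cv);
    pthread_mutex_unlock(&lock);
}
```

The waiter never trusts the wakeup itself, only the flag it rechecks
while holding the mutex.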

Derek Viljoen

 
 
 

Post by Raimo Kangassal » Sat, 06 Feb 1999 04:00:00




> You'll always get this kind of behavior when you mix signals with
> lightweight IPCs (i.e., condition variables, semaphores, etc.).  I
> believe that the Solaris (2.5) man pages on condition variables explain
> spurious wakeups due to signals at great length.  They (at least in the
> cond_wait case) always recommend retesting the condition after waking
> up.  I don't know what to say about semaphores specifically.  Have you
> checked the man pages?

I browsed the manual pages (on AIX) that I thought were relevant, but
never the ones on Solaris; I will check them.

Thanks for the pointer!


 
 
 

Post by Bob Rubendunst » Sat, 06 Feb 1999 04:00:00



> I'm experiencing a problem with semaphores when a blocking semop()
> call is interrupted by a signal, and subsequent semop()
> reserve/release calls are issued before re-issuing the original
> (interrupted) semop() call.

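#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* wrap_semop retries a semop() call for as long as it fails with
   EINTR, i.e. it was interrupted by a caught signal (for example
   SIGINT from the user pressing control-C).  Any other result,
   success or failure, is returned to the caller with errno intact. */
int wrap_semop(int sid, struct sembuf *s, int nop)
{
        int res;

        while ((res = semop(sid, s, nop)) == -1)
                if (errno != EINTR)
                        break;
        return res;
}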

This works for me.
--
-Bob Rubendunst
Soft Machines
Autolog Communications software for AIX, AMOS, DOS, Windows, and SCO
Unix
Phone: (217) 351-7199   Fax: (217) 351-2629
http://www.softm.com
 
 
 

Post by Andrew Giert » Sun, 07 Feb 1999 04:00:00


 Derek> You'll always get this kind of behavior when you mix signals with
 Derek> lightweight IPC's (ie - condition variables, semaphores, etc.).

He's talking about SysV semaphores, which should behave correctly
whether signals are taken or not.

pthread condition variables are a whole other kettle of fish.

--
Andrew.

comp.unix.programmer FAQ: see <URL: http://www.erlenstar.demon.co.uk/unix/>
                           or <URL: http://www.whitefang.com/unix/>

 
 
 

Post by Andrew Giert » Sun, 07 Feb 1999 04:00:00


 Ken> I'm experiencing a problem with semaphores when a blocking
 Ken> semop() call is interrupted by a signal, and subsequent semop()
 Ken> reserve/release calls are issued before re-issuing the original
 Ken> (interrupted) semop() call.  It seems that the original
 Ken> semaphore status gets mangled by the subsequent semop() calls
 Ken> after the waiting semop() call is interrupted (see below for
 Ken> details).

 Ken> I'm not sure if I've overlooked some well-known rule of semaphore
 Ken> programming, or if there's a flaw in the o/s.

I'd suspect a fault in the code, rather than one in the O/S :-)

A suggestion:

What happens if you use SA_RESTART when catching signals? Does the
problem persist?

--
Andrew.
