pipe write() blocks even though select() indicates writable

Post by Rich Gra » Sat, 06 Jan 2001 00:38:38



Howdy folks,

I'm having trouble dealing with passing data through a child filter
process.   My app is sending info to a socket, with an optional filter
sub process.   The write routine either just passes output through to
the socket write or writes it down the pipe to the child and reads the
data from the child and writes it on to the socket.

  output ----------------------------> socket

or
                 wfd     rfd
  output ---------+       +----------> socket
                  |       ^
                  v       |
                 child filter

In the child filter case, the child is fork/exec'd with two pipes
created for the child's stdin & stdout.  All the appropriate dup2()
"plumbing" is performed to redirect the child's stdin & stdout back
to the parent.  (Thanks to the FAQ and posts in this group!)
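
For readers following along, the plumbing described above might look roughly like this. It is a minimal sketch, not the poster's actual code: the helper name `spawn_filter` and its signature are hypothetical, and it assumes fds 0 and 1 are already open in the parent.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the post): start `path` as a filter.
 * On success returns the child's pid, with *wfd connected to the
 * child's stdin and *rfd connected to its stdout.  Returns -1 on error.
 * Assumes fds 0 and 1 are already open in the parent, so the new pipe
 * fds never collide with them. */
pid_t spawn_filter(const char *path, int *wfd, int *rfd)
{
    int to_child[2], from_child[2];
    pid_t pid;

    if (pipe(to_child) == -1)
        return -1;
    if (pipe(from_child) == -1) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(to_child[0]);   close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: make the pipe ends its stdin/stdout, then drop the
         * originals so EOF propagates correctly. */
        if (dup2(to_child[0], STDIN_FILENO) == -1 ||
            dup2(from_child[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(to_child[0]);   close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execl(path, path, (char *)NULL);
        _exit(127);           /* only reached if exec failed */
    }

    /* Parent: keep only its own ends of the two pipes. */
    close(to_child[0]);
    close(from_child[1]);
    *wfd = to_child[1];
    *rfd = from_child[0];
    return pid;
}
```

The parent then writes filter input to `*wfd` and reads filter output from `*rfd`; closing `*wfd` is what signals end of input to the child.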

Because there is no guaranteed relationship between data to the
filter and data from it, deadlock is possible if the child blocks
with a large amount of output that the parent hasn't read and the
parent blocks on a write to the child's input pipe.  I _thought_ I
had taken care of this by having the parent use select() on the
child's fds so as only to read from or write to the child when
possible.  Writes are limited to PIPE_BUF bytes.

But on an  HP-UX A.09.05 A 9000/712 system, I seem to be getting
into this deadlock, despite the use of select().  Debug code
shows that I get FD_ISSET on the input fd to the child, but suspend
on the write.  I'm only writing 448 bytes at a time here, well
under PIPE_BUF, which is 8192 on this system (and under the Posix
512 to boot.)   Am I missing something here??  I thought that if
I only write to a pipe when select() indicates writable and keep
the writes under PIPE_BUF, I would not block on the write... This
seems to work fine on our other systems.
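
The readiness check described above could be sketched as follows. The function name `poll_filter` is hypothetical, and `<sys/select.h>` is the modern POSIX header (older systems such as the HP-UX release mentioned may declare select() elsewhere). Note that it only reports what select() says; it makes no promise about how many bytes a subsequent write() will accept.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: block until the filter's output (rfd) is readable
 * or its input (wfd) is writable.  Returns the number of ready fds, or
 * -1 on error, and reports which side is ready through the out flags. */
int poll_filter(int rfd, int wfd, int *can_read, int *can_write)
{
    fd_set rset, wset;
    int maxfd = rfd > wfd ? rfd : wfd;
    int rc;

    FD_ZERO(&rset);
    FD_ZERO(&wset);
    FD_SET(rfd, &rset);
    FD_SET(wfd, &wset);

    /* NULL timeout: wait indefinitely for either fd. */
    rc = select(maxfd + 1, &rset, &wset, NULL, NULL);
    if (rc == -1)
        return -1;

    *can_read  = FD_ISSET(rfd, &rset) != 0;
    *can_write = FD_ISSET(wfd, &wset) != 0;
    return rc;
}
```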

I replaced the called filter with a wrapper script:

#!/bin/sh
tee /tmp/fin | real_filter_command 2>>/tmp/ferror | tee /tmp/fout

and found that /tmp/fin is always 8192 bytes (hmm, PIPE_BUF!)
long and /tmp/fout is usually 8192 also, although sometimes we get
a few k read back from the child before*.  (The filter is a
3rd party conversion utility which writes a quite large preamble
out before it starts its processing of the input data.)

Perhaps I've done Something Idiotic (TM) here, but sure seems like
select() just lied to me.

Thanks!

Rich

 
 
 

Post by Chuck Dillo » Sat, 06 Jan 2001 05:12:38



> Because there is no guaranteed relationship between data to the
> filter and data from it, deadlock is possible if the child blocks
> with a large amount of output that the parent hasn't read and the
> parent blocks on a write to the child's input pipe.  I _thought_ I
> had taken care of this by having the parent use select() on the
> child's fds so as only to read from or write to the child when
> possible.  Writes are limited to PIPE_BUF bytes.

You are overdriving your filter program.  Select is telling you that
you can write at least 1 byte but not necessarily the number you have
to write.  Also when select says there is nothing to read that is at
a point in time.  That doesn't mean the filter program has processed
everything you have shoved into it yet.  Change your write code to
only write one byte each time select oks a write.

-- ced

--
Chuck Dillon
Senior Software Engineer
Genetics Computer Group, a subsidiary of Pharmacopeia, Inc.

 
 
 

Post by Rich Gra » Sat, 06 Jan 2001 06:22:06




> > Because there is no guaranteed relationship between data to the
> > filter and data from it, deadlock is possible if the child blocks
> > with a large amount of output that the parent hasn't read and the
> > parent blocks on a write to the child's input pipe.  I _thought_ I
> > had taken care of this by having the parent use select() on the
> > child's fds so as only to read from or write to the child when
> > possible.  Writes are limited to PIPE_BUF bytes.

> You are overdriving your filter program.  Select is telling you that
> you can write at least 1 byte but not necessarily the number you have
> to write.  Also when select says there is nothing to read that is at
> a point in time.  That doesn't mean the filter program has processed
> everything you have shoved into it yet.  Change your write code to
> only write one byte each time select oks a write.

> -- ced

> --
> Chuck Dillon
> Senior Software Engineer
> Genetics Computer Group, a subsidiary of Pharmacopeia, Inc.

Yech!  I'm moving way too much data to do it byte at a time...
Is there no way to write() in a manner that will take as much as
it can and return bytes written???  Man pages seem to say no.  If
I set O_NONBLOCK or O_NDELAY, if the write will not fit, NOTHING gets
written.  The only difference is the way the error is returned.
Seems like I'm looking for  O_WRITEWHATYOUCAN or better yet poll/select
with a watermark on writability.   There isn't such a thing, is there?

I can't think of a sane way of dealing with the not-enough-room
case where no bytes get written.  Since I don't know why the child
stalled, there seems to be no deterministic way to delay.  I really don't
want to go to another process or threads (no experience with them and
portability concerns).  Another alternative would be to filter all the
output to a temp file and then send the temp file, but that is
unnecessary overhead too.

Any suggestions?

T H A N K S !!

Rich

 
 
 

Post by Eric Sosma » Sat, 06 Jan 2001 07:02:09



> [having deadlock problems writing to and reading from pipes]

> Yech!  I'm moving way too much data to do it byte at a time...
> Is there no way to write() in a manner that will take as much as
> it can and return bytes written???  Man pages seem to say no.  If
> I set O_NONBLOCK or O_NDELAY, if the write will not fit, NOTHING gets
> written.  The only difference is the way the error is returned.
> Seems like I'm looking for  O_WRITEWHATYOUCAN or better yet poll/select
> with a watermark on writability.   There isn't such a thing, is there?

    Not as far as I know.  Two suggestions, though: First, you
could use asynchronous I/O (man aio_write, et al.) if your system
supports it.  Failing that, you could multi-thread your program
so blockage on the input side won't stop the output, and vice
versa.

--

 
 
 

Post by Chuck Dillo » Sat, 06 Jan 2001 07:58:35



> Yech!  I'm moving way too much data to do it byte at a time...
> Is there no way to write() in a manner that will take as much as
> it can and return bytes written???  Man pages seem to say no.  If
> I set O_NONBLOCK or O_NDELAY, if the write will not fit, NOTHING gets
> written.  The only difference is the way the error is returned.
> Seems like I'm looking for  O_WRITEWHATYOUCAN or better yet poll/select
> with a watermark on writability.   There isn't such a thing, is there?

Sure, if you use both O_NONBLOCK and O_NDELAY in your pipe/open call
your write shouldn't block.  If your write call returns -1 and
errno == EAGAIN you know the pipe has not been adequately emptied.
You can then go try and read some more.  That should work.
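
A minimal sketch of that approach, under some assumptions: since pipe() itself takes no flags, the sketch sets O_NONBLOCK on the existing descriptor with fcntl(), and the helper names `set_nonblocking` and `try_write` are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch an already-open fd to non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: attempt one write on a non-blocking fd.
 * Returns the number of bytes accepted, 0 if the pipe has no room right
 * now (EAGAIN), or -1 on any other error.  len must be greater than 0
 * so that a 0 return is unambiguous. */
ssize_t try_write(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);

    if (n == -1 && errno == EAGAIN)
        return 0;   /* caller should go drain the filter's output */
    return n;
}
```

When `try_write` returns 0, the caller would go back to reading the filter's output before offering the same buffer again.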

An alternative way to do it would be to set a timer before the write
so that the call couldn't block for more than some set time.  That
shouldn't be necessary though.

-- ced

--
Chuck Dillon
Senior Software Engineer
Genetics Computer Group, a subsidiary of Pharmacopeia, Inc.

 
 
 

Post by Rich Gra » Sun, 07 Jan 2001 02:26:47




> > [having deadlock problems writing to and reading from pipes]

> > Yech!  I'm moving way too much data to do it byte at a time...
> > Is there no way to write() in a manner that will take as much as
> > it can and return bytes written???  Man pages seem to say no.  If
> > I set O_NONBLOCK or O_NDELAY, if the write will not fit, NOTHING gets
> > written.  The only difference is the way the error is returned.
> > Seems like I'm looking for  O_WRITEWHATYOUCAN or better yet poll/select
> > with a watermark on writability.   There isn't such a thing, is there?

>     Not as far as I know.  Two suggestions, though: First, you
> could use asynchronous I/O (man aio_write, et al.) if your system
> supports it.  

Alas, it seems not to be portable to the systems I'm developing for.
Very interesting though for modern systems.

> Failing that, you could multi-thread your program
> so blockage on the input side won't stop the output, and vice
> versa.

> --


I don't think I have time to learn pthreads (although I'd love to) and
deal with all the portability issues with creaky OS versions.  It does
seem that my best bet would be to fork off another child to do the
backend function of reading from the pipe and writing to the socket.
I was hoping to avoid creating another child, but aside from that
overhead, the whole ugly business of the select() goes away.

Thanks!

Rich

 
 
 

Post by Rich Gra » Sun, 07 Jan 2001 03:35:33




> > Yech!  I'm moving way too much data to do it byte at a time...
> > Is there no way to write() in a manner that will take as much as
> > it can and return bytes written???  Man pages seem to say no.  If
> > I set O_NONBLOCK or O_NDELAY, if the write will not fit, NOTHING gets
> > written.  The only difference is the way the error is returned.
> > Seems like I'm looking for  O_WRITEWHATYOUCAN or better yet poll/select
> > with a watermark on writability.   There isn't such a thing, is there?

> Sure, if you use both O_NONBLOCK and O_NDELAY in your pipe/open call
> your write shouldn't block.  If your write call returns -1 and
> errno == EAGAIN you know the pipe has not been adequately emptied.
> You can then go try and read some more.  That should work.

The filter may be blocking the parent's write because it is stalled
waiting for the parent to take its output.  OR it may be just slow on
taking the data from the parent, in which case all the parent needs
to do is wait to write more.  The problem is, I don't see any
deterministic way to figure when I can write my next buffer.

> An alternative way to do it would be to set a timer before the write
> so that the call couldn't block for more than some set time.  That
> shouldn't be necessary though.

Hmm, maybe I can either alarm() blocking writes or if I get an EAGAIN
from a non-blocking write, just ignore writeability for the next
second or so....   I'm effectively polling the write pipe by doing
this, but it certainly is a quick kludge to slap in.

The only way out of this hokeyness appears to be another thread/process.
I'm surprised by the lack of a write_what_you_can() functionality.

> -- ced

> --
> Chuck Dillon
> Senior Software Engineer
> Genetics Computer Group, a subsidiary of Pharmacopeia, Inc.

Thanks!

Rich

 
 
 

Post by Rich Gra » Sun, 07 Jan 2001 04:38:53


Oh, I see part of the problem here.  PIPE_BUF in this case is
NOT my friend!  Because of the requirement that writes of PIPE_BUF
or less to a pipe are guaranteed to be atomic, a non-blocking write
must either immediately succeed completely or write nothing.  But,
once PIPE_BUF is exceeded, a non-blocking write will write whatever it
can.  So, paradoxically, _increasing_ the buffer size above PIPE_BUF
avoids the problem of getting nothing written when select says the pipe
is writable!  
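
A small probe along these lines can show the difference on a given system. This is only a sketch under stated assumptions: the names `probe_pipe` and `struct probe_result` are hypothetical, and the 4 MiB request is assumed to be larger than the pipe's capacity. It makes one oversized non-blocking write to an empty pipe, then one 1-byte write to the now full pipe.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BIG_LEN (4u * 1024u * 1024u)  /* assumed to exceed pipe capacity */

struct probe_result {
    ssize_t big;          /* result of one write() of BIG_LEN bytes     */
    ssize_t small;        /* result of a following 1-byte write()       */
    int     small_errno;  /* errno after the 1-byte write               */
};

/* Hypothetical probe: returns 0 and fills *out, or -1 on setup failure. */
int probe_pipe(struct probe_result *out)
{
    static char big[BIG_LEN];
    int p[2], flags;

    if (pipe(p) == -1)
        return -1;
    flags = fcntl(p[1], F_GETFL, 0);
    if (flags == -1 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* Larger than PIPE_BUF: a partial write is allowed, so the pipe
     * takes what fits and reports the count. */
    out->big = write(p[1], big, sizeof big);

    /* At most PIPE_BUF: must be all or nothing, and the pipe is full. */
    errno = 0;
    out->small = write(p[1], "x", 1);
    out->small_errno = errno;

    close(p[0]);
    close(p[1]);
    return 0;
}
```

On a system that follows the atomicity rule described above, `big` should come back as a positive count below BIG_LEN and `small` as -1 with EAGAIN.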

This still leaves potential deadlocks on the final write though.

Rich

 
 
 

Post by David Schwart » Sun, 07 Jan 2001 04:49:35



> This still leaves potential deadlocks on the final write though.

        The best you can do is do the write non-blocking, and if it fails,
retry after a reasonable delay. Perhaps include the pipe in every other
call to select or some such.

        DS

 
 
 

Post by Rich Gra » Sun, 07 Jan 2001 05:14:46




> > This still leaves potential deadlocks on the final write though.

It is worse than just the final write.  Any partial write will open
the possibility of the remaining buffer size falling below PIPE_BUF.

>         The best you can do is do the write non-blocking, and if it fails,
> retry after a reasonable delay. Perhaps include the pipe in every other
> call to select or some such.

>         DS

Right.  I'm thinking in terms of selecting on only the readfd with a
one second or so timeout after a write fail.
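
That back-off step could be sketched as below; the helper name `wait_readable` is hypothetical. After a failed non-blocking write, the write fd is left out of the select() call entirely, so the call either returns because the filter produced output or because the timeout ran out.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait up to `secs` seconds for rfd to become
 * readable.  Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int rfd, long secs)
{
    fd_set rset;
    struct timeval tv;
    int rc;

    FD_ZERO(&rset);
    FD_SET(rfd, &rset);
    tv.tv_sec  = secs;
    tv.tv_usec = 0;

    /* Write set is NULL on purpose: writability is ignored here. */
    rc = select(rfd + 1, &rset, NULL, NULL, &tv);
    if (rc == -1)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

On a return of 1 the parent reads from the filter and then retries the write; on 0 it just retries the write, which amounts to polling at the chosen interval.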

Rich

 
 
 

Post by David Schwart » Sun, 07 Jan 2001 05:54:44



> It is worse than just the final write.  Any partial write will open
> the possibility of the remaining buffer size falling below PIPE_BUF.

        Unfortunately, my news server is sick right now, so I missed the
original post. But most UNIXes I know won't tell you that you can write
on a pipe without blocking unless at least PIPE_BUF characters can be
written.

        DS

 
 
 

Post by Rich Gra » Sun, 07 Jan 2001 07:21:53




> > It is worse than just the final write.  Any partial write will open
> > the possibility of the remaining buffer size falling below PIPE_BUF.

>         Unfortunately, my news server is sick right now, so I missed the
> original post. But most UNIXes I know won't tell you that you can write
> on a pipe without blocking unless at least PIPE_BUF characters can be
> written.

>         DS

Well, then we have come full circle.  The behavior I was expecting was
that the pipe would not select() as writable unless PIPE_BUF bytes
could be written.  Perhaps this is a bug on this HP-UX A.09.05 A
9000/712 system where I have the problem.

Rich

 
 
 

Post by Stefaan A Eecke » Mon, 08 Jan 2001 06:37:20






>> > It is worse than just the final write.  Any partial write will open
>> > the possibility of the remaining buffer size falling below PIPE_BUF.

>>         Unfortunately, my news server is sick right now, so I missed the
>> original post. But most UNIXes I know won't tell you that you can write
>> on a pipe without blocking unless at least PIPE_BUF characters can be
>> written.

> Well, then we have come full circle.  The behavior I was expecting was
> that the pipe would not select() as writable unless PIPE_BUF bytes
> could be written.  Perhaps this is a bug on this HP-UX A.09.05 A
> 9000/712 system where I have the problem.

I remember having a similar problem on a SINIX system way back
around 1991, using named pipes. We got around it by using
sockets.
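
One plausible reason sockets help (an assumption, since the post does not say): stream sockets carry no PIPE_BUF atomicity rule, so a non-blocking write generally accepts whatever fits and returns the count. A sketch using a UNIX-domain socketpair, with the hypothetical name `socket_partial_send` and a 4 MiB request assumed to exceed the socket's send buffer:

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define BIG_LEN (4u * 1024u * 1024u)  /* assumed to exceed the send buffer */

/* Hypothetical probe: make one oversized non-blocking write on a
 * connected stream socket pair and return write()'s result, or -1 if
 * the setup failed. */
ssize_t socket_partial_send(void)
{
    static char big[BIG_LEN];
    int sv[2], flags;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Nobody is reading sv[1], so only part of the buffer can fit. */
    n = write(sv[0], big, sizeof big);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With a socketpair, one end can also serve as both stdin and stdout of the child, which replaces the two pipes with a single bidirectional descriptor.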

--
Stefaan
--
Ninety-Ninety Rule of Project Schedules:
        The first ninety percent of the task takes ninety percent of
the time, and the last ten percent takes the other ninety percent.

 
 
 

1. select() and write() behavior with non-blocking named-pipe

I've got a process that sets a named-pipe to be non-blocking. The first
time that the process enters a situation where the write() would block
it adds the fd into the select() calls set for writers. The expectation
was that the select() would indicate that the fd was ready for writing
and the write() would be accomplished. On an HP/UX box we have a
situation where the process that is doing this logic starts to take a
large portion of the CPU -- just as if it was polling the fd. I used the
HP/UX tusc/truss command and it shows that what is happening is that the
select() returns a 1 (with the indication that the fd that is being
tested for writing is set), but the write() returns a 0 and nothing is
written. I have always been under the impression that write() would
return a non-zero value and that select() would not return the fd for
writing if it wasn't really ready for writing.

What is the problem with the logic in this case?

Thanks for the help!

Roger
