write(), even with O_NDELAY still blocks!?!

Post by Eyal Lebedins » Wed, 09 Feb 1994 15:50:03




Quote:>Is it possible (correct?) for write() on a descriptor marked for
>non-blocking I/O to block anyway?  I've been having a problem
>in which any time I have more than a few open named pipes,
>write() seems to block even though the fd is set using O_NDELAY,
>and the buffer is not full.
>Sun's man page seems to suggest that write() should return -1
>with EWOULDBLOCK or EAGAIN instead of blocking.  What is it I
>did wrong in the program below:
> --------------------------------
>#include <sys/types.h>
>#include <fcntl.h>
>#include <errno.h>
>main()
>{
> int i,j,k,l,err;
> char buf[100];
> for (i=0;i<50;i++) {
>     sprintf(buf,"fifo%d",i);
>     j = mkfifo(buf,0777);          if (j<0) {perror("mkfifo");}
>     k = open(buf,O_RDONLY|O_NDELAY);   if (k<0) perror("open(RDONLY)");
>     l = open(buf,O_WRONLY|O_NDELAY);   if (l<0) perror("open(WRONLY)");
>     if (fcntl(l,F_SETFL,fcntl(l,F_GETFL,0)|O_NDELAY)<0) perror("fcntl");
>     err = write(l,buf,1);              if (err<0) perror("write");
>     err = read( k,buf,1);          if (err<0) perror("read");
>/*   close(l);  (see below) */
>     close(k);
>     printf("cool %d\n",i);
> }
> sleep(30);
>}
> --------------------------------
>Yes, I know that the close() statement is "incorrectly" commented
>out.  The idea is to simulate a number of different processes which
>all have a named pipe open for reading.  This was the simplest
>program I could think of that reproduces the problem.
>In case it matters, this is on a Sun SPARC 2
> % showrev -a
> ***************  showrev version 1.15  *****************
> * Hostname: "grizzly"
> * Hostid: "7230209f"
> * Kernel Arch: "sun4m"
> * Application Arch: "sun4"
> * Kernel Revision:
>   4.1.3 (GENERIC) #3: Mon Jul 27 16:44:16 PDT 1992
> * Release: 4.1.3
>The problem doesn't seem to occur on a system running Solaris, so
>it might be an OS rather than a programming question.

[program deleted]

Just a wild pointer, but it did happen to me. Maybe you need to use the
O_NONBLOCK rather than the O_NDELAY. I found systems where both are
defined, have different values and the doco is confused which one to use
when. Good luck.

>    Thanks,
>    Ron Mayer


--
Regards

 
 
 

Post by P D » Sat, 12 Feb 1994 15:15:14



>Just a wild pointer, but it did happen to me. Maybe you need to use the
>O_NONBLOCK rather than the O_NDELAY. I found systems where both are
>defined, have different values and the doco is confused which one to use
>when. Good luck.

I've found that which to use is unpredictable.  I was using O_NONBLOCK
by itself on SunOS 4.1.3 and that didn't work (connect() blocked for me).
I suspect that which to use might depend on what you are doing.  Maybe
it needs O_NDELAY for accept() and connect() and O_NONBLOCK for read()
and write().  I don't know because what I did was code tests for both
of these symbols and also FNDELAY (which had yet a different bit value
on SunOS 4.1.3) and OR together all that are defined:

        fcntl( port->fd_listen , F_SETFL , 0
#ifdef O_NONBLOCK
                | O_NONBLOCK
#endif
#ifdef O_NDELAY
                | O_NDELAY
#endif
#ifdef FNDELAY
                | FNDELAY
#endif
        );
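
One thing to watch in the call above: passing 0 as the base value to F_SETFL replaces whatever status flags were already set on the descriptor. A minimal sketch of the same OR-everything approach that reads the current flags first might look like this. The helper name set_nonblocking is hypothetical, and the sketch makes no claim about which of the three symbols a given system actually honors:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original post): turn on every
 * non-blocking flag this system defines, while keeping the status
 * flags that are already set on the descriptor.
 * Returns 0 on success, -1 if fcntl() fails.
 */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);

    /* Read the current file status flags instead of starting from 0. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;

#ifdef O_NONBLOCK
    flags |= O_NONBLOCK;
#endif
#ifdef O_NDELAY
    flags |= O_NDELAY;
#endif
#ifdef FNDELAY
    flags |= FNDELAY;
#endif

    return fcntl(fd, F_SETFL, flags) < 0 ? -1 : 0;
}
```

A caller would use it as set_nonblocking(port->fd_listen) and check for a -1 return.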

Hope that helps.
--
Phil Howard KA9WGN        | "It is good to keep a gun for peaceful purposes,

"If I was able to fix it, | designer of the Avtomat Kalashnikova 1947, while
it must have been broken" | at the Dallas 1994 Shot Show, Wed 12 Jan 1994.

 
 
 

Post by Tye McQue » Mon, 14 Feb 1994 14:43:48


I've set follow-ups to comp.unix.programmer *only* as this seems
the most appropriate group.



>>Just a wild pointer, but it did happen to me. Maybe you need to use the
>>O_NONBLOCK rather than the O_NDELAY. I found systems where both are
>>defined, have different values and the doco is confused which one to use
>>when. Good luck.

>I've found that which to use is unpredictable.  I was using O_NONBLOCK
>by itself on SunOS 4.1.3 and that didn't work (connect() blocked for me).
>I suspect that which to use might depend on what you are doing.  Maybe
>it needs O_NDELAY for accept() and connect() and O_NONBLOCK for read()
>and write().  I don't know because what I did was code tests for both
>of these symbols and also FNDELAY (which had yet a different bit value
>on SunOS 4.1.3) and or together all that is defined:

I'd suggest you consider reading W. Richard Stevens' excellent book,
_Advanced Programming in the Unix Environment_ [Addison-Wesley
Professional Computing Series, ISBN 0-201-56317-7].  It describes
the differences between O_NDELAY, O_NONBLOCK, and FNDELAY.  I'll
summarize:

O_NDELAY was introduced in an earlier release of System V.  The main
problem with it was that it caused read() to return 0 for both the
no-data-available-at-the-moment condition and the end-of-file condition.
SVR4 supports O_NDELAY for backward compatibility, keeping this problem.
It should be avoided in new applications if possible.

4.3 BSD provided the FNDELAY flag for fcntl().  It avoided the above
problem by having read() return -1 while setting errno to EWOULDBLOCK
if no data is available at the moment.  It is also different in that
it affects the actual tty or socket rather than just how you access it
so *any* process accessing the same tty/socket is affected when you
use FNDELAY.

POSIX.1 standardizes things a bit with O_NONBLOCK.  It is set in the
file table entry for an open file so using it doesn't generally
affect other processes.  It has read() return -1 for no-data-now
so there is no ambiguity.  However, it sets errno to EAGAIN, not
EWOULDBLOCK (unless you have a system where these are defined to
be the same -- I think such systems exist, but I haven't checked).

O_NONBLOCK is supported in SVR4 and 4.4BSD.  In 4.4BSD though,
O_NONBLOCK is just a synonym for FNDELAY (i.e., O_NONBLOCK affects
the actual device/socket and sets errno to EWOULDBLOCK -- unless
this changed after the book was published, I haven't checked --
the book was written slightly before 4.4BSD was released).
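
To make the distinction concrete, here is a minimal sketch of how a caller might classify the result of read() on a descriptor opened with O_NONBLOCK. The names try_read and enum read_status are hypothetical. The sketch assumes POSIX O_NONBLOCK semantics; under the old System V O_NDELAY behaviour the zero return would stay ambiguous. It checks both EAGAIN and EWOULDBLOCK because, as noted above, systems differ on which one they set:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* O_NONBLOCK, fcntl() for callers that set the flag */
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result codes for one non-blocking read attempt. */
enum read_status { READ_DATA, READ_EOF, READ_NOT_READY, READ_ERROR };

/*
 * Hypothetical helper: attempt one read() on a descriptor that already
 * has O_NONBLOCK set and say what happened.  *nread receives the raw
 * return value of read().
 */
enum read_status try_read(int fd, char *buf, size_t len, ssize_t *nread)
{
    ssize_t n;

    assert(buf != NULL && nread != NULL);

    n = read(fd, buf, len);
    *nread = n;

    if (n > 0)
        return READ_DATA;       /* got some bytes */
    if (n == 0)
        return READ_EOF;        /* with O_NONBLOCK, 0 means end of file */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_NOT_READY;  /* no data at the moment, try again later */
    return READ_ERROR;          /* anything else, including EINTR */
}
```

With O_NONBLOCK the no-data case and the end-of-file case come back as different results, which is the ambiguity that O_NDELAY could not resolve.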

This is more of a history lesson than a solution to your problem.
I would suggest using only O_NONBLOCK when possible.  It sounds
like it is broken on that Sun system, though.

Hope that helps.
Tye

     Nothing is obvious unless you are overlooking something.

 
 
 

1. pipe write() blocks even though select() indicates writable

Howdy folks,

I'm having trouble dealing with passing data through a child filter
process.   My app is sending info to a socket, with an optional filter
sub process.   The write routine either just passes output through to
the socket write or writes it down the pipe to the child and reads the
data from the child and writes it on to the socket.

  output ----------------------------> socket

or
                 wfd     rfd
  output ---------+       +----------> socket
                  |       ^
                  v       |
                 child filter

In the child filter case, the child is fork/exec'd with two pipes
created for the child's stdin & stdout.  All the appropriate dup2()
"plumbing" is performed to redirect the child's stdin & stdout back
to the parent.  (Thanks to the FAQ and posts in this group!)

Because there is no guaranteed relationship between data to the
filter and data from it, deadlock is possible if the child blocks
with a large amount of output that the parent hasn't read and the
parent blocks on a write to the child's input pipe.  I _thought_ I
had taken care of this by having the parent use select() on the
child's fds so as only to read from or write to the child when
possible.  Writes are limited to PIPE_BUF bytes.

But on an  HP-UX A.09.05 A 9000/712 system, I seem to be getting
into this deadlock, despite the use of select().  Debug code
shows that I get FD_ISSET on the input fd to the child, but suspend
on the write.  I'm only writing 448 bytes at a time here, well
under PIPE_BUF, which is 8192 on this system (and under the Posix
512 to boot.)   Am I missing something here??  I thought that if
I only write to a pipe when select() indicates writable and keep
the writes under PIPE_BUF, I would not block on the write... This
seems to work fine on our other systems.
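
Whatever the cause on that HP-UX box, one defensive pattern is to put the write end of the pipe in non-blocking mode as well as using select(), so that a write which cannot proceed returns an error instead of suspending the parent. A minimal sketch, assuming the descriptor already has O_NONBLOCK set and that each chunk is non-empty and no larger than PIPE_BUF. The helper name write_if_ready and the timeout parameter are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* O_NONBLOCK, fcntl() for callers that set the flag */
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for wfd to become
 * writable, then attempt one write().  Assumes O_NONBLOCK is already
 * set on wfd.  Returns the number of bytes written, 0 if the pipe was
 * not ready (select() timed out or write() would have blocked), or
 * -1 on any other error.
 */
ssize_t write_if_ready(int wfd, const char *buf, size_t len, long timeout_ms)
{
    fd_set wfds;
    struct timeval tv;
    ssize_t n;
    int ready;

    assert(wfd >= 0 && wfd < FD_SETSIZE && len > 0);

    FD_ZERO(&wfds);
    FD_SET(wfd, &wfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ready = select(wfd + 1, NULL, &wfds, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;               /* timed out: go service the read side */

    n = write(wfd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;               /* select() said writable, write() disagreed */
    return n;
}
```

When this returns 0 the parent can drain the child's output pipe and try the write again later, which avoids the deadlock described above even if select() reports the pipe writable too early.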

I replaced the called filter with a wrapper script:

#!/bin/sh
tee /tmp/fin | real_filter_command 2>>/tmp/ferror | tee /tmp/fout

and found that /tmp/fin is always 8192 bytes (hmm, PIPE_BUF!)
long and /tmp/fout is usually 8192 also, although sometimes we get
a few k read back from the child before hanging.  (The filter is a
3rd party conversion utility which writes a quite large preamble
out before it starts its processing of the input data.)

Perhaps I've done Something Idiotic (TM) here, but sure seems like
select() just lied to me.

Thanks!

Rich
