read(fd, buf, N) fails when N doesn't fit in 32 bits?

Post by Paul Egger » Wed, 03 Apr 2002 18:44:41



A Tru64 5.1 user reports that GNU Diffutils 2.8 doesn't work on his
system because the read(2) system call won't accept a size argument
greater than 2**31 - 1.  His truss output looks like this:

read(6, 0x0000000140015000, 5312561138)         Err#22 Invalid argument

Is there some compile-time option that he can use on Tru64 to work
around this problem?  Or perhaps there's a Tru64 patch that I can
recommend to him?

You can read his original bug report in:

http://groups.google.com/groups?hl=en&selm=200204012205.QAA0000012227...

and my proposed workaround in:

http://groups.google.com/groups?hl=en&selm=200204020013.g320DUf08044%...

This workaround limits the 'read' calls to less than 2**31 bytes, but
I'd prefer a solution that removes that arbitrary limit.  Sorry, I
don't use Tru64 myself, so I'm not well versed in its compile-time
options and patches.
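
For readers who cannot follow the truncated link above, here is a minimal sketch of what such a capping workaround might look like. It is not the actual Diffutils patch; the helper name capped_read is hypothetical, and the cap of INT_MAX assumes a 32-bit int.

```c
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the actual Diffutils change: issue a single
   read(2) call, but never ask the kernel for more than INT_MAX bytes
   (2**31 - 1 where int is 32 bits wide). */
static ssize_t capped_read(int fd, void *buf, size_t count)
{
    size_t limit = INT_MAX;

    if (count > limit)
        count = limit;          /* a short read is legal; callers must loop */
    return read(fd, buf, count);
}
```

A caller that needs the whole buffer filled still has to loop, because the capped call can return fewer bytes than requested.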

 
 
 


Post by Johan Danielss » Wed, 03 Apr 2002 22:15:52



> Is there some compile-time option that he can use on Tru64 to work
> around this problem?

For obscure reasons, read doesn't accept large buffers:

        if (uap->count > INT_MAX)
                return (EINVAL);

A workaround using readv is probably the easiest route. It would be
fairly simple to autoconf a test for this.

/Johan
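
A readv-based workaround could be sketched roughly as follows. The wrapper name read_via_readv is hypothetical, and the sketch assumes that the platform's readv accepts an iov_len above INT_MAX, which would still need to be checked on the target system.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical wrapper: behaves like read(fd, buf, count), but goes
   through readv(2) with a single-element iovec array. */
static ssize_t read_via_readv(int fd, void *buf, size_t count)
{
    struct iovec iov;

    iov.iov_base = buf;
    iov.iov_len = count;
    return readv(fd, &iov, 1);
}
```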

 
 
 


Post by Paul Egger » Wed, 03 Apr 2002 22:42:47



> For obscure reasons, read doesn't accept large buffers:

>    if (uap->count > INT_MAX)
>            return (EINVAL);

Ouch.

My first reaction was "this violates POSIX!" but I just re-read the
POSIX spec for "read" and it says "If the value of nbyte is greater
than {SSIZE_MAX}, the result is implementation-defined."

So -- what is the value of SSIZE_MAX on Tru64?  Does SSIZE_MAX ==
INT_MAX?  If so, then Tru64 conforms to POSIX after all, and I should
limit reads to at most SSIZE_MAX bytes.  I should do this anyway, to
stay within the letter of POSIX law.  If SSIZE_MAX == INT_MAX, that
would suffice to avoid this problem on Tru64.

> A workaround using readv is probably the easiest route.

Thanks, but that would cause porting problems on other hosts, alas.  I
think I'll limit myself to (2**31 - 1)-byte reads.
 
 
 


Post by Bob Harri » Fri, 05 Apr 2002 09:37:47





X
X > For obscure reasons, read doesn't accept large buffers:
X >
X >  if (uap->count > INT_MAX)
X >          return (EINVAL);
X
X Ouch.
X
X My first reaction was "this violates POSIX!" but I just re-read the
X POSIX spec for "read" and it says "If the value of nbyte is greater
X than {SSIZE_MAX}, the result is implementation-defined."
X
X So -- what is the value of SSIZE_MAX on Tru64?  Does SSIZE_MAX ==
X INT_MAX?  If so, then Tru64 conforms to POSIX after all, and I should
X limit reads to at most SSIZE_MAX bytes.  I should do this anyway, to
X stay within the letter of POSIX law.  If SSIZE_MAX == INT_MAX, that
X would suffice to avoid this problem on Tru64.
X
X > A workaround using readv is probably the easiest route.
X
X Thanks, but that would cause porting problems on other hosts, alas.  I
X think I'll limit myself to (2**31 - 1)-byte reads.

SSIZE_MAX on Tru64 UNIX is LONG_MAX (aka a signed 64 bit value).  

I am _NOT_ a POSIX lawyer, but the wording only says what nbyte is if it
is greater than SSIZE_MAX.  It does not say that nbyte needs to be as
large as SSIZE_MAX.  I know this is cutting it rather thin, but an
argument could be made along those lines.

There are lots of loopholes in the POSIX standard. This could be an
intentional loophole, poor wording on the part of POSIX, or it could be
an Oops on the part of Compaq.

Your mileage may vary (mine always does :-)

                                        Bob Harris

 
 
 


Post by Paul Egger » Fri, 05 Apr 2002 16:48:20



> the wording only says what nbyte is if it is greater than SSIZE_MAX.

It's pretty clear from the standard that when nbyte is less than or
equal to SSIZE_MAX, 'read' is supposed to read all nbyte bytes from a
regular file.  'read' can return a value less than nbyte only at end
of file, or if there is an error, or if a signal occurs.  (The rules
are a bit different for non-regular files.)  There is no provision to
return -1 with errno==EINVAL merely because the buffer size is too
large.
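
In practice, portable callers tend to wrap 'read' in a loop that tolerates short reads and interrupted calls. The following is only a sketch under assumptions: the helper name full_read is hypothetical, and the INT_MAX chunk size is the cap discussed in this thread, not something POSIX requires.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep reading until count bytes have arrived, end of
   file is reached, or a real error occurs.  Returns the number of bytes
   read, or -1 with errno set if an error occurs before any data arrives.
   The caller is assumed to keep count <= SSIZE_MAX. */
static ssize_t full_read(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        size_t chunk = count - total;
        ssize_t n;

        if (chunk > INT_MAX)
            chunk = INT_MAX;            /* stay below the reported limit */
        n = read(fd, p + total, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return total ? (ssize_t) total : -1;
        }
        if (n == 0)
            break;                      /* end of file */
        total += (size_t) n;
    }
    return (ssize_t) total;
}
```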

The ssize_t type was established mostly to allow implementations to do
the right thing with buffers whose sizes did not fit in 'int'.  This
change occurred in POSIX 1003.1-1990; the earlier 1988 standard had
'read' returning 'int'.  Perhaps older versions of Tru64 were defined
according to the 1988 standard, and the incompatibility with newer
editions of the standard has never been fixed for some reason.

Also, POSIX says that 'readv' must fail with errno==EINVAL if the sum
of the buffer lengths exceeds SSIZE_MAX.  I wonder whether Tru64 readv
fails with errno==EINVAL if the sum of the buffer lengths exceeds
INT_MAX.  If so, that would be another incompatibility with POSIX.

 
 
 


Post by Johan Danielss » Sun, 07 Apr 2002 00:33:41



> I wonder whether Tru64 readv fails with errno==EINVAL if the sum of
> the buffer lengths exceeds INT_MAX.

There's a 64-bit "fixed" version of readv that you get automatically
if you include the proper header. Why there isn't an equivalent for
read is unknown to me.

/Johan

 
 
 

1. read(fd, buf, size) after shutdown(fd, SHUT_RD) implementation-defined

I notice some missing specification in the documentation on shutdown():

The Linux man page and glibc info page say that shutdown(fd, SHUT_RD)
shuts off "reception" of data on the socket. Single Unix Spec says
"disables further receive operations". That sounds like it applies to the
tcp/ip session, not to the process's ability to read() if the socket
read buffer still has data when shutdown() is called. None of the man
pages, the info pages, or the SUS man page says anything about what the
semantics of read() should be after shutdown(fd, SHUT_RD) is called.

The SO_LINGER option only discusses the write() direction: "close() and
shutdown() will block until all data has been written". (I wonder how
O_NONBLOCK works with that, but that's a question for another day).

An IPC tutorial from Berkeley's CSRG, issued when 4.3BSD was the current BSD
version, says:

  Should a user have no use for any pending data, it may perform a
  "shutdown" on the socket prior to closing it. This call is of the
  form:

      shutdown(s, how);

  where "how" is 0 if the user is no longer interested in reading
  data, 1 if no more data will be sent, or 2 if no data is to be
  sent or received.

This appears to simply assume that code is not going to call read()
on the socket after shutdown(fd, SHUT_RD), and so leaves the question
of how the kernel is to respond to such a read() call undefined.
(Joys of an ad-hoc spec defined by the code that first implemented it,
and whose later formalization failed to specify the semantics of subsequent
file operations on the socket fd after shutdown().)

So I would merely take the position that "results of read(fd, buf, size)
after shutdown(fd, SHUT_RD) has been called are implementation-defined"
(meaning that any given kernel will do something in that situation, but
code cannot expect how one kernel reacts to read() after
shutdown(fd, SHUT_RD) to be portable to other systems or kernel
versions without some formal standard specifying those semantics).

It seems prudent to not rely on the socket's read buffer to still be available
to the process after telling the kernel that you no longer have any
interest in reading from it. Anything the kernel does with the socket's
read buffer after shutdown(fd, SHUT_RD) is unspecified and thus valid
by definition.
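
One way to act on that advice is to save whatever is already queued before giving up the read side. The sketch below is an assumption-laden illustration: the helper name drain_then_shutdown_rd is hypothetical, and MSG_DONTWAIT is a widely available recv() flag on Linux and the BSDs rather than something every system is guaranteed to provide.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: copy data already queued on the socket into buf
   (at most cap bytes) without blocking, then shut down the read side.
   Returns the number of bytes saved, or -1 on error.  Anything still
   queued beyond cap bytes is subject to the unspecified behavior above. */
static ssize_t drain_then_shutdown_rd(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = recv(fd, buf + total, cap - total, MSG_DONTWAIT);

        if (n > 0) {
            total += (size_t) n;
            continue;
        }
        if (n == 0)
            break;                      /* peer has closed its end */
        if (errno == EINTR)
            continue;                   /* interrupted: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                      /* nothing more pending */
        return -1;
    }
    if (shutdown(fd, SHUT_RD) != 0)
        return -1;
    return (ssize_t) total;
}
```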

How quickly the kernel gets around to actually notifying the remote peer
that the socket is closed for writing is internal to the tcp/ip stack.
"Immediately" would be good, but it is not going to be seen by the reader
that called shutdown() in any case.

Multiple reader/writer apps that share a socket should still use locks and
"open/closed for i/o flags" (that are checked once a process or thread
holds the lock) to insulate themselves from variations in
(implementation-defined) "i/o after shutdown()" behavior, imho.
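
A rough sketch of that lock-plus-flag idea, for threads within one process, might look like this. All names here (struct guarded_sock, guarded_read, guarded_shutdown_rd) are hypothetical, and the choice of ENOTCONN as the error for a read after shutdown is an arbitrary assumption of the sketch.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: the flag is only examined or changed under the lock. */
struct guarded_sock {
    int fd;
    bool rd_open;               /* false once the read side is shut down */
    pthread_mutex_t lock;
};

/* Read only if the read side is still marked open; otherwise fail with
   ENOTCONN without touching the socket.  Holding the lock across read()
   serializes readers and delays a concurrent shutdown until it returns. */
static ssize_t guarded_read(struct guarded_sock *s, void *buf, size_t count)
{
    ssize_t n;
    int saved;

    pthread_mutex_lock(&s->lock);
    if (s->rd_open) {
        n = read(s->fd, buf, count);
        saved = errno;
    } else {
        n = -1;
        saved = ENOTCONN;
    }
    pthread_mutex_unlock(&s->lock);
    errno = saved;
    return n;
}

/* Shut down the read side once and record that fact under the lock. */
static int guarded_shutdown_rd(struct guarded_sock *s)
{
    int rc = 0;

    pthread_mutex_lock(&s->lock);
    if (s->rd_open) {
        rc = shutdown(s->fd, SHUT_RD);
        if (rc == 0)
            s->rd_open = false;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}
```

With this arrangement the application never issues a read() after shutdown(), so it does not depend on how a particular kernel treats that case. Separate processes sharing the socket would need the lock and flag in shared memory instead.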

Regards,

Clayton Weaver

"Everyone is ignorant, just about different things."  Will Rogers
