Strange behavior of socket communication


Post by chi ju hua » Fri, 05 Feb 1993 14:09:55



Hi,

I have observed something that I cannot understand, and I hope some guru out
there can explain it to me.

I wrote a pair of small programs to measure the transmission delay over the
network. One of the programs is called "server" and the other is called
"client". The "server" simply echoes whatever it receives from the "client".
The communication is done through sockets. The programs exchange a message
of predefined length a predefined number of times, and the time measurements
are taken in the "client". The simple protocol for transmitting variable-length
messages is as follows: each message has a 16-byte header that contains the
length of the message body. For each send, using write(), the program writes
the 16-byte header first and then writes the body. For each receive, using
read(), it reads the header first and then, according to the length specified
in the header, reads enough bytes for the body. Sometimes it needs more than
one read() to get a large message body. This simple protocol works fine and
is robust.
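
A minimal sketch of the framing described above, for readers who want to
reproduce the test. The post does not show its code, so everything here is
an assumption: the helper names (write_all, read_all, send_msg, recv_msg)
are hypothetical, the header is assumed to hold the body length as
zero-padded ASCII decimal, and the header and body are assumed to go out
as two separate write() calls.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define HDR_LEN 16   /* header size from the post; its layout is assumed */

/* Write exactly n bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    while (n > 0) {
        ssize_t k = write(fd, p, n);
        if (k < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += k;
        n -= (size_t)k;
    }
    return 0;
}

/* Read exactly n bytes; a large body may need several read() calls. */
static int read_all(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t k = read(fd, p, n);
        if (k < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (k == 0)
            return -1;          /* peer closed before the message ended */
        p += k;
        n -= (size_t)k;
    }
    return 0;
}

/* Send the header, then the body, as two separate writes (assumed). */
static int send_msg(int fd, const void *body, size_t len)
{
    char hdr[HDR_LEN];
    memset(hdr, 0, sizeof hdr);
    snprintf(hdr, sizeof hdr, "%zu", len);  /* assumed: ASCII decimal length */
    if (write_all(fd, hdr, sizeof hdr) < 0)
        return -1;
    return write_all(fd, body, len);
}

/* Receive one message into buf (capacity cap); returns body length or -1. */
static long recv_msg(int fd, void *buf, size_t cap)
{
    char hdr[HDR_LEN + 1];
    unsigned long len;

    if (read_all(fd, hdr, HDR_LEN) < 0)
        return -1;
    hdr[HDR_LEN] = '\0';
    len = strtoul(hdr, NULL, 10);
    if (len > cap)
        return -1;              /* body would not fit in the caller's buffer */
    if (read_all(fd, buf, len) < 0)
        return -1;
    return (long)len;
}
```

The echo "server" would then be a loop of recv_msg() followed by send_msg()
on the connected socket, and the "client" the reverse.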

The timing measurements are u_time and s_time, which are taken from
getrusage(); clock, which is taken from clock(); and wall time, which is taken
from time(). All the timers are reset before the test loop and are read right
after the end of that loop. The experiment tries to measure the delay for
different message lengths.
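
A rough sketch of how the four measurements could be taken around a test
loop. The names struct timing, time_loop and nap are hypothetical, and the
wall clock here uses gettimeofday() rather than the time() call mentioned
above, purely to get sub-second resolution. The nap() workload only sleeps,
which shows the same signature as the puzzling results: almost no CPU time
but a large wall time, as happens when a process is blocked in the kernel.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

struct timing {
    double user_sec;   /* CPU time spent in user mode (getrusage) */
    double sys_sec;    /* CPU time spent in the kernel (getrusage) */
    double cpu_sec;    /* processor time reported by clock() */
    double wall_sec;   /* elapsed real time */
};

static double tv_to_sec(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Run fn(arg) once and store the four measurements in *out. */
static void time_loop(void (*fn)(void *), void *arg, struct timing *out)
{
    struct rusage r0, r1;
    struct timeval w0, w1;
    clock_t c0, c1;

    getrusage(RUSAGE_SELF, &r0);
    c0 = clock();
    gettimeofday(&w0, NULL);

    fn(arg);                        /* the test loop goes here */

    gettimeofday(&w1, NULL);
    c1 = clock();
    getrusage(RUSAGE_SELF, &r1);

    out->user_sec = tv_to_sec(r1.ru_utime) - tv_to_sec(r0.ru_utime);
    out->sys_sec  = tv_to_sec(r1.ru_stime) - tv_to_sec(r0.ru_stime);
    out->cpu_sec  = (double)(c1 - c0) / CLOCKS_PER_SEC;
    out->wall_sec = tv_to_sec(w1) - tv_to_sec(w0);
}

/* Stand-in workload: blocks for about 100 ms without using the CPU. */
static void nap(void *arg)
{
    struct timespec ts = { 0, 100000000L };
    (void)arg;
    nanosleep(&ts, NULL);
}
```

Calling time_loop(nap, NULL, &t) should give a wall_sec of about 0.1 with
user_sec and sys_sec close to zero.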

The strange thing that I observed is this:

When the message length is small, the communication takes a long time.
Though it reports small u_time, s_time and clock time, it reports a
huge wall time. (I can really feel it while waiting in front of the console.)
It seems that the process is waiting for something in the kernel.
When the message length is larger, it reports larger u_time, s_time and
clock time, as expected, yet the wall time remains roughly the same.
However, once the length reaches a certain point, the wall time drops
significantly while the u_time, s_time and clock time keep increasing.
The magic number is around "1460" bytes. That is, based on the wall time,
finishing the test with 1460-byte messages takes only 1/20 of the time needed
by the test with 16- or 32- or .. or 1459-byte messages. Even with 8K-byte
messages, it takes 1/5 of the time needed by the test with 16-byte messages.

I repeated the experiments many times on different machines, including
SPARCstation 1, SPARC SLC, SPARC ELC, SPARCstation 2 with SunOS 4.1, and
Encore with Umax 4.3. They all show the same phenomenon, and the magic
number is always around 1460 (well, I did get 2900 once). However, when I
ported the programs to Macintoshes, communication between Macintoshes did not
show this phenomenon, yet communication between a Mac and the Unix boxes did.

My question is:
Have I really hit some magic in socket/ethernet/UNIX/SunOS/read()/write()/*?

Any hints will be greatly appreciated.


=== One of the results (10000 iterations, round trip delay in microseconds)

Msg_length      u_time    s_time        clock_time   wall_time (sec)
----------------------------------------------------------------------------
     16          350000   3000000         3349866       2008
     32          340000   3450000         3799848       2133
     64          340000   3110000         3449862       2004
    128          440000   3290000         3733184       2006
    256          300000   3060000         3366532       2132
    512          280000   2890000         3166540       2036
   1024          270000   3140000         3399864       2002
   2048         1500000  25980000        27482234        112
   4096         1700000  44070000        45781502        172
   8192         2800000 100580000       103379198        400

 
 
 

Strange behavior of socket communication

Post by Steven D. Majewski » Wed, 17 Feb 1993 05:05:55




>The strange thing that I observed is,

>When the message length is small, the communication takes a long time.
>Though it reports small u_time, s_time and clock time, it does reports
>huge wall time. (I can really feel it by waiting in front of the console).
>It seems that the process is waiting something in the kernel.
>When the message length is larger, it reports larger u_time, s_time and
>clock time, as expected, yet the wall time remains roughly the same.
>However, when the length is large to a point, while the u_time, s_time and
>clock time are still increasing, the wall time drops significantly.
>The magic number is around "1460" bytes. That is, based on the wall time,
>to finish the test with 1460 byte message takes only 1/20 of the time needed
>by the test with 16 or 32 or .. or 1459 byte message. Even with 8k byte
>messages, it take 1/5 of the time needed by the test with 16 byte message.

>I repeated the experiments many times on different machines, including
>SPARCstation 1, SPARC SLC, SPARC ELC, SPARCstation 2 with SunOS 4.1, and
>Encore with Umax 4.3. They all report the same phenomenon and the magic
>number is always around 1460 (well, I did get 2900 once). However, I ported
>the programs on Macintoshes, the communication between Macintoshes does not
>have such phenomenon, yet the communication between Mac and Unix boxes does.

>My question is:
>Do I really hit any magic of socket/ethernet/UNIX/SunOS/read()/write()/*?

>Any hint will be greatly appreciated.


>=== One of the results (10000 iterations, round trip delay in micro seconds)

>Msg_length      u_time    s_time        clock_time   wall_time (sec)
>----------------------------------------------------------------------------
>     16          350000   3000000         3349866       2008
>     32          340000   3450000         3799848       2133
>     64          340000   3110000         3449862       2004
>    128          440000   3290000         3733184       2006
>    256          300000   3060000         3366532       2132
>    512          280000   2890000         3166540       2036
>   1024          270000   3140000         3399864       2002
>   2048         1500000  25980000        27482234        112
>   4096         1700000  44070000        45781502        172
>   8192         2800000 100580000       103379198        400

I assume this is TCP and NOT UDP?
You didn't state that in the message, but it sounds like
this is a STREAM and not a DATAGRAM protocol.

TCP not only splits large messages into smaller segments that fit the
network's maximum packet size (Ethernet is about 1.5K max, I think) and
reassembles them at the other end, but it is also _allowed_ to accumulate
small writes into a larger segment. What you see may be the sending system
waiting either for more data to fill up the buffer or for a timeout to signal
it to go ahead and send anyway.

This option can usually be turned off (although I don't recall how
at the moment). I ran into it in the other direction - my Macs kept
getting stuck when sending data to a Unix host. The problem didn't show
up in extensive testing on a Mac using MacTCP on LocalTalk to a K-BOX,
but surfaced on a directly connected Ethernet Mac, where I presume the
buffer that MacTCP was trying to fill was of a different size.

I don't know if this is what's causing your problem, but that magic
number (1460) sounds close to the maximum amount of TCP user data that fits
in one Ethernet frame (after the TCP/IP headers are subtracted from 1.5K).
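
The arithmetic behind that guess can be sketched as follows. The constants
are the usual ones (a 1500-byte Ethernet payload, and 20-byte IPv4 and TCP
headers with no options); they are assumptions about the poster's network,
not something stated in the thread, and the function names are made up for
illustration. The second function divides a wall time from the results table
by the iteration count. For the small messages that gives roughly 200 ms per
round trip, which is suggestive of a timer expiring (delayed
acknowledgements in BSD-derived stacks are commonly held for up to about
200 ms) rather than of real transmission time.

```c
#include <stdio.h>

enum {
    ETHERNET_MTU = 1500,  /* assumed: maximum IP packet carried per frame */
    IPV4_HEADER  = 20,    /* assumed: IPv4 header without options */
    TCP_HEADER   = 20     /* assumed: TCP header without options */
};

/* Largest amount of TCP user data that fits in one Ethernet frame. */
static int tcp_payload_per_frame(void)
{
    return ETHERNET_MTU - IPV4_HEADER - TCP_HEADER;
}

/* Average time per round trip, in milliseconds. */
static double ms_per_round_trip(double wall_sec, long iterations)
{
    return wall_sec * 1000.0 / (double)iterations;
}
```

With these numbers tcp_payload_per_frame() is 1460, two full segments hold
2920 bytes (close to the "2900" seen once), the 16-byte row of the table
(2008 s for 10000 iterations) works out to about 200.8 ms per round trip,
and the 2048-byte row (112 s) to about 11.2 ms.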

===============================================================================
 Steven D. Majewski                     University of Virginia

 Voice: (804)-982-0831                  1300 Jefferson Park Avenue
 FAX:   (804)-982-1616                  Charlottesville, VA 22908
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Former UVA Department of Physiology, now Department of Molecular Physiology
and Biological Physics! [ Still the same spacious offices in Jordan Hall
- only the letterhead has changed! ]

 
 
 

Strange behavior of socket communication

Post by David Nob » Thu, 18 Feb 1993 09:08:19





>>When the message length is small, the communication takes a long time.

>TCP not only splits up and reassembles large packets into smaller ones
>for sending through small windows ( Ethernet is about 1.5K max I think. )
>but it is also _allowed_ to accumulate smaller packets into a larger one.
>What you see may be the sending system waiting for either another packet
>to fill up the buffer or a timeout to signal it to go ahead and sent anyway.

>This option can usually be turned off - ( although I don't recall how
>at the moment.

I ran into this sending small amounts of data through a TCP socket.
This is how I got around it on a Sun (both Sun-3 & SPARC, I think).
Your mileage may vary:
#############################################################################

#include <sys/types.h>    /* I don't remember if these first two are */
#include <sys/socket.h>   /* necessary, but it probably won't hurt */

#include <netinet/in.h>
#include <netinet/tcp.h>

#include <stdio.h>        /* perror() */
#include <stdlib.h>       /* exit() */

  /* sock is an already created TCP socket descriptor */
  int status = 1;         /* nonzero turns TCP_NODELAY on */

  /* turn off TCP's buffering algorithm so small packets won't get delayed */
  if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
        (char *) &status, sizeof(status)) < 0)
  {
    perror("setsockopt");
    exit(1);
    /* Of course, it might not be fatal for you - if delay is acceptable. */
  }

#############################################################################
Hope this helps,                                #       % ping elvis
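
An alternative that leaves TCP's buffering alone is to hand the header and
the body to the kernel in a single call, so that a small message is never
split across two small writes (with two writes, the second can be held back
until the first is acknowledged, and the peer may delay that
acknowledgement). The sketch below uses writev() for this. It is only an
illustration: the name send_msg_gathered and the header layout (zero-padded
ASCII decimal length in 16 bytes) are assumptions, since the original
programs were not posted.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define HDR_LEN 16   /* header size from the thread; layout is assumed */

/* Send header and body with one writev(), resuming after partial writes. */
static int send_msg_gathered(int fd, const void *body, size_t len)
{
    char hdr[HDR_LEN];
    struct iovec iov[2];
    int i = 0;

    memset(hdr, 0, sizeof hdr);
    snprintf(hdr, sizeof hdr, "%zu", len);  /* assumed: ASCII decimal length */

    iov[0].iov_base = hdr;
    iov[0].iov_len  = sizeof hdr;
    iov[1].iov_base = (void *)body;         /* writev() does not modify it */
    iov[1].iov_len  = len;

    while (i < 2) {
        ssize_t k = writev(fd, &iov[i], 2 - i);
        size_t left;

        if (k < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* Skip the pieces that were written completely. */
        left = (size_t)k;
        while (i < 2 && left >= iov[i].iov_len) {
            left -= iov[i].iov_len;
            i++;
        }
        /* Advance into a piece that was written only in part. */
        if (i < 2) {
            iov[i].iov_base = (char *)iov[i].iov_base + left;
            iov[i].iov_len -= left;
        }
    }
    return 0;
}
```

Copying the header and body into one buffer and calling write() once would
have the same effect at the cost of an extra copy.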

 
 
 

Strange behavior of socket communication

Post by James C. Vlcek » Sat, 20 Feb 1993 06:19:16


On a related note:  How can one set up a Unix-domain socket, opened
using popen(), to be nonbuffered?

What I'm trying to do is: I have one process which is generating data
in "real time" - every half a second or so it writes out the data onto
the file descriptor, which is read by a second process that then plots
the data in a line graph.

Aside from the predictable difficulties with popen() - eventually,
I'll really have to rewrite the thing to explicitly set up the pipe,
since popen() doesn't give the latitude in child process handling
that's needed - things work OK, with the exception that one must wait
quite a while for the writing process to fill up the socket buffer.
I'd like to set the thing up to be line-buffered - a concept which
seems to exist only at the stdio level.

Jim Vlcek