Delayed Acks

Delayed Acks

Post by Michael B Alle » Mon, 30 Jun 2003 12:46:36



I'm working on some code that reads data from a client socket. When I
compare my implementation to another it's performance is horrible. From
looking at Ethereal I can see the ACKs are taking a loooong time. Each is
sent about 399ms vs 19ms with the fast client. Besides the ACKs the other
types of messages have much smaller response times like 14ms. The ACK
times are huge. Why is this happening? I tried TCP_NODELAY without
success (no difference). Any ideas?

Mike

 
 
 

Delayed Acks

Post by shadow » Tue, 01 Jul 2003 00:44:57



Quote:> I'm working on some code that reads data from a client socket. When I
> compare my implementation to another it's performance is horrible. From
> looking at Ethereal I can see the ACKs are taking a loooong time. Each is
> sent about 399ms vs 19ms with the fast client. Besides the ACKs the other
> types of messages have much smaller response times like 14ms. The ACK
> times are huge. Why is this happening? I tried TCP_NODELAY without
> success (no difference). Any ideas?

Could you post the I/O bandwidth you're getting, the datalink,
and maybe a snippet of what your code looks like when you're
reading.

--
Thamer Al-Harbash    http://www.whitefang.com/

 
 
 

Delayed Acks

Post by DINH Viêt Ho » Tue, 01 Jul 2003 01:18:50


shadows wrote :


> > I'm working on some code that reads data from a client socket. When I
> > compare my implementation to another it's performance is horrible. From
> > looking at Ethereal I can see the ACKs are taking a loooong time. Each is
> > sent about 399ms vs 19ms with the fast client. Besides the ACKs the other
> > types of messages have much smaller response times like 14ms. The ACK
> > times are huge. Why is this happening? I tried TCP_NODELAY without
> > success (no difference). Any ideas?

> Could you post the I/O bandwidth you're getting, the datalink,
> and maybe a snippet of what your code looks like when you're
> reading.

you certainly means the round-trip.

--
DINH V. Hoa,

 
 
 

Delayed Acks

Post by David Schwart » Tue, 01 Jul 2003 06:25:26




Quote:> I'm working on some code that reads data from a client socket. When I
> compare my implementation to another it's performance is horrible. From
> looking at Ethereal I can see the ACKs are taking a loooong time. Each is
> sent about 399ms vs 19ms with the fast client. Besides the ACKs the other
> types of messages have much smaller response times like 14ms. The ACK
> times are huge. Why is this happening? I tried TCP_NODELAY without
> success (no difference). Any ideas?

    I can't understand what you're saying. You say the performance is
horrible. What do you mean by performance? Do you mean throughput? Latency?
Why are you investigating microscopic timing details? Why do you care how
long the ACKs take? Most likely, the program that sends the data is doing
something wrong, but it's hard to say because your description is at too low
a level.

    DS

 
 
 

Delayed Acks

Post by Michael B Alle » Tue, 01 Jul 2003 15:22:22





>> I'm working on some code that reads data from a client socket. When I
>> compare my implementation to another it's performance is horrible. From
>> looking at Ethereal I can see the ACKs are taking a loooong time. Each
>> is sent about 399ms vs 19ms with the fast client. Besides the ACKs the
>> other types of messages have much smaller response times like 14ms. The
>> ACK times are huge. Why is this happening? I tried TCP_NODELAY without
>> success (no difference). Any ideas?

>     I can't understand what you're saying. You say the performance is
> horrible. What do you mean by performance? Do you mean throughput?
> Latency? Why are you investigating microscopic timing details? Why do
> you care how long the ACKs take? Most likely, the program that sends the
> data is doing something wrong, but it's hard to say because your
> description is at too low a level.

The client is basically writing a file of 1813612 bytes that the server
just writes to disk. I'm reading it in like this:

    while (tot > 0) {                    /* tot read in request is 64K */
        unsigned char buf[8192];
        n = MIN(8192, tot);
        if ((n = read(sockfd, buf, n)) == -1 ||
                    writen(fd, buf, n) == -1) {
            return -1;
        }
        tot -= n;
    }

Writen is just like in Steven's UNP. This takes about 8 seconds. The other
server that I'm comparing against reads in the same file (from the same
client program) in less than a second. This was all conducted on loopback
on a 1.8GHz laptop. So I compared packet captures and what really jumps
out at me is that the ACKs are capratively large (39.9ms vs 1.3ms):

Fast Server

  61 0.000002  About 6K Written
  62 0.000150  2020 > 33804 [ACK] Seq=98307579 Ack=89462846 Win=12212 Len=0
  63 0.000288  Write Command Response
  64 0.000295  Write Command Request (64K)
  65 0.000179  2020 > 33804 [ACK] Seq=98307630 Ack=89468990 Win=12212 Len=0
  66 0.000007  About 6K Written
  67 0.000099  2020 > 33804 [ACK] Seq=98307630 Ack=89475134 Win=12212 Len=0
  68 0.000004  About 6K Written
  69 0.000131  2020 > 33804 [ACK] Seq=98307630 Ack=89481278 Win=12212 Len=0
  70 0.000005  About 6K Written
  71 0.000096  2020 > 33804 [ACK] Seq=98307630 Ack=89487422 Win=12212 Len=0
  72 0.000004  About 6K Written
  73 0.000130  2020 > 33804 [ACK] Seq=98307630 Ack=89493566 Win=12212 Len=0
  74 0.000005  About 6K Written
  75 0.000096  2020 > 33804 [ACK] Seq=98307630 Ack=89499710 Win=12212 Len=0
  76 0.000004  About 6K Written
  77 0.000134  2020 > 33804 [ACK] Seq=98307630 Ack=89505854 Win=12212 Len=0
  78 0.000004  About 6K Written
  79 0.000097  2020 > 33804 [ACK] Seq=98307630 Ack=89511998 Win=12212 Len=0
  80 0.000004  About 6K Written
  81 0.000135  2020 > 33804 [ACK] Seq=98307630 Ack=89518142 Win=12212 Len=0
  82 0.000005  About 6K Written
  83 0.000002  About 6K Written
  84 0.000149  2020 > 33804 [ACK] Seq=98307630 Ack=89527421 Win=12212 Len=0
  85 0.000286  Write Command Response
  86 0.000276  Write Command Request (64K)
  87 0.000178  2020 > 33804 [ACK] Seq=98307681 Ack=89533565 Win=12212 Len=0
  88 0.000006  About 6K Written
  89 0.000098  2020 > 33804 [ACK] Seq=98307681 Ack=89539709 Win=12212 Len=0
  90 0.000004  About 6K Written

My Server

  61 0.000091  About 16K Written
  62 0.039914  2020 > 33803 [ACK] Seq=4171193517 Ack=4164770646 Win=49149 Len=0
  63 0.000117  About 16K Written
  64 0.000207  2020 > 33803 [ACK] Seq=4171193517 Ack=4164786069 Win=49149 Len=0
  65 0.000118  Write Command Response
  66 0.000298  Write Command Request (64K)
  67 0.039258  2020 > 33803 [ACK] Seq=4171193568 Ack=4164802453 Win=49149 Len=0
  68 0.000087  About 16K Written
  69 0.039910  2020 > 33803 [ACK] Seq=4171193568 Ack=4164818837 Win=49149 Len=0
  70 0.000068  About 16K Written
  71 0.039947  2020 > 33803 [ACK] Seq=4171193568 Ack=4164835221 Win=49149 Len=0
  72 0.001828  About 16K Written
  73 0.000210  2020 > 33803 [ACK] Seq=4171193568 Ack=4164850644 Win=49149 Len=0
  74 0.000128  Write Command Response
  75 0.000312  Write Command Request (64K)
  76 0.037514  2020 > 33803 [ACK] Seq=4171193619 Ack=4164867028 Win=49149 Len=0
  77 0.000119  About 16K Written
  78 0.039873  2020 > 33803 [ACK] Seq=4171193619 Ack=4164883412 Win=49149 Len=0
  79 0.000071  About 16K Written
  80 0.039932  2020 > 33803 [ACK] Seq=4171193619 Ack=4164899796 Win=49149 Len=0
  81 0.000148  About 16K Written
  82 0.000204  2020 > 33803 [ACK] Seq=4171193619 Ack=4164915219 Win=49149 Len=0
  83 0.000115  Write Command Response
  84 0.000296  Write Command Request (64K)
  85 0.039239  2020 > 33803 [ACK] Seq=4171193670 Ack=4164931603 Win=49149 Len=0
  86 0.001288  About 16K Written
  87 0.038718  2020 > 33803 [ACK] Seq=4171193670 Ack=4164947987 Win=49149 Len=0
  88 0.000159  About 16K Written
  89 0.039830  2020 > 33803 [ACK] Seq=4171193670 Ack=4164964371 Win=49149 Len=0
  90 0.000071  About 16K Written

The initial write request has the first chunk (16K for My Server and 6K
for Fast Server). I writen that to the file and enter the code listing
above. Changing the size of buf in that loop has no effect. I don't
see why this is slower. It's just a trivial loop. There's something
fundamentally different about these socket modes or something. The
only difference on the wire that I can see is that the "Fast Server"
is receiving 6K chunks vs. 16K chunks with the "My Server". So I'm
getting 4 16K chunks and "Fast Server" is getting about 11 6K chunks
(but it's neither of us are piggie-backing the ACKs).

lsof doesn't report anything special in the FD column.

fastserver 3566 root 9u IPv4 3208591  TCP *:2020 (LISTEN)
myserver   3529 root 3u IPv4 3190923  TCP *:2020 (LISTEN)

Mike

 
 
 

Delayed Acks

Post by David Schwart » Tue, 01 Jul 2003 18:56:21




Quote:> Writen is just like in Steven's UNP. This takes about 8 seconds. The other
> server that I'm comparing against reads in the same file (from the same
> client program) in less than a second. This was all conducted on loopback
> on a 1.8GHz laptop. So I compared packet captures and what really jumps
> out at me is that the ACKs are capratively large (39.9ms vs 1.3ms):

    I don't understand what you're comparing to what. You have two different
server machines? Or two different server programs? Is this read/write loop
part of a server program?

    Measuring speed on a loopback interface is likely to give you more or
less meaningless numbers. Slower programs can seem to perform faster just
because there are fewer context switches.

    DS

 
 
 

Delayed Acks

Post by Lorinczy Zsigmon » Wed, 02 Jul 2003 02:30:14






>>>I'm working on some code that reads data from a client socket. When I
>>>compare my implementation to another it's performance is horrible. From
>>>looking at Ethereal I can see the ACKs are taking a loooong time. Each
>>>is sent about 399ms vs 19ms with the fast client. Besides the ACKs the
>>>other types of messages have much smaller response times like 14ms. The
>>>ACK times are huge. Why is this happening? I tried TCP_NODELAY without
>>>success (no difference). Any ideas?

>>    I can't understand what you're saying. You say the performance is
>>horrible. What do you mean by performance? Do you mean throughput?
>>Latency? Why are you investigating microscopic timing details? Why do
>>you care how long the ACKs take? Most likely, the program that sends the
>>data is doing something wrong, but it's hard to say because your
>>description is at too low a level.

> The client is basically writing a file of 1813612 bytes that the server
> just writes to disk. I'm reading it in like this:

>     while (tot > 0) {                    /* tot read in request is 64K */
>         unsigned char buf[8192];
>         n = MIN(8192, tot);
>         if ((n = read(sockfd, buf, n)) == -1 ||
>                     writen(fd, buf, n) == -1) {
>             return -1;
>         }
>         tot -= n;
>     }

You may try to play with setsockopt/getsockopt (SO_SNDBUF)
 
 
 

Delayed Acks

Post by James Antil » Wed, 02 Jul 2003 02:12:21






>>> I'm working on some code that reads data from a client socket. When I
>>> compare my implementation to another it's performance is horrible. From
>>> looking at Ethereal I can see the ACKs are taking a loooong time. Each
>>> is sent about 399ms vs 19ms with the fast client. Besides the ACKs the
>>> other types of messages have much smaller response times like 14ms. The
>>> ACK times are huge. Why is this happening? I tried TCP_NODELAY without
>>> success (no difference). Any ideas?

 TCP_NODELAY and TCP_CORK are both for write()'s, and depending on what
you want both may help in this case.

Quote:> The client is basically writing a file of 1813612 bytes that the server

 1_813_612 ... I presume the sender loops doing a write()?

Quote:> just writes to disk. I'm reading it in like this:

>     while (tot > 0) {                    /* tot read in request is 64K */

 This suggests that the sender is writting 64K/1_813_612 segments ?

Quote:>         unsigned char buf[8192];
>         n = MIN(8192, tot);
>         if ((n = read(sockfd, buf, n)) == -1 ||
>                     writen(fd, buf, n) == -1) {

 Given that the server is write()'ing directly to disk, have you measured
how long each request is taking (it's possible that you are blocking on
the disk).

Quote:>             return -1;
>         }
>         tot -= n;
>     }

[snip ... ]

Quote:> The initial write request has the first chunk (16K for My Server and 6K
> for Fast Server). I writen that to the file and enter the code listing
> above. Changing the size of buf in that loop has no effect. I don't
> see why this is slower. It's just a trivial loop. There's something
> fundamentally different about these socket modes or something. The

 Are the only differences the program listening on 2020 ?

Quote:> only difference on the wire that I can see is that the "Fast Server"
> is receiving 6K chunks vs. 16K chunks with the "My Server". So I'm
> getting 4 16K chunks and "Fast Server" is getting about 11 6K chunks
> (but it's neither of us are piggie-backing the ACKs).

 Again these numbers suggest 64K barriers, why aren't you sending as much
data as possible?

 As David said, you should check how different the results are over
ethernet.

--

Need an efficent and powerful string library for C?
http://www.and.org/vstr/

 
 
 

Delayed Acks

Post by Michael B Alle » Wed, 02 Jul 2003 04:53:38



>> The client is basically writing a file of 1813612 bytes that the server

>  1_813_612 ... I presume the sender loops doing a write()?

1,823,612 bytes and yes, that's what the server code below is doing.

Quote:>> just writes to disk. I'm reading it in like this:

>>     while (tot > 0) {                    /* tot read in request is 64K
>>     */

>  This suggests that the sender is writting 64K/1_813_612 segments ?

The client writes a command that has a header in which the payload is
described as 64K. My Server just reads in 8K chunks and writes it to
disk. I don't know what Fast Server is doing.

Quote:>>         unsigned char buf[8192];
>>         n = MIN(8192, tot);
>>         if ((n = read(sockfd, buf, n)) == -1 ||
>>                     writen(fd, buf, n) == -1) {

>  Given that the server is write()'ing directly to disk, have you
>  measured
> how long each request is taking (it's possible that you are blocking on
> the disk).

I don't think that should cause ACKs to be delayed. If My Server did
nothing the server should read in ~SO_RCVBUF before stopping ACKs. The
"delayed ACK" is described on p 203 of UNP.

Quote:>> The initial write request has the first chunk (16K for My Server and 6K
>> for Fast Server). I writen that to the file and enter the code listing
>> above. Changing the size of buf in that loop has no effect. I don't see
>> why this is slower. It's just a trivial loop. There's something
>> fundamentally different about these socket modes or something. The

>  Are the only differences the program listening on 2020 ?

Correct. There's "My Server" and "Fast Server" that are both listening
on port 2020.

Quote:>> only difference on the wire that I can see is that the "Fast Server" is
>> receiving 6K chunks vs. 16K chunks with the "My Server". So I'm getting
>> 4 16K chunks and "Fast Server" is getting about 11 6K chunks (but it's
>> neither of us are piggie-backing the ACKs).

>  Again these numbers suggest 64K barriers, why aren't you sending as
>  much
> data as possible?

I have no control over the client but it is sending a command with a
header that says it's sending 64K. That 64K follows but when I mention
"chunks" above I mean TCP frames. I don't know if the client is writing
all 64K at once.

Quote:>  As David said, you should check how different the results are over
> ethernet.

Ok. It's a little difficult to setup at the moment.

Q: When I call setsockopt to set TCP_NO_DELAY I do it on the client
socket returned immediately after accept() returns and I pass option
value of 1. Is this correct? Do I need to set TCP_NO_DELAY on the server
socket? Steven's UNP described delayed ACKs on p. 203-204. It meantions
TCP_NO_DELAY but it's no obvious to me that this should help in this case.

Mike

 
 
 

Delayed Acks

Post by David Schwart » Wed, 02 Jul 2003 05:27:39





> The client writes a command that has a header in which the payload is
> described as 64K. My Server just reads in 8K chunks and writes it to
> disk. I don't know what Fast Server is doing.

    There could be issues with the implementation of the client, though I'm
not sure I can really see how. The client is never waiting for any
acknowledgement from the server, right? It just keeps blasting data at it?

Quote:> >>         unsigned char buf[8192];
> >>         n = MIN(8192, tot);
> >>         if ((n = read(sockfd, buf, n)) == -1 ||
> >>                     writen(fd, buf, n) == -1) {
> >  Given that the server is write()'ing directly to disk, have you
> >  measured
> > how long each request is taking (it's possible that you are blocking on
> > the disk).
> I don't think that should cause ACKs to be delayed. If My Server did
> nothing the server should read in ~SO_RCVBUF before stopping ACKs. The
> "delayed ACK" is described on p 203 of UNP.

    How many bytes are you seeing in each read/writen?

    The thing is, suppose we assume that you're blocking in writen. That
would mean that by the time you get back to 'read', there should be lots of
data waiting for you. In that case, your next writen should be much more
efficient.

    Maybe something is mucking with the receive buffer size? Does the code
that does the read/writen ever muck with receive buffer sizes? Because you
often can't put it back.

    DS

 
 
 

Delayed Acks

Post by Michael B Alle » Wed, 02 Jul 2003 17:22:57



>> The client writes a command that has a header in which the payload is
>> described as 64K. My Server just reads in 8K chunks and writes it to
>> disk. I don't know what Fast Server is doing.

>     There could be issues with the implementation of the client, though
>     I'm
> not sure I can really see how. The client is never waiting for any
> acknowledgement from the server, right? It just keeps blasting data at
> it?

Yup.

I tried all of this over ethernet like you had suggested. The ACK times
are way down to about 0.14ms. Fast Server still looks a hair faster with
ACK times of about 0.11ms but I hardly did enough testing to attribute
any statistical significance to these numbers. Overall write speed was
basically the same with 974 KB/sec with My Server vs. 997 KB/sec with Fast
Server. I was seeing ~500 KB/sec over loopback with My Server.

I also tried another test. I removed the write in writen so that it
simply returned immediately. No difference over ethernet or loopback.

Quote:>> >>         unsigned char buf[8192];
>> >>         n = MIN(8192, tot);
>> >>         if ((n = read(sockfd, buf, n)) == -1 ||
>> >>                     writen(fd, buf, n) == -1) {

>> >  Given that the server is write()'ing directly to disk, have you
>> >  measured
>> > how long each request is taking (it's possible that you are blocking
>> > on the disk).

>> I don't think that should cause ACKs to be delayed. If My Server did
>> nothing the server should read in ~SO_RCVBUF before stopping ACKs. The
>> "delayed ACK" is described on p 203 of UNP.

>     How many bytes are you seeing in each read/writen?

In My Server it's 16K. I don't know what it is in Fast Server. I don't
even know how Fast Server works. But 16K is the size of the frames
on the wire. The size of the frames on the wire with Fast Server were
6K. This is the part that actually wigs me out. How are the TCP frame
sizes negotiated?

Quote:>     The thing is, suppose we assume that you're blocking in writen. That
> would mean that by the time you get back to 'read', there should be lots
> of data waiting for you. In that case, your next writen should be much
> more efficient.

>     Maybe something is mucking with the receive buffer size? Does the
>     code
> that does the read/writen ever muck with receive buffer sizes? Because
> you often can't put it back.

Actualy I was trying different sizes but if I take out those setsockopt
calls the result is no different. The results I reported initially were
with options (determined using getsockopt):

TCP_NODELAY: 1
IPTOS_LOWDELAY: 0
IPTOS_THROUGHPUT: 0
SO_SNDBUF: 131070
SO_RCVBUF: 131070
SO_SNDLOWAT: 1
SO_RCVLOWAT: 1
SO_SNDTIMEO: 0
SO_RCVTIMEO: 0

If I do not set SO_{SND,RCV}BUF to 0xFFFF I see:

SO_SNDBUF: 50532
SO_RCVBUF: 87808

I also tried smaller buffers but didn't get the effect I was looking for.

I give up! My Server doesn't work over loopback. It's offical.

Mike

 
 
 

Delayed Acks

Post by dbti » Wed, 02 Jul 2003 21:23:37


Quote:

>>    How many bytes are you seeing in each read/writen?

> In My Server it's 16K. I don't know what it is in Fast Server. I don't
> even know how Fast Server works. But 16K is the size of the frames
> on the wire. The size of the frames on the wire with Fast Server were
> 6K. This is the part that actually wigs me out. How are the TCP frame
> sizes negotiated?

You can't have 16K frames on the wire on an Ethernet.
The largest data buffer size is 1500 bytes.

Are you talking about something else?

 
 
 

Delayed Acks

Post by Michael B Alle » Thu, 03 Jul 2003 04:53:24



>>>    How many bytes are you seeing in each read/writen?

>> In My Server it's 16K. I don't know what it is in Fast Server. I don't
>> even know how Fast Server works. But 16K is the size of the frames on
>> the wire. The size of the frames on the wire with Fast Server were 6K.
>> This is the part that actually wigs me out. How are the TCP frame sizes
>> negotiated?

> You can't have 16K frames on the wire on an Ethernet. The largest data
> buffer size is 1500 bytes.

> Are you talking about something else?

That was over loopback. I should not have said 'on the wire'. The
ethernet frames are indeed 1500. This whole problem is describing
loopback. Ethernet does not exhibit the same behavior. So it's no
big deal...

Mike

 
 
 

Delayed Acks

Post by David Schwart » Thu, 03 Jul 2003 11:15:03




Quote:> That was over loopback. I should not have said 'on the wire'. The
> ethernet frames are indeed 1500. This whole problem is describing
> loopback. Ethernet does not exhibit the same behavior. So it's no
> big deal...

    Loopback can be a strange beast. At least you know it's a
loopback-specific problem. It could be an issue with the particular OS build
you're using. It could also be something you can hack around by mucking with
socket buffer sizes.

    DS

 
 
 

Delayed Acks

Post by dbti » Thu, 03 Jul 2003 20:51:35




>>>>   How many bytes are you seeing in each read/writen?

>>>In My Server it's 16K. I don't know what it is in Fast Server. I don't
>>>even know how Fast Server works. But 16K is the size of the frames on
>>>the wire. The size of the frames on the wire with Fast Server were 6K.
>>>This is the part that actually wigs me out. How are the TCP frame sizes
>>>negotiated?

>>You can't have 16K frames on the wire on an Ethernet. The largest data
>>buffer size is 1500 bytes.

>>Are you talking about something else?

> That was over loopback. I should not have said 'on the wire'. The
> ethernet frames are indeed 1500. This whole problem is describing
> loopback. Ethernet does not exhibit the same behavior. So it's no
> big deal...

> Mike

Yeah, I saw that most of what you were doing was loopback, but I
thought maybe you'd actually tested in between two machines.