> > > > > > > > > Client and Server OS: SuSE 9.3 Pro 2.6.11 default kernel, neither
> > > > > > > > > machine patched after install from CDs.
> > > > > > > > > Communicating over: GigE LAN
> > > > > > > > > 1) Server (receiver) is consistently advertising a TCP RWIN of 32K.
> > > > > > > > > 2) Server consistently has TCP RECV-Q of 0.
> > > > > > > > > 3) Client (sender) consistently shows a TCP SEND-Q of 80K.
> > > > > > > > > 4) Socket is up and connection is ESTABLISHED from both sides.
> > > > > > > > > 5) No data is transmitted.
> > > > > > > > > To troubleshoot, I've torn down and re-established the connection
> > > > > > > > > countless times. There may be a trickle of data initially, but
> > > > > > > > > within a few seconds the client SEND-Q builds and transmission stops.
> > > > > > > > > Receiver's window size never goes below 32K.
> > > > > > > > > I've never seen this kind of behavior before. If the server process
> > > > > > > > > were slow, I'd expect to see a RECV-Q buildup to go with the big SEND-Q.
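For anyone trying to reproduce this: the Send-Q/Recv-Q numbers netstat reports
can also be sampled from inside the process itself. A minimal sketch, assuming
Linux, where SIOCOUTQ/SIOCINQ are defined in linux/sockios.h as the same ioctl
numbers as termios.TIOCOUTQ/termios.FIONREAD:

    # Sample a connected TCP socket's kernel queues -- the same counters
    # netstat shows as Send-Q and Recv-Q. Linux-only.
    import fcntl
    import struct
    import termios

    def tcp_queues(sock):
        """Return (send_q, recv_q) in bytes for a connected TCP socket."""
        out = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
        inq = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
        return struct.unpack("i", out)[0], struct.unpack("i", inq)[0]

On the stalled client, send_q should climb to the ~80K ceiling and sit there;
on the server, recv_q should stay at 0, matching the netstat output above.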
> > > > > > > > I would use tcpdump or wireshark/ethereal to see what is being sent and
> > > > > > > > received.
> > > > > > > How do you think I know the TCP RWIN size for the server?
> > > > > > So, what does tcpdump show you before the transfer stops?
> > > > > Looks like we're getting a lot of retransmissions, which I believe
> > > > > would clearly explain the sender's full TCP SEND-Q, the receiver's
> > > > > empty TCP RECV-Q, and the receiver's normal receive window size.
> > > > > (Relative ACKs used for clarity).
> > > > []
> > > > The final six TCP segments show that the last two 1448-byte segments
> > > > remain unacknowledged. The sender's TCP stack must retransmit those
> > > > segments, as happened in the earlier packet dump. The only explanation
> > > > I can think of for why we cannot see retransmissions in this dump is
> > > > that the retransmissions never reach the network interface's egress
> > > > queue. That queue fills up when the network is congested or when the
> > > > driver is misbehaving. You may want to check the ifconfig output,
> > > > particularly whether the errors, dropped, and collisions fields are
> > > > non-zero for the related interface.
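A sketch of that check, assuming the standard two-header-line /proc/net/dev
layout of 2.6 kernels (these are the same counters ifconfig prints):

    # Pull the error/drop/collision counters for one interface out of
    # /proc/net/dev. Column layout assumed: 8 RX fields then 8 TX fields
    # (bytes packets errs drop fifo frame compressed multicast / bytes
    # packets errs drop fifo colls carrier compressed).
    def iface_counters(name="eth0"):
        with open("/proc/net/dev") as f:
            for line in f.readlines()[2:]:       # skip the two header lines
                iface, _, rest = line.partition(":")
                if iface.strip() != name:
                    continue
                fields = [int(x) for x in rest.split()]
                return {"rx_errs": fields[2], "rx_drop": fields[3],
                        "tx_errs": fields[10], "tx_drop": fields[11],
                        "colls": fields[13]}
        raise ValueError("no such interface: %s" % name)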
> > > I've been checking /proc/net/dev regularly already. In sum, hundreds
> > > of thousands of packets out that interface, 34 errors and 0 drops.
> > > > Does send()/write() in the client return an error, and if so, what
> > > > is the errno value?
> > > I log if write() to that socket returns <= 0, and haven't seen any
> > > log messages for that event. I'll double-check that to make sure
> > > it's unlikely that a log message is getting lost.
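A sketch of what that logging could look like, assuming Python and a
hypothetical send_all() wrapper. Note that with a blocking socket, write()
won't return an error when the send buffer fills; it simply blocks, which
would be consistent with a full SEND-Q and no log messages:

    # Wrap the client's writes so partial sends and errno values are never
    # silently lost. send_all() and the logger name are illustrative.
    import errno
    import logging

    log = logging.getLogger("sender")

    def send_all(sock, data):
        total = 0
        while total < len(data):
            try:
                n = sock.send(data[total:])
            except OSError as e:
                if e.errno == errno.EINTR:
                    continue                  # interrupted by a signal; retry
                log.error("send() failed: errno=%d (%s)", e.errno, e.strerror)
                raise
            if n == 0:
                log.error("send() returned 0; peer likely closed the socket")
                raise OSError(errno.EPIPE, "zero-length send")
            total += n
        return total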
> > > I'll also push to have the patches made current, just in case the
> > > driver is a factor and there's already a fix for it.
> > I've recently noticed that, even when the throughput is fine, the
> > sender's receive window hovers around 12. I take that to mean the
> > TCP input buffer is consistently well-loaded.
> You probably have window scaling enabled by default, so 12 should be
> scaled (12 * 2 ^ wscale). http://tools.ietf.org/html/rfc1323#section-2
> The sender does not receive any data except ACKs, so I'm not sure its
> receive buffer size is relevant.
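The arithmetic, as a quick sketch (the wscale of 8 below is only a guess;
the negotiated value appears in the SYN/SYN-ACK options of the capture):

    # RFC 1323 window scaling: the 16-bit window field in each segment is
    # shifted left by the wscale option exchanged at connection setup.
    def effective_window(raw_window, wscale):
        return raw_window << wscale            # raw * 2**wscale

    print(effective_window(12, 8))             # raw 12, wscale 8 -> 3072 bytes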
The sender has two of its four interfaces in promiscuous mode, and is
getting hit pretty hard 24x7 (i.e., 240+ million packets-in on eth0
alone, in the past six hours). Most of that is TCP.
> > Would another explanation for the lack of retransmissions in the dump
> > be that the ACKs are delayed through the TCP input buffer?
> > I've had the net.core.rmem_max and net.core.wmem_max set to 16MB for
> > a few weeks now. Maybe that's too big?
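For the record, those limits can be confirmed straight from /proc/sys (the
same values "sysctl net.core.rmem_max" reports); a minimal sketch:

    # Read the current socket-buffer limits the kernel is actually using.
    for name in ("rmem_max", "wmem_max", "rmem_default", "wmem_default"):
        with open("/proc/sys/net/core/%s" % name) as f:
            print(name, "=", f.read().strip())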
> I would try using another network card / driver to see if the hardware
> and the driver are good.
I'm heading toward that course of action. It's complicated a little by
the two interfaces in promisc mode being on an optical GigE card,
which was supplied after-market by our hardware vendor and
retrofitted by Solaris SAs. I'll have to get a replacement (not
difficult) and updated drivers (if any, not difficult) and then get
into a remote production datacenter and perform surgery (logistically
messy, but not difficult).
I greatly appreciate every single thing you've offered for help.