close_wait states

Post by Robin Tutt » Sat, 18 Nov 2000 04:00:00



Folks,

I am having a problem with CLOSE_WAIT states lingering on my Tru64
Unix v4.0F system.  We are running Oracle Application Server v8.0.5.
This is causing a problem with all of the sockets being used, so that
no one can connect to the Web server.  What specific Unix parameters
should I be looking at?  Any recommendations?

Many thanks in advance!

Robin Tuttle
University of New Hampshire


 
 
 

Post by Barry Margoli » Sat, 18 Nov 2000 04:00:00




>I am having a problem with CLOSE_WAIT states lingering on my Tru64
>Unix v4.0F system.  We are running Oracle Application Server v8.0.5.
>This is causing a problem with all of the sockets being used, so that
>no one can connect to the Web server.  What specific Unix parameters
>should I be looking at?  Any recommendations?

This has nothing to do with Unix parameters.  CLOSE_WAIT means that the
client has closed its end of the connection, but the server hasn't yet
closed its end.  If these persist for a long time, it's a bug in the server
software.

In the case of a database, one way that these could occur without requiring
a bug is if a client submits a query that takes a long time to complete,
and he gets impatient waiting for a response so he cancels it.  That could
close the connection, but the server won't notice this until it finishes
processing the query and tries to send the results back.  But if you're
getting lots of connections like this, it can't be explained by a handful
of aborted queries.
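
A minimal sketch of the server-side behaviour described above, in Python. The thread contains no code of its own, so the function name `drain_and_close` is hypothetical. It only illustrates that a zero-length read marks the arrival of the client's FIN, and that the socket then sits in CLOSE_WAIT until the server itself calls close().

```python
import socket


def drain_and_close(conn):
    """Read from conn until the peer closes its end, then close ours.

    recv() returning b"" means the peer's FIN has arrived, so the local
    socket is now in CLOSE_WAIT. It stays there until close() is called.
    """
    chunks = []
    try:
        while True:
            data = conn.recv(4096)
            if not data:          # peer sent FIN: we are in CLOSE_WAIT
                break
            chunks.append(data)
    finally:
        conn.close()              # sends our FIN: CLOSE_WAIT -> LAST_ACK
    return b"".join(chunks)
```

A server that skips the close() on some code path (an early return, an unhandled exception) is the kind of bug that would leave CLOSE_WAIT entries piling up.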

--

Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

 
 
 

Post by netc » Sun, 19 Nov 2000 04:00:00


Hi!

>This has nothing to do with Unix parameters.  CLOSE_WAIT means that the
>client has closed its end of the connection, but the server hasn't yet
>closed its end.  If these persist for a long time, it's a bug in the server
>software.

Does it really? I worked with a cache machine running Linux 2.2 with 4000
clients accessing it, and I also reached the maximum number of open sockets,
so I changed some of the /proc/sys/net/ipv4/ parameters to lower the TIME_WAIT
timeout. I also raised TCP_MAX_SYN_BACKLOG. These changes made a radical
difference in performance. I have never worked with Tru64, though, and I could
be wrong; maybe this doesn't affect the CLOSE_WAIT state of TCP.
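
For reference, a small sketch of inspecting such a tunable from a script. It assumes a Linux host, where kernel parameters appear as files under /proc/sys. The helper name `read_sysctl` and its `root` parameter are hypothetical, and none of this applies to Tru64.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of a dotted sysctl name as a string.

    "net.ipv4.tcp_max_syn_backlog" maps to the file
    <root>/net/ipv4/tcp_max_syn_backlog on Linux.
    """
    path = Path(root).joinpath(*name.split("."))
    return path.read_text().strip()
```

On a Linux machine, `read_sysctl("net.ipv4.tcp_max_syn_backlog")` returns the current backlog setting. These values affect connection setup and teardown timers, not sockets that an application has failed to close.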

Best regards,
Martin Svensson
System Administrator            Netch Technologies AB
Tel: +46-(0)46-2724046

------------Your-planning-failures-are-not-my-emergencies---------------

 
 
 

Post by Jefferson Ogat » Sun, 19 Nov 2000 04:00:00



> Hi!

[ you were quoting Barry Margolin here; please retain attributions. ]

> >This has nothing to do with Unix parameters.  CLOSE_WAIT means that the
> >client has closed its end of the connection, but the server hasn't yet
> >closed its end.  If these persist for a long time, it's a bug in the server
> >software.

> Does it really? I worked with a cache machine running Linux 2.2 with 4000
> clients accessing it, and I also reached the maximum number of open sockets,
> so I changed some of the /proc/sys/net/ipv4/ parameters to lower the TIME_WAIT
> timeout. I also raised TCP_MAX_SYN_BACKLOG. These changes made a radical
> difference in performance. I have never worked with Tru64, though, and I could
> be wrong; maybe this doesn't affect the CLOSE_WAIT state of TCP.

CLOSE_WAIT is a completely stable state, and properly shouldn't be considered a
bug, nor should it have a timeout. It simply means the incoming data stream has
been shut down, while the outgoing data stream remains open. FIN_WAIT_2 is the
corresponding stable state for the other end of the connection (FIN_WAIT_1 is
held only until the FIN has been acknowledged).

That said, it *usually* does imply a bug somewhere, since there are few
services that operate in this mode. But don't assume just from the fact that a
socket is in CLOSE_WAIT or FIN_WAIT_2 that something is wrong. E.g. a logging
service might accept TCP connections and shut down its outgoing stream, leaving
the client's socket in a perpetual CLOSE_WAIT state; the client may continue to
transmit data to the server in this state indefinitely.
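
The logging-service case can be sketched on the loopback interface. This is only an illustration in Python; `half_close_demo` is a hypothetical name, and the "server" and "client" are two sockets in one process rather than real services.

```python
import socket


def half_close_demo():
    """Show that a socket in CLOSE_WAIT can keep sending.

    Returns (what the client read after the server's half-close,
             what the server received afterwards).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # loopback, ephemeral port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        # Server shuts down only its outgoing stream (this sends a FIN).
        server.shutdown(socket.SHUT_WR)
        # Client sees end-of-stream; its socket is now in CLOSE_WAIT.
        eof = client.recv(1024)
        # The client's outgoing stream is still open and usable.
        client.sendall(b"log line\n")
        received = server.recv(1024)
        return eof, received
    finally:
        client.close()
        server.close()
        listener.close()
```

Running netstat between the shutdown() and the final close() calls would show one end of the connection in CLOSE_WAIT while data is still flowing, which is legitimate.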

--
Jefferson Ogata : Internetworker, Antibozo


 
 
 

Post by Jefferson Ogat » Sun, 19 Nov 2000 04:00:00



> Folks,

> I am having a problem with CLOSE_WAIT states lingering on my Tru64
> Unix v4.0F system.  We are running Oracle Application Server v8.0.5.
> This is causing a problem with all of the sockets being used, so that
> no one can connect to the Web server.  What specific Unix parameters
> should I be looking at?  Any recommendations?

Is there a packet-filtering device between the OAS machine and the clients
named in the remote address of the sockets left in CLOSE_WAIT? If so, it may be
dropping packets critical to the socket shutdown negotiation.

If you run netstat on the client machine, what state is the corresponding entry
left in? Is there anything in the SendQ or RecvQ for the socket at either end?
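
To get a quick overview before looking at individual sockets, the netstat output can be tallied by state. The sketch below is Python and makes one loud assumption: that the TCP state is the last whitespace-separated field on each line, as in plain `netstat -an` output on many systems. Column layout differs between platforms, so treat `count_tcp_states` (a hypothetical helper) as a starting point only.

```python
from collections import Counter

# State names as printed by various netstat implementations.
TCP_STATES = {
    "LISTEN", "SYN_SENT", "SYN_RCVD", "SYN_RECV", "ESTABLISHED",
    "FIN_WAIT_1", "FIN_WAIT_2", "FIN_WAIT1", "FIN_WAIT2",
    "CLOSE_WAIT", "CLOSING", "LAST_ACK", "TIME_WAIT", "CLOSED",
}


def count_tcp_states(netstat_output):
    """Count sockets per TCP state in the text of `netstat -an`.

    Lines whose last field is not a known state (headers, UDP
    entries, blank lines) are ignored.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts
```

A large CLOSE_WAIT count with empty queues points at the application not closing; a non-empty SendQ on those entries points at data the server cannot get delivered.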

--
Jefferson Ogata : Internetworker, Antibozo


 
 
 

Post by Barry Margoli » Tue, 21 Nov 2000 04:00:00




>That said, it *usually* does imply a bug somewhere, since there are few
>services that operate in this mode.

CLOSE_WAIT is to TCP sockets as <defunct> is to processes: they're part of
the normal scheme of things, and there's nothing that automatically cleans
them up, but if a process leaves lots of them around, it probably indicates
a bug in the process.  And in both cases, killing the process that spawned
them will cause them to go away.


 
 
 

Post by Barry Margoli » Tue, 21 Nov 2000 04:00:00





>> Folks,

>> I am having a problem with CLOSE_WAIT states lingering on my Tru64
>> Unix v4.0F system.  We are running Oracle Application Server v8.0.5.
>> This is causing a problem with all of the sockets being used, so that
>> no one can connect to the Web server.  What specific Unix parameters
>> should I be looking at?  Any recommendations?

>Is there a packet-filtering device between the OAS machine and the clients
>named in the remote address of the sockets left in CLOSE_WAIT? If so, it may be
>dropping packets critical to the socket shutdown negotiation.

I don't think so.  The process goes into CLOSE_WAIT state as a result of
receiving a FIN segment from the remote machine, so obviously that FIN
wasn't dropped.  When the process calls close(), the socket changes from
CLOSE_WAIT to LAST_ACK state, and a FIN is sent.  If the FIN or the
corresponding ACK is dropped the socket will hang in this state, not
CLOSE_WAIT.
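
The sequence just described can be written down as a tiny transition table. This is a simplified sketch in Python covering only the passive-close path, with state names as in RFC 793; the event labels and function names are made up for illustration.

```python
# Passive-close path of the TCP state machine (simplified).
PASSIVE_CLOSE = {
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # peer closed its end
    ("CLOSE_WAIT", "app close"): "LAST_ACK",    # close() called; our FIN sent
    ("LAST_ACK", "recv ACK"): "CLOSED",         # peer acknowledged our FIN
}


def next_state(state, event):
    """Return the next state, or the same state if the event does not apply."""
    return PASSIVE_CLOSE.get((state, event), state)


def run(events, state="ESTABLISHED"):
    """Apply a sequence of events, starting from ESTABLISHED by default."""
    for event in events:
        state = next_state(state, event)
    return state
```

In this table the only way out of CLOSE_WAIT is the application's own close(), so a lost FIN or ACK can strand a socket in LAST_ACK but not in CLOSE_WAIT.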


 
 
 

Post by Jefferson Ogat » Tue, 21 Nov 2000 04:00:00





> >That said, it *usually* does imply a bug somewhere, since there are few
> >services that operate in this mode.

> CLOSE_WAIT is to TCP sockets as <defunct> is to processes: they're part of
> the normal scheme of things, and there's nothing that automatically cleans
> them up, but if a process leaves lots of them around, it probably indicates
> a bug in the process.  And in both cases, killing the process that spawned
> them will cause them to go away.

Well, on Linux 2.2 at least, I've seen sockets stuck in CLOSE_WAIT with data in
the SendQ hang around indefinitely after the associated process is killed,
thereby preventing the service in question from being restarted by holding onto
the local port number. So killing the process isn't always sufficient to get
rid of them. I don't recall seeing this on Unix hosts, but I wouldn't be at all
surprised.

--
Jefferson Ogata : Internetworker, Antibozo


 
 
 

Post by Jefferson Ogat » Tue, 21 Nov 2000 04:00:00






> >> Folks,

> >> I am having a problem with CLOSE_WAIT states lingering on my Tru64
> >> Unix v4.0F system.  We are running Oracle Application Server v8.0.5.
> >> This is causing a problem with all of the sockets being used, so that
> >> no one can connect to the Web server.  What specific Unix parameters
> >> should I be looking at?  Any recommendations?

> >Is there a packet-filtering device between the OAS machine and the clients
> >named in the remote address of the sockets left in CLOSE_WAIT? If so, it may be
> >dropping packets critical to the socket shutdown negotiation.

> I don't think so.  The process goes into CLOSE_WAIT state as a result of
> receiving a FIN segment from the remote machine, so obviously that FIN
> wasn't dropped.  When the process calls close(), the socket changes from
> CLOSE_WAIT to LAST_ACK state, and a FIN is sent.  If the FIN or the
> corresponding ACK is dropped the socket will hang in this state, not
> CLOSE_WAIT.

Consider the following scenario:

Client is coming through a stateful packet filter that has an idle timeout, or
a timeout on CLOSE_WAIT and FIN_WAIT states. Client connects to OAS and
transmits a query, then shuts down its write side. Server is now in CLOSE_WAIT.
OAS goes off for a few minutes cranking away on the query, during which time
the packet filter's timeout kicks off and drops the connection from its state
table. The server finishes the query and proceeds to transmit a full window of
response to the client. This forces it to block waiting for acknowledgements
that will never come, because the packet filter is now dropping the server's
traffic. Thus, the server is stuck in CLOSE_WAIT and unable to reach the code
that calls close() or shutdown().

This is why I asked the OP to check the SendQ.
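
One way a server could protect itself against this scenario is to bound how long a send may block, so that it still reaches its close() call. The following is a sketch in Python under that assumption; `send_with_deadline` is a hypothetical helper, and nothing here describes how OAS actually behaves.

```python
import socket


def send_with_deadline(conn, payload, seconds):
    """Try to send payload; give up and close after `seconds`.

    Returns True if everything was handed to the kernel, False if the
    send stalled (for example because acknowledgements never arrive)
    and the socket was closed instead of blocking forever.
    """
    conn.settimeout(seconds)      # bounds the total time sendall() may take
    try:
        conn.sendall(payload)
        return True
    except socket.timeout:
        conn.close()              # leave CLOSE_WAIT rather than hang in send
        return False
```

The timeout value is a policy decision: it has to be longer than any legitimate stall, such as a slow client, but short enough that stuck connections do not exhaust the socket table.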

--
Jefferson Ogata : Internetworker, Antibozo


 
 
 

Post by Andrew Mo » Wed, 22 Nov 2000 15:25:13




> I am having a problem with CLOSE_WAIT states lingering on my Tru64
> Unix v4.0F system.  We are running Oracle Application Server v8.0.5.
> This is causing a problem with all of the sockets being used, so that
> no one can connect to the Web server.  What specific Unix parameters
> should I be looking at?  Any recommendations?

We're running a similar setup here and have had the same problems.  We've
changed somaxconn and sominconn to 32767 (in the socket subsystem) and
tcp_keepalive_default=1 and tcp_keepidle=1200 in the inet subsystem.  All
these can be changed via dxkerneltuner.  See the online docs (via
www.tru64.org) for more information.
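
The keepalive settings above are system-wide defaults. An application can also request keepalives on an individual socket, roughly as in this Python sketch. `enable_keepalive` is a hypothetical helper; the default of 1200 merely echoes the number in this post, and the units of the Tru64 tunable may differ from the seconds used by `TCP_KEEPIDLE` here.

```python
import socket


def enable_keepalive(sock, idle_seconds=1200):
    """Turn on TCP keepalive probes for one socket.

    SO_KEEPALIVE is widely available. TCP_KEEPIDLE (idle time before
    the first probe) exists on Linux but not on every platform, so it
    is applied only when the constant is defined.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
```

Keepalives only let the kernel detect a dead peer and report an error on the socket; the application still has to notice that error and close the descriptor.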

Regards,

Andrew
--

         Unix Environment Specialist, Information Technology Services
                    La Trobe University, Bundoora

 
 
 

Post by Anthony W. Youngma » Wed, 22 Nov 2000 04:00:00


I'm sure I've seen terminal servers stuck in the CLOSE_WAIT state. Real
bummer if it's the printer port and none of the monkeys cares to let you
know for a day that the printer isn't working...
-----Original Message-----

Posted At: 21 November 2000 02:40
Posted To: admin
Conversation: close_wait states
Subject: Re: close_wait states




> >That said, it *usually* does imply a bug somewhere, since there are few
> >services that operate in this mode.

> CLOSE_WAIT is to TCP sockets as <defunct> is to processes: they're part of
> the normal scheme of things, and there's nothing that automatically cleans
> them up, but if a process leaves lots of them around, it probably indicates
> a bug in the process.  And in both cases, killing the process that spawned
> them will cause them to go away.

Well, on Linux 2.2 at least, I've seen sockets stuck in CLOSE_WAIT with data in
the SendQ hang around indefinitely after the associated process is killed,
thereby preventing the service in question from being restarted by holding onto
the local port number. So killing the process isn't always sufficient to get
rid of them. I don't recall seeing this on Unix hosts, but I wouldn't be at all
surprised.

--
Jefferson Ogata : Internetworker, Antibozo



 
 
 

Post by Barry Margoli » Wed, 22 Nov 2000 04:00:00




>Well, on Linux 2.2 at least, I've seen sockets stuck in CLOSE_WAIT with data in
>the SendQ hang around indefinitely after the associated process is killed,
>thereby preventing the service in question from being restarted by holding onto
>the local port number. So killing the process isn't always sufficient to get
>rid of them. I don't recall seeing this on Unix hosts, but I wouldn't be at all
>surprised.

It actually makes sense.  If the send window is closed, the socket has to
stick around so it can keep retransmitting the data.
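
If throwing away the queued data is acceptable, a commonly used way to avoid this lingering is an abortive close: setting SO_LINGER with a zero timeout before close() makes the kernel reset the connection and discard unsent data instead of retransmitting it. The sketch below is Python; `set_abortive_close` is a hypothetical name, and the packed layout assumes `struct linger` is two C ints, as it is on Linux.

```python
import socket
import struct


def set_abortive_close(sock):
    """Arrange for close() to reset the connection and drop unsent data.

    l_onoff=1 with l_linger=0 asks the kernel to abort the connection
    on close rather than keep the socket around to deliver the SendQ.
    """
    linger = struct.pack("ii", 1, 0)   # (l_onoff, l_linger)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
```

This trades reliability for cleanup: the peer sees a reset rather than an orderly shutdown, so it suits only connections whose remaining data is known to be undeliverable or unimportant.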


 
 
 
