tcp-ip using sockets: possible Bug ?

Post by Thomas Fett » Sat, 08 Jul 1995 04:00:00



Hello Net,

I have a strange problem with a TCP connection on
Solaris 2.4 workstations (SPARC 10).
I open the connection on one side (eliasd) with the following
sequence:

  elias_socket=socket(AF_INET,SOCK_STREAM,0);
  setsockopt(basis_socket,SOL_SOCKET,SO_REUSEADDR,(char*)&one,sizeof(one));
  setsockopt(elias_socket,SOL_SOCKET,SO_LINGER,(char*)&one,sizeof(one));

  memset(&demon,'\0',sizeof(demon));
  demon.sin_port = htonl(ELIAS_PORT);

  bind(basis_socket,(struct sockaddr*)&demon,sizeof(demon));
  listen(basis_socket,5);

(I removed the error checking from the code.) Then I accept on the socket:

  s=accept(elias_socket,0,0);

The other side (client) does this:

  demon = socket(AF_INET,SOCK_STREAM,0);
  memset(&demon_addr,'\0',sizeof(struct sockaddr_in));
  hp = gethostbyname(host);
  memcpy((char*)&demon_addr.sin_addr,hp->h_addr_list[0],hp->h_length);
  demon_addr.sin_family = hp->h_addrtype;
  demon_addr.sin_port = htonl(ELIAS_PORT);

  result=connect(demon,&demon_addr,sizeof(demon_addr));

After the connect, the accept returns and I start sending bytes from
demon to elias_socket. It all seems to work, except when eliasd and the client
are running on the same Solaris 2.4 machine: the first write from the client
succeeds, but 5 bytes are missing from the second write. This does not happen
when eliasd and the client are on different machines (be it SunOS 4 or
Solaris). Any ideas, anyone?

Second question: how do I put a timeout on my accept(), so that after a
certain amount of time accept() returns and I can do some housekeeping?
Do I have to select() on elias_socket?

It would be nice to hear from you.

                Thomas Fettig
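
A few details in the posted server code are worth checking regardless of the missing bytes, and this note does not claim that they cause them. sin_port is a 16-bit field, so it wants htons() rather than htonl(); on a big-endian machine such as a SPARC both calls leave the value unchanged, so the mismatch goes unnoticed there. SO_LINGER expects a struct linger rather than an int, sin_family is never set on the server side, and the names basis_socket and elias_socket are mixed. A minimal sketch of the setup with those points addressed follows; the helper name make_listener and the 5-second linger value are assumptions made for illustration, not part of the original program.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP listening socket on `port`
 * (given in host byte order). Returns the descriptor, or -1 on error. */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    struct linger lg;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* SO_REUSEADDR takes an int flag. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, (char *)&one, sizeof(one)) < 0)
        goto fail;

    /* SO_LINGER takes a struct linger, not an int. */
    lg.l_onoff = 1;
    lg.l_linger = 5;  /* seconds; arbitrary value chosen for this sketch */
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, (char *)&lg, sizeof(lg)) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* 32-bit address: htonl */
    addr.sin_port = htons(port);               /* 16-bit port: htons */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;
    if (listen(fd, 5) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

With this helper the server would call make_listener(ELIAS_PORT) once and then accept() on the returned descriptor. The client side needs the same htons() change for demon_addr.sin_port.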

 
 
 

Post by Sand » Sun, 09 Jul 1995 04:00:00




>>Hello Net,
>>
>>I've got one strange problem with a tcp-connection on
>>solaris 2.4 workstations (sparc 10).

>.. text removed to save space

>Yes, you use select for a timeout, but a timeout would probably not be needed,
>since a pending connection shows up as input on your server socket. Once it is
>accepted you have two active sockets: the server socket and the client connection.

>Remember that sockets are considered slow I/O devices and can and do
>return less than the requested number of bytes on a recv/read.
>You must know (via data in the message) how much is coming and keep reading
>until you have it all.

But just in case you do want a timeout, you can call poll() or
select() on the listening socket before the accept().



 
 
 

Post by Skip Doole » Sun, 09 Jul 1995 04:00:00


>Hello Net,
>
>I've got one strange problem with a tcp-connection on
>solaris 2.4 workstations (sparc 10).

.. text removed to save space

Yes, you use select for a timeout, but a timeout would probably not be needed,
since a pending connection shows up as input on your server socket. Once it is
accepted you have two active sockets: the server socket and the client connection.

Remember that sockets are considered slow I/O devices and can and do
return less than the requested number of bytes on a recv/read.
You must know (via data in the message) how much is coming and keep reading
until you have it all.

Skip Dooley
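
The "keep reading until you have it all" advice is usually written as a small loop. The sketch below is one way to do it; the helper name read_full is an assumption for this example, and the caller is assumed to know the expected length already (for instance from a fixed-size header, which the original program may or may not have).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `len` bytes from fd unless EOF or an
 * error intervenes. Returns the number of bytes read (less than len only
 * if the peer closed the connection first), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: just retry */
            return -1;
        }
        if (n == 0)
            break;          /* end of file: peer closed the connection */
        got += (size_t)n;   /* short read: loop for the remainder */
    }
    return (ssize_t)got;
}
```

A receiver that expects, say, a 20-byte message would call read_full(s, buf, 20) and treat any other return value as a closed connection or an error, instead of assuming that one read() returns one write()'s worth of data.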

 
 
 

1. Possible bug in TCP/IP stuff of kernel (0.99p5 on up).

I think I may have found a bug in the tcp/ip stuff of the kernel.  It was
very quickly discovered today when I upgraded my machine from 0.99p3 to
0.99p5!  When I booted with the new version of the kernel, the network was
flooded with messages and things started dropping off from the network, etc.
All in all, my machine was literally wreaking havoc on our networks.

It soon turned out that every time a packet came into the
machine, it sent out an ICMP reply (protocol 107 I do believe)
saying that such and such IP address (which I don't even think was valid)
could not be reached through that machine.  Well, I investigated this in
the source and discovered that indeed there is a new line in pl5 that was
not in pl3 that is causing all of the problems.

Here is the code from net/tcp/ip.c of the kernel source.  The difference
between this code and the code from older versions of the kernel is that
there is a new line, "icmp_reply(...)" (on line 837).  I strongly believe
that it should not be there and should be removed as soon as possible!

------------------------------------------------------------------------------
  /* for now we will only deal with packets meant for us. */
  if (!my_ip_addr(iph->daddr))
    {
        PRINTK(("\nIP: *** datagram routing not yet implemented ***\n"));
        PRINTK(("    SRC = %s   ", in_ntoa(iph->saddr)));
        PRINTK(("    DST = %s (ignored)\n", in_ntoa(iph->daddr)));
        icmp_reply (skb, ICMP_DEST_UNREACH, ICMP_PROT_UNREACH, dev);

        skb->sk = NULL;
        kfree_skb(skb, 0);
        return (0);
    }
------------------------------------------------------------------------------

The code specifically says that we are only dealing with packets meant for
us.  Well, why is it then that we are replying to every single packet that
is *not* meant for us?  This is mind boggling and in a network environment,
it can prove to be very hostile and deadly... (I won't tell you what we
think was happening to the network here at the University last weekend,
after I upgraded another machine to pl5...).

I will be commenting this out and recompiling the kernel... but I won't
get to test it out until tomorrow, with the aid of the sys admin here.
I think it will fix the problem.  Anyone else like to comment?

Scott Adkins
