Linux 2.4.1 network (socket) performance

Post by Richard B. Johnson » Sat, 24 Feb 2001 01:11:28



Hello, I am trying to find the reason for very, very poor network
performance with sustained data transfers on Linux 2.4.1. I found
a work-around, but don't think user-mode code should have to provide
such work-arounds.

   In the following, with Linux 2.4.1, on a dedicated 100Base-T
   link:

        s = socket connected to DISCARD (null-sink) server.

        while(len)
        {
            stat = write(s, buf, min(len, MTU));
            /* Yes, I do check for an error */
            buf += stat;
            len -= stat;
        }

Data length is 0x00010000 bytes.
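For reference, a minimal, self-contained version of that test loop might
look like the sketch below (assumptions for illustration only: the DISCARD
sink on TCP port 9, a CHUNK of 2048 bytes, and single-pass timing; the
numbers below came from the original harness, not this sketch):

    /* Time chunked writes of a 64 KB buffer to a DISCARD (null-sink)
     * server.  Build: cc -O2 -o sink_bench sink_bench.c
     * Usage: ./sink_bench [server-ip]
     */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define DATA_LEN 0x00010000L   /* 64 KB, as in the post        */
    #define CHUNK    2048          /* the "MTU" knob being varied  */

    int main(int argc, char *argv[])
    {
        static char buf[DATA_LEN];
        struct sockaddr_in sin;
        struct timeval t0, t1;
        char *p = buf;
        size_t len = sizeof(buf);
        double secs;
        int s;

        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_port = htons(9);   /* DISCARD service */
        sin.sin_addr.s_addr = inet_addr(argc > 1 ? argv[1] : "127.0.0.1");

        if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0 ||
            connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
            perror("connect");
            return 1;
        }

        gettimeofday(&t0, NULL);
        while (len) {
            ssize_t stat = write(s, p, len < CHUNK ? len : CHUNK);
            if (stat < 0) {        /* the error check */
                perror("write");
                return 1;
            }
            p += stat;
            len -= stat;
        }
        gettimeofday(&t1, NULL);

        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%ld bytes in %.4f s = %.3f MB/s\n",
               (long)DATA_LEN, secs, DATA_LEN / secs / 1e6);
        close(s);
        return 0;
    }

Note that a single 64 KB pass may land entirely in the socket's send
buffer; for steady numbers one would loop the transfer many times and
average, presumably as the "Average trans rate" figures below were.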

MTU              Average trans rate   Fastest trans rate
----             -----------------    -----------------
65536            0.468 Mb/s           0.902 Mb/s
32768            0.684 Mb/s           0.813 Mb/s
16384            2.989 Mb/s           3.121 Mb/s
8192             5.211 Mb/s           6.160 Mb/s
4094             8.212 Mb/s           9.101 Mb/s
2048             8.561 Mb/s           9.280 Mb/s
1024             7.250 Mb/s           7.500 Mb/s
512              4.818 Mb/s           5.107 Mb/s

As you can see, there is a maximum data length that can be
handled with reasonable speed from a socket. Trying to find
out what that was, I discovered that the best MTU was 3924.
I don't know why, though 3924 plus some per-buffer overhead is
suspiciously close to one 4096-byte page, which may be a clue.
It shows:

MTU              Average trans rate   Fastest trans rate
----             -----------------    -----------------
3924             8.920 Mb/s           9.310 Mb/s

If the user's data length is higher than this, there is a 1/100th
of a second wait between packets.  The larger the user's data length,
the more the data gets chopped up into 1/100th of a second intervals.

It looks as though user data that can't fit into two Ethernet packets
is queued until the next time-slice on a 100 Hz system, and that
severely hurts sustained throughput: if a buffer dribbles out at,
say, ~4 KB per 10 ms tick, it can never move faster than about
4096 / 0.01 = ~0.4 MB/s, no matter how fast the wire is. The
performance with a single 64 KB data buffer is abysmal; if it gets
chopped up into 2048-byte blocks in user space, it's reasonable.

Both machines are dual Pentium 600 MHz machines with identical eepro100
Ethernet boards. I substituted LANCE (Hewlett-Packard) and 3Com (3c59x)
boards with essentially no change.

Does this point out a problem? Or should user-mode code be required
to chop up data lengths to something more "reasonable" for the kernel?
If so, how does the user know what "reasonable" is?
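(One partial answer, for what it's worth: the only portable hint user
code can ask for is its own send-buffer size via SO_SNDBUF. A minimal
sketch follows, assuming a connected socket s; nothing guarantees the
returned value matches whatever size the kernel actually handles
efficiently, and sndbuf_chunk is just an illustrative name:)

    #include <stdio.h>
    #include <sys/socket.h>

    /* Query the socket's send-buffer size -- the closest thing to a
     * portable "reasonable write size" hint.  Illustrative only; it
     * does not expose the internal threshold discussed above. */
    static int sndbuf_chunk(int s)
    {
        int size = 0;
        socklen_t optlen = sizeof(size);

        if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, &size, &optlen) < 0)
            return 2048;   /* fall back to a conservatively small chunk */
        return size;
    }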

Cheers,
* Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


Linux 2.4.1 network (socket) performance

Post by Richard B. Johnson » Sun, 25 Feb 2001 01:10:03


Hello,
The problem with awful socket performance on 2.4.1 has been discovered
and fixed by Manfred Spraul. Here is some info, and his patch:


> Could you post your results to linux-kernel?
> My mail from this morning wasn't accurate enough; you patched the wrong
> line. Sorry.

Yep. The patch you sent was a little broken. I tried to fix it, but
ended up patching the wrong line.

Quote:

> I've attached the 2 patches that should cure your problems.
> patch-new is integrated into the -ac series, and it's a bugfix - simple
> unix socket sends eat into memory reserved for atomic allocs.
> patch-new2 is the other variant, it just deletes the fallback system.

--- linux/net/core/sock.c       Fri Dec 29 23:07:24 2000
+++ linux/net/core/sock.c

                                /* The buffer get won't block, or use the atomic queue.
                                 * It does produce annoying no free page messages still.
                                 */
-                               skb = alloc_skb(size, GFP_BUFFER);
+                               skb = alloc_skb(size, sk->allocation & (~__GFP_WAIT));
                                if (skb)
                                        break;
                                try_size = fallback;
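To make the one-line change concrete, here is the bit arithmetic in
isolation (the flag values are copied from 2.4-era <linux/mm.h> purely
for illustration; consult the actual tree before trusting them):

    #include <stdio.h>

    /* 2.4-era GFP bits (illustrative): */
    #define __GFP_WAIT 0x01             /* allocation may sleep            */
    #define __GFP_HIGH 0x02             /* may dip into emergency reserves */
    #define __GFP_IO   0x04             /* may start low-level I/O         */

    #define GFP_BUFFER (__GFP_HIGH | __GFP_WAIT)
    #define GFP_KERNEL (__GFP_HIGH | __GFP_WAIT | __GFP_IO)

    int main(void)
    {
        int sk_allocation = GFP_KERNEL; /* a typical sk->allocation */

        /* Old code: a hardcoded GFP_BUFFER for the opportunistic
         * oversized-skb attempt -- the allocations Manfred found
         * eating into memory reserved for atomic callers. */
        printf("old mask: %#x\n", GFP_BUFFER);

        /* New code: the socket's own mode with the "may sleep" bit
         * cleared, so the big attempt fails fast and the code drops
         * to try_size = fallback instead. */
        printf("new mask: %#x\n", sk_allocation & ~__GFP_WAIT);
        return 0;
    }

The point of the fix is that the first attempt stops being a special
allocation class of its own and becomes a non-blocking flavor of
whatever mode the socket normally allocates with.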

Cheers,
* Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


Linux 2.4.1 network (socket) performance

Post by David S. Miller » Wed, 28 Feb 2001 09:00:05


Quote: Richard B. Johnson writes:

 > > unix socket sends eat into memory reserved for atomic allocs.

OK (Manfred is being quoted here, to be clear).

I'm still talking with Alexey about how to fix this; I might just
prefer killing this fallback mechanism of sock_alloc_send_skb and then
making AF_UNIX act just like everyone else.

This was always just a performance hack, and one which makes less
and less sense as time goes on.

Later,
David S. Miller


Linux 2.4.1 network (socket) performance

Post by Alan Cox » Wed, 28 Feb 2001 21:00:05


Quote:> I'm still talking with Alexey about how to fix this; I might just
> prefer killing this fallback mechanism of sock_alloc_send_skb and then
> making AF_UNIX act just like everyone else.

> This was always just a performance hack, and one which makes less
> and less sense as time goes on.

When I first did the hack it was worth about 20% in performance, but at
the time the fallback and initial allocations didn't eat into the pools
in a problematic way.


poor Linux socket performance?

I'd like to hear advice or ideas that anyone has to offer here.. I'm
stumped about what could be going on..

I'm using NCSA httpd 1.3 (compiled without DNS lookups) on a 486/66 Linux
box with 24MB RAM.. just lately, the httpd seems to be suffering from some
serious performance problems.  I've been watching it for about 3 weeks
through various 1.2.x kernels and the problem occurs pretty consistently.
It seems to get worse after the machine has been up for a few days, but
it will also happen shortly after a reboot.  Strangely, the only thing
that seems to be affected is HTTP accesses.. the machine always has
plenty of free memory, the load stays very low, and at all times it's
quite snappy about everything BUT answering those accesses.

The server is used to some serious wear.. I observed about 6 weeks of
100K-150K accesses/day without seeing any real server slowdowns or
problems.  During that time, an average busy hour was 6800 hits (nearly
two each second for the whole hour) and there were certainly even busier
ones!  I was singing the praises of Linux! (it's come a long way!)

For the last 3 or 4 weeks, however, I've seen what appears to be an
inability for anything to open HTTP connections to the server.  Since the
httpd hasn't changed any, I'm led to believe that Linux is somehow at
fault but that doesn't sound right either.  If I'm keeping a tail on the
httpd access_log file to watch them as they hit, I've seen a "pause" for
5-30 seconds where nothing happens.. fairly regularly when it starts to
get heavy (I was watching it happen this afternoon at only 3600 hits in
the hour).  There have been many occasions where nothing is able to make
an HTTP connection to the server (even from localhost) even though the
machine seems to be in perfect condition.  Sometimes the access will make
it through after a few seconds of waiting.. most annoyingly, most of the
connections that don't respond right away will never respond.  I've also
seen it "hang" at the end of a transfer (particularly on small ones)..
Netscape shows 100% of 5K completed but it's still sitting there (waiting
on something to close?).. I have to stop it and try again..

I used to see "netstat -nt | wc -l" numbers regularly in the 200-300
range.. now the server seems to be "crushed" without even 200 connections
listed there.  If I'm logged in working during a "pause" in the log file
(or when I'm trying to access from localhost and can't make an HTTP
connection), I wouldn't ever notice that anything was going wrong
with httpd.. otherwise, the machine performance remains tip-top.

Please copy any followups to me in email.. thanks in advance..

kevin
--

 (System Administrator) | Paranoia offers low cost accounts to those in need.
 Finger for PGP 2.3 Key | The Server: http://www.paranoia.com/
