2.2.6 net performance and panic with 1000's of sockets open

2.2.6 net performance and panic with 1000's of sockets open

Post by Mark Powell » Sat, 18 Jul 1998 04:00:00



Been doing some stress testing of web caching solutions, to see which
is the better: squid-1.NOVM.22 running under 2.2.6-RELEASE or Novell's
FastCache offering. My workstation and the cache box being tested both
have Intel EtherExpress cards running at 100M into the same switch, so
they get as near the full bandwidth as possible. I put FreeBSD on
one disk and Netware 4.11 on another and just swapped the drives around
at boot time to get the desired OS. Thus any hardware difference was
ruled out.
  Knocked together a few small C programs. I noticed that the max
throughput of the FreeBSD box was a shade over 6M/sec whilst the Netware
box could manage up to nearly 9M/sec. I thought Netware might have the
edge, but by that much? I tried bumping send/recvspace up to 64K on
both boxes, but that only gave a very slight performance increase.
Is there anything else I could try?
  Using another test program I found that FreeBSD would panic. The program
opened 1000 sockets to the squid box and requested a file from the web
through each. The machine would panic with a page fault every time I ran
this program. The squid box would display something like "out of mbufs
increase maxusers" and then panic a second or two later. Sometimes my
workstation, which was running the test program, would also come down at
the same time. I noticed that if I killed off all the unnecessary processes on
my workstation it wouldn't panic. A process waking up is finding some of
its memory missing?
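  For reference, the test client boils down to something like the sketch
below (the proxy address, port and URL are just placeholders, and error
handling is trimmed; the process also needs its open-file limit raised
above 1000):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define NSOCKS 1000
    #define PROXY  "10.0.0.2"             /* placeholder cache box address */
    #define PORT   3128                   /* squid's usual port */

    int
    main(void)
    {
        static const char req[] = "GET http://www.example.com/ HTTP/1.0\r\n\r\n";
        int fds[NSOCKS];
        struct sockaddr_in sin;
        char buf[8192];
        int i, nopen = 0;

        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_port = htons(PORT);
        sin.sin_addr.s_addr = inet_addr(PROXY);

        /* open all the connections and fire a request down each */
        for (i = 0; i < NSOCKS; i++) {
            fds[i] = socket(AF_INET, SOCK_STREAM, 0);
            if (fds[i] < 0 ||
                connect(fds[i], (struct sockaddr *)&sin, sizeof(sin)) < 0) {
                perror("connect");
                break;
            }
            write(fds[i], req, sizeof(req) - 1);
            nopen++;
        }
        /* then drain every reply to the end */
        for (i = 0; i < nopen; i++) {
            while (read(fds[i], buf, sizeof(buf)) > 0)
                ;
            close(fds[i]);
        }
        return 0;
    }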
  Both my workstation and the box under test are running an up-to-date
2.2.6-RELEASE with maxusers set to 256.
  I followed the kernel debugging information from the handbook, but
gdb still moaned about not finding some memory. Anyway, here it is:

/sys/compile/SQUID.TAG # gdb -k kernel.debug /var/crash/vmcore.3
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd),
Copyright 1996 Free Software Foundation, Inc...
IdlePTD 1ff000
current pcb at 1e674c
panic: page fault
#0  boot (howto=789038597) at ../../kern/kern_shutdown.c:266
266                                     dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=789038597) at ../../kern/kern_shutdown.c:266
Cannot access memory at address 0xefbffde0.
(kgdb) where
#0  boot (howto=789038597) at ../../kern/kern_shutdown.c:266
#1  0x20726f74 in ?? ()
#2  0xb91774 in ?? ()
Cannot access memory at address 0xfe8308.
(kgdb)

BTW The Netware box didn't fail under any of these stress conditions.

TIA

--
Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
A.I.S., University of Salford, Salford, Manchester, UK.
Tel:    +44 161 295 5936        Fax:    +44 161 295 5888

NO SPAM please: Spell salford correctly to reply to me.

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Matt Dillon » Sat, 18 Jul 1998 04:00:00



:>Been doing some stress testing of web caching solutions, to see which
:>is the better: squid-1.NOVM.22 running under 2.2.6-RELEASE or Novell's
:>FastCache offering. My workstation and the cache box being tested both
:>have Intel EtherExpress cards running at 100M into the same switch, so

:..
:>(kgdb)
:>
:>BTW The Netware box didn't fail under any of these stress conditions.
:>
:>TIA

    You need to tune the machine resources.  Are you familiar with how to
    compile up a FreeBSD kernel?  You need to configure a kernel
    with the 'maxusers' option set relatively high and should probably
    also configure extra network mbuf's if you intend to handle 1000+
    sockets, and there are other optimizations you can do.  But at the
    very least use:

        maxusers        128
        options         "NMBCLUSTERS=8192"      #network mbufs
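
    (For scale: an mbuf cluster is 2K, so NMBCLUSTERS=8192 lets the kernel
    dedicate up to roughly 8192 x 2K = 16MB to network buffers; pick a
    value the machine's RAM can comfortably spare.)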

    The 2.2.6 kernel doesn't panic with the right message when it runs out
    of network mbuf's.  The -current kernel does.  Also, using a large TCP
    window size is not necessarily a good solution with so many sockets.
    1000 sockets at 64K rx and tx tcp buffers == 128MBytes of ram.  You may
    be starving the machine's ram... try using something smaller like
    32K or even 24K.
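
    A quick way to experiment with that, assuming the stock 2.2.x sysctl
    names, is to shrink the global defaults, e.g.:

        sysctl -w net.inet.tcp.sendspace=32768
        sysctl -w net.inet.tcp.recvspace=32768

    or set them per-route with 'route change ... -sendpipe/-recvpipe' as
    shown further down the thread.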

    You also need to determine *where* the machine is bogging down.  While
    you are running your test, do a 'vmstat 1' in another xterm on the server
    machine.  Is the machine running out of cpu?  Is it paging?  Or are the
    disks saturated?  Use 'netstat -m' to determine how much memory is being
    used for the kernel's networking, it may be too much.

    If the machine is paging, the memory configuration needs to be tuned...
    your TCP buffers are probably too big, or you are using symmetric TCP
    buffers (and wasting a lot of space) when you really want a large transmit
    buffer and a small receive buffer.
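
    In a hand-rolled server or test program the split can be made per
    socket with setsockopt(); a minimal sketch, with purely illustrative
    sizes:

        #include <sys/types.h>
        #include <sys/socket.h>

        /* big send buffer for the reply data, small receive buffer for
         * the short client requests */
        static int
        set_proxy_bufs(int fd)
        {
            int sndbuf = 32 * 1024;
            int rcvbuf = 4 * 1024;

            if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                &sndbuf, sizeof(sndbuf)) < 0)
                return (-1);
            return (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                &rcvbuf, sizeof(rcvbuf)));
        }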

    If the disks are saturated there are a number of things you can do.  First,
    you can mount the partitions with the 'noatime' flag to stop atime inode
    writes to the disk.  Second, you can run FreeBSD-current instead of
    FreeBSD-2.2.6 and install the softupdates stuff which should improve
    disk performance.  Third, tuning the memory will result in more memory
    available for disk caching, reducing disk activity.
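
    For example, a cache partition's /etc/fstab entry might look like this
    (device and mount point are placeholders):

        /dev/sd1s1e   /cache   ufs   rw,noatime   2   2

    or an already-mounted filesystem can be remounted with
    'mount -u -o noatime /cache'.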

    If the cpu is saturated, tuning depends on the situation.  You could try
    tuning the squid parameters.

    I would expect a Novell system to do pretty well since Novell's stuff
    is highly optimized for networking, but if you use Novell you get stuck
    into a proprietary and probably unnecessarily costly system... with
    some of the money you save you could simply purchase a slightly more
    powerful PC (if the problem can't be fixed with tuning) and run squid.

    Also check the network configuration... make sure both systems are running
    the network in the same mode.  With 100BaseT you have the option of a
    half or full duplex connection (assuming the switch supports it).

    Finally, be aware that it takes a rather large company to proxy enough
    traffic to actually fill a 100BaseT link... if the company is big enough
    to do that, it may be better to use several servers rather than just one
    so you have some fail-over options if a server fails.  For example, we
    have three proxy machines handling 'www.best.com' even though we only
    need one.

                                                -Matt

:>--
:>Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
:>A.I.S., University of Salford, Salford, Manchester, UK.
--
    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet
                    Communications


 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Joe Greco » Sat, 18 Jul 1998 04:00:00



:Been doing some stress testing of web caching solutions, to see which
:is the better: squid-1.NOVM.22 running under 2.2.6-RELEASE or Novell's
:FastCache offering. My workstation and the cache box being tested both
:have Intel EtherExpress cards running at 100M into the same switch, so
:they get as near the full bandwidth as possible. I put FreeBSD on
:one disk and Netware 4.11 on another and just swapped the drives around
:at boot time to get the desired OS. Thus any hardware difference was
:ruled out.
:  Knocked together a few small C programs. I noticed that the max
:throughput of the FreeBSD box was a shade over 6M/sec whilst the Netware
:box could manage up to nearly 9M/sec. I thought Netware might have the
:edge, but by that much? I tried bumping send/recvspace up to 64K on
:both boxes, but that only gave a very slight performance increase.
:Is there anything else I could try?

Tune the TCP space down.  You're probably chewing up all sorts of kernel
memory with buffers that may not be doing you all that much good.

:  Using another test program I found that FreeBSD would panic. The program
:opened 1000 sockets to the squid box and requested a file from the web
:through each. The machine would panic with a page fault every time I ran
:this program. The squid box would display something like "out of mbufs
:increase maxusers" and then panic a second or two later. Sometimes my
:workstation, which was running the test program, would also come down at
:the same time. I noticed that if I killed off all the unnecessary processes on
:my workstation it wouldn't panic. A process waking up is finding some of
:its memory missing?

Yes, so do what it says...  increase maxusers.  Or, more specifically,
increase the maxusers option, and increase NMBCLUSTERS as well.

Check out "netstat -m" to find out what you're using in terms of mbufs
and other stuff.

:BTW The Netware box didn't fail under any of these stress conditions.

Yeah, Netware is pretty solid, but has its own set of things that can really
mess it up.

I'd recommend building the FreeBSD kernel and tweaking it specifically for
Squid.

For a more serious solution, consider two smaller boxes and load balancing
between them.  That, of course, isn't a FreeBSD specific solution.  :-)

... JG

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Mark Powell » Sun, 19 Jul 1998 04:00:00





>:>BTW The Netware box didn't fail under any of these stress conditions.
>:>TIA

>    You need to tune the machine resources.  Are you familiar with how to
>    compile up a FreeBSD kernel?  You need to configure a kernel

Yeah. I mentioned in the article that I was running a newly cvsupped tree
with maxusers set to 256 on both server and test workstation.
I know I can tune parameters more, but my concern is that the machine is
panicking, whereas Netware just hangs in there. That shouldn't happen?

>    with the 'maxusers' option set relatively high and should probably
>    also configure extra network mbuf's if you intend to handle 1000+
>    sockets, and there are other optimizations you can do.  But at the
>    very least use:

>    maxusers        128
>    options         "NMBCLUSTERS=8192"      #network mbufs

Not tried the latter. Will give it a go.

>    The 2.2.6 kernel doesn't panic with the right message when it runs out
>    of network mbuf's.  The -current kernel does.  Also, using a large TCP
>    window size is not necessarily a good solution with so many sockets.
>    1000 sockets at 64K rx and tx tcp buffers == 128MBytes of ram.  You may
>    be starving the machine's ram... try using something smaller like
>    32K or even 24K.

Well I have noticed the same behaviour with the default 16k window size.
I only bumped them up to 64k to see if I got a raw performance increase.
It was hardly noticeable. Not worth the extra memory.

>    You also need to determine *where* the machine is bogging down.  While
>    you are running your test, do a 'vmstat 1' in another xterm on the server
>    machine.  Is the machine running out of cpu?  Is it paging?  Or are the
>    disks saturated?  Use 'netstat -m' to determine how much memory is being
>    used for the kernel's networking, it may be too much.

Will try, but it does seem to go down pretty quickly.

>    If the machine is paging, the memory configuration needs to be tuned...
>    your TCP buffers are probably too big, or you are using symmetric TCP
>    buffers (and wasting a lot of space) when you really want a large transmit
>    buffer and a small receive buffer.

Possibly. I'll knock together a small test server, instead of full-blown squid,
to see if I can emulate this behaviour.
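Something along these lines, probably: accept connections and blast a
fixed-size object down each one (port and object size below are arbitrary,
error handling trimmed):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <netinet/in.h>

    #define PORT    8888
    #define OBJSIZE (256 * 1024)

    int
    main(void)
    {
        struct sockaddr_in sin;
        static char obj[OBJSIZE];
        int s, fd, one = 1;

        s = socket(AF_INET, SOCK_STREAM, 0);
        setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = htonl(INADDR_ANY);
        sin.sin_port = htons(PORT);
        if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
            perror("bind");
            exit(1);
        }
        listen(s, 128);
        for (;;) {
            while (waitpid(-1, NULL, WNOHANG) > 0)
                ;                        /* reap finished children */
            fd = accept(s, NULL, NULL);
            if (fd < 0)
                continue;
            if (fork() == 0) {           /* child serves one client */
                write(fd, obj, sizeof(obj));
                close(fd);
                _exit(0);
            }
            close(fd);                   /* parent keeps accepting */
        }
    }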

>    If the disks are saturated there are a number of things you can do.  First,
>    you can mount the partitions with the 'noatime' flag to stop atime inode
>    writes to the disk.  Second, you can run FreeBSD-current instead of
>    FreeBSD-2.2.6 and install the softupdates stuff which should improve
>    disk performance.

I've always shied away from -current after having compile and panic problems
a couple of years ago. I'm wary of using it on our primary caching machines.
Is it stable enough for production use?

>    Third, tuning the memory will result in more memory
>    available for disk caching, reducing disk activity.

>    If the cpu is saturated, tuning depends on the situation.  You could try
>    tuning the squid parameters.
>    I would expect a Novell system to do pretty well since Novell's stuff
>    is highly optimized for networking, but if you use Novell you get stuck
>    into a proprietary and probably unnecessarily costly system... with
>    some of the money you save you could simply purchase a slightly more
>    powerful PC (if the problem can't be fixed with tuning) and run squid.

We have a Netware base already, though, and get the OS for free, as we're
the UK academic Novell support centre.

>    Also check the network configuration... make sure both systems are running
>    the network in the same mode.  With 100BaseT you have the option of a
>    half or full duplex connection (assuming the switch supports it).

Both FreeBSD boxes are, but I take your point. I'll make sure Netware is too.

>    Finally, be aware that it takes a rather large company to proxy enough
>    traffic to actually fill a 100BaseT link... if the company is big enough
>    to do that, it may be better to use several servers rather than just one
>    so you have some fail-over options if a server fails.  For example, we
>    have three proxy machines handling 'www.best.com' even though we only
>    need one.

I know. I'm testing to find which is the better caching system, which my
boss wants a report on. However, unless the Novell solution is far, far
better than FreeBSD, I won't recommend its use, as FreeBSD has many, many
advantages we're all aware of :)

Many thanks for all your suggestions.

--
Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
A.I.S., University of Salford, Salford, Manchester, UK.
Tel:    +44 161 295 5936        Fax:    +44 161 295 5888

NO SPAM please: Spell salford correctly to reply to me.

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Matt Dillon » Sun, 19 Jul 1998 04:00:00




:..
:.......
:>
:>Yeah. I mentioned in the article that I was running a newly cvsupped tree
:>with maxusers set to 256 on both server and test workstation.
:>I know I can tune parameters more, but my concern is that the machine is
:>panicking, whereas Netware just hangs in there. That shouldn't happen?

    Yah, I don't like that behavior either... it's a minor bug, though.
    Bumping up NMBCLUSTERS will fix it.

:>>    You also need to determine *where* the machine is bogging down.  While
:>>    you are running your test, do a 'vmstat 1' in another xterm on the server
:>>    machine.  Is the machine running out of cpu?  Is it paging?  Or are the
:>>    disks saturated?  Use 'netstat -m' to determine how much memory is being
:>>    used for the kernel's networking, it may be too much.
:>
:>Will try, but it does seem to go down pretty quickly.

    After you fix NMBCLUSTERS and you stop crashing the poor beast :-)

:>>    writes to the disk.  Second, you can run FreeBSD-current instead of
:>>    FreeBSD-2.2.6 and install the softupdates stuff which should improve
:>>    disk performance.
:>
:>I've always shyed away from current after having compile and panic problems
:>a couple of years ago. I'm wary of using it on our primary caching machines.
:>Is it stable enough for production use?

    I think so.  I've been life-testing -current on our new NNTP box and,
    so far, it's just as stable as -stable is.

:>--
:>Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
:>A.I.S., University of Salford, Salford, Manchester, UK.

                                        -Matt

--
    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet
                    Communications

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Rick Jones » Mon, 20 Jul 1998 04:00:00


A few random thoughts...

If connections are opening and closing during your benchmark tests you
might want to make sure that both systems are using the same TIME_WAIT
state length and that both systems are actually tracking the expected
number of TIME_WAIT connections for the connection rate you have.

Also make sure that you aren't overflowing transmit queues someplace.
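
(On the FreeBSD side, a rough way to watch the TIME_WAIT count during a
run is something like

    netstat -an | grep TIME_WAIT | wc -l

repeated every few seconds while the benchmark is going.)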

rick jones
--
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to email, or post, but please do not do both...
my email address is raj in the cup.hp.com domain...

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Matt Dillon » Mon, 20 Jul 1998 04:00:00



:>A few random thoughts...
:>
:>If connections are opening and closing during your benchmark tests you
:>might want to make sure that both systems are using the same TIME_WAIT
:>state length and that both systems are actually tracking the expected
:>number of TIME_WAIT connections for the connection rate you have.
:>
:>Also make sure that you aren't overflowing transmit queues someplace
:>
:>rick jones
:>--
:>these opinions are mine, all mine; HP might not want them anyway... :)
:>feel free to email, or post, but please do not do both...
:>my email address is raj in the cup.hp.com domain...

    Hmm.. I don't think this would be a serious concern, but it would
    be a good idea to check the network activity in general... make
    sure there aren't too many collisions and that there aren't any
    receive or transmit errors.  Certainly FreeBSD and Novell will
    be running different TCP stacks.  FreeBSD might require more cpu,
    though, since it employs a well-optimized but somewhat more generic
    TCP stack than Novell's heavily optimized code.

    The biggest thing we found at BEST was that our per-machine hardware
    and maintenance costs went through the floor (i.e. became very low) once
    we started installing rack-mount PCs running FreeBSD.  And our
    software costs went to $0 on the bulk web and satellite servers.  If you
    consider the same situation but with a commercial OS rather than a
    free OS, software costs can easily exceed hardware costs, and maintenance
    costs with something like, say, a farm of Windows NT boxes would go right
    through the roof.

                                        -Matt

--
    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet
                    Communications

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Mark Powell » Sat, 25 Jul 1998 04:00:00






>>    Also check the network configuration... make sure both systems are running
>>    the network in the same mode.  With 100BaseT you have the option of a
>>    half or full duplex connection (assuming the switch supports it).

>Both FreeBSD's are, but I take your point. I'll make sure Netware is too.

Couldn't tell if the Netware box was before, as the adapter was auto-negotiating.
However, I've now made sure it's in half-duplex mode, but it hasn't made a
difference, so I assume it was negotiating that before. Currently:

FreeBSD 2.2.7-STABLE    6.1M/sec peak
Netware 4.11            8.4M/sec peak

Putting both the FreeBSD boxes in full-duplex mode doesn't actually make
any difference to the throughput.

With regard to the panicking: after adding

options         "NMBCLUSTERS=8192"
options         MSIZE="256"

and doing a rebuild, the kernel is stable now. I've not been able to bring
it down :)

Thanks for the help.

--
Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
A.I.S., University of Salford, Salford, Manchester, UK.
Tel:    +44 161 295 5936        Fax:    +44 161 295 5888

NO SPAM please: Spell salford correctly to reply to me.

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Steinar Haug » Mon, 27 Jul 1998 04:00:00


[Mark Powell]

|   However, I've now made sure it's in half-duplex mode, but it hasn't made a
|   difference, so I assume it was negotiating that before. Currently:
|  
|   FreeBSD 2.2.7-STABLE        6.1M/sec peak
|   Netware 4.11                8.4M/sec peak

Note that the networking performance of FreeBSD is more than enough to
*saturate* a 100 Mbps Ethernet. I did this more than a year ago running
ttcp and NetPerf, with a lowly P-133 on the receiving end.

So if you're only seeing 6.1 MByte/s peak, you're measuring something
different from raw networking performance.
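
(For reference: 100 Mbps is 12.5 MByte/s on the wire, and after Ethernet
framing plus IP/TCP headers the most you can see at the TCP level is
roughly 11.7-11.9 MByte/s, so 6.1 MByte/s is only a bit over half of what
the link can carry.)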


 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Mark Powell » Tue, 28 Jul 1998 04:00:00




>[Mark Powell]

>|   However, I've now made sure it's in half-duplex mode, but it hasn't made a
>|   difference, so I assume it was negotiating that before. Currently:
>|  
>|   FreeBSD 2.2.7-STABLE    6.1M/sec peak
>|   Netware 4.11            8.4M/sec peak

>Note that the networking performance of FreeBSD is more than enough to
>*saturate* a 100 Mbps Ethernet. I did this more than a year ago running
>ttcp and NetPerf, with a lowly P-133 on the receiving end.

I thought I had the extensions on, but with netperf I can't get more
than ~64Mbits/sec.

net.inet.tcp.rfc1323: 1
net.inet.tcp.rfc1644: 1

>So if you're only seeing 6.1 MByte/s peak, you're measuring something
>different from raw networking performance.

Yeah, the performance of the web caching software. Looks like the squid
stuff is pretty poorly optimised?

--
Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
A.I.S., University of Salford, Salford, Manchester, UK.
Tel:    +44 161 295 5936        Fax:    +44 161 295 5888

NO SPAM please: Spell salford correctly to reply to me.

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Matt Dillon » Tue, 28 Jul 1998 04:00:00


:In article <6pi59u$fs...@plato.salford.ac.uk>,
:Mark Powell <m...@nospam.salford.ac.uk> wrote:

:>In article <6pftnt.ku...@verdi.nethelp.no>,
:>
:>I thought I had the extensions on, but with netperf I can't get more
:>than ~64Mbits/sec.
:>
:>net.inet.tcp.rfc1323: 1
:>net.inet.tcp.rfc1644: 1
:>
:>>So if you're only seeing 6.1 MByte/s peak, you're measuring something
:>>different from raw networking performance.
:>
:>Yeah, the performance of the web caching software. Looks like the squid
:>stuff is pretty poorly optimised?
:>
:>--
:>Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
:>A.I.S., University of Salford, Salford, Manchester, UK.

    Just doing a simple rcp test I can get 10MBytes/sec over a 100BaseTX
    (full duplex) link.

    There are a lot of factors that can slow a transfer down.  If you are
    transferring from a disk file, for example, you will be limited to what
    the disk can do.  If you are doing encryption (i.e. scp or ssh) you will
    be limited to 2-4 MBytes/sec depending on the cpu.  If the TCP window
    is too small the datarate will be limited.  If the ethernet has CRC
    errors on it the datarate will be limited (due to TCP backoff).  If the
    ethernet is half duplex, the packet rate will limit the datarate due
    to transmit collisions.
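
    As a rule of thumb a single TCP connection can't move more than
    window_size / round_trip_time.  With, say, a 2 ms round trip through
    the switch, an 8K window caps you at about 8K / 0.002s = 4 MBytes/sec,
    while a 64K window would allow ~32 MBytes/sec, more than the wire
    itself.  The transfers below show the effect of shrinking the pipe
    sizes.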

                                                -Matt

tick# route -n change news2 -recvpipe 65536 -sendpipe 65536
tick:/home/dillon> dd if=/dev/zero bs=1m count=64 | rsh news2.best.com "cat >/dev/null"
64+0 records in
64+0 records out
67108864 bytes transferred in 7.341502 secs (9141027 bytes/sec)
tick:/home/dillon>

            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
       564     0     838369        254     0      21308     0
         4     0        330          0     0        347     0
         3     0        276          0     0        264     0
         3     0        265          0     0        172     0
         6     0        492          0     0        441     0
         8     0        677          0     0        618     0
      2528     0    3803293       1280     0      94699     0
      6712     0   10155280       3429     0     248016     0
      6703     0   10134550       3302     0     247991     0
      6669     0   10085671       3301     0     246749     0
      6683     0   10110028       3429     0     246752     0

tick# route -n change news2 -recvpipe 32768 -sendpipe 32768

tick:/home/dillon> dd if=/dev/zero bs=1m count=64 | rsh news2.best.com "cat >/dev/null"
64+0 records in
64+0 records out
67108864 bytes transferred in 7.376906 secs (9097156 bytes/sec)
tick:/home/dillon>

            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
         4     0        266        121     0        260     0
         3     0        194          0     0        210     0
         3     0        213          0     0        429     0
         5     0        366          0     0        448     0
      5267     0    7948955       2550     0     196227     0
      6587     0    9960433       3302     0     243125     0
      6570     0    9935580       3302     0     240788     0
      6400     0    9683263       3174     0     236896     0
      6589     0    9965610       3302     0     243897     0
      6432     0    9731306       3173     0     238028     0
      6441     0    9738284       3302     0     238154     0
      2389     0    3600577       1143     0      88937     0
         2     0        184          0     0        180     0
         8     0        759          0     0        959     0

route -n change news2 -recvpipe 16384 -sendpipe 16384

tick:/home/dillon> dd if=/dev/zero bs=1m count=64 | rsh news2.best.com "cat >/dev/null"
64+0 records in
64+0 records out
67108864 bytes transferred in 7.990206 secs (8398890 bytes/sec)
tick:/home/dillon>

         5     0        378          0     0        420     0
      2067     0    3101023       1019     0      77806     0
      5987     0    9054733       3048     0     221639     0
      6087     0    9209565       3047     0     225443     0
      6131     0    9275499       3048     0     227136     0
      6015     0    9100118       3048     0     222554     0
      6157     0    9313601       3046     0     227978     0
      6152     0    9305580       3047     0     227690     0
      6026     0    9111970       3047     0     223296     0
      2046     0    3091724       1016     0      75960     0
         8     0        759          0     0        858     0
         5     0        429          0     0        795     0

route -n change news2 -recvpipe 8192 -sendpipe 8192
tick:/home/dillon> dd if=/dev/zero bs=1m count=64 | rsh news2.best.com "cat >/dev/null"
64+0 records in
64+0 records out
67108864 bytes transferred in 16.246575 secs (4130647 bytes/sec)
tick:/home/dillon>

       247     0      37432        594     0     574317    27
       282     0      43395        756     0     832127    30
       367     0      44727        997     0    1220553    53
      2146     0    2853020       1372     0     753482    47
      3182     0    4544723       1661     0     437981    33
      3085     0    4306343       2091     0     801213   213
      3078     0    4445923       1885     0     602718   124
      3073     0    4248823       1899     0     924249   110
      3181     0    4548246       1906     0     567960   136

--
    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet
                    Communications
    <dil...@best.net> (Please include original email in any response)

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Rick Jones » Wed, 29 Jul 1998 04:00:00




: >Note that the networking performance of FreeBSD is more than enough
: >to *saturate* a 100 Mbps Ethernet. I did this more than a year ago
: >running ttcp and NetPerf, with a lowly P-133 on the receiving end.

: I thought I had the extensions on, but with netperf I can't get more
: than ~64Mbits/sec.

: net.inet.tcp.rfc1323: 1
: net.inet.tcp.rfc1644: 1

I think Steinar meant ttcp the benchmark, not T/TCP the TCP protocol
extensions. As for netperf only getting 64 Mbit/s, it would help if
you could do a cut and paste of your command lines so we can see the
parameters used in the netperf test.

: Yeah, the performance of the web caching software. Looks like the
: squid stuff is pretty poorly optimised?

Not sure if it will be at all germane, but you might look at:

   ftp://ftp.cup.hp.com/dist/networking/briefs/

rick jones
--
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to email, or post, but please do not do both...
my email address is raj in the cup.hp.com domain...

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Steinar Haug » Wed, 29 Jul 1998 04:00:00


[Rick Jones]

|   : I thought I had the extensions on, but with netperf I can't get more
|   : than ~64Mbits/sec.
|  
|   : net.inet.tcp.rfc1323: 1
|   : net.inet.tcp.rfc1644: 1
|  
|   I think Steinar meant ttcp the benchmark, not T/TCP the TCP protocol
|   extensions.

Yes. Since FreeBSD is able to saturate a 100 Mbps Ethernet, the RFC 1323
and RFC 1644 extensions will actually give you slightly smaller
throughput, because your net TCP payload on Ethernet is 1440 bytes, not 1460.

(This is assuming you don't *need* the 1323/1644 extensions, of course.)


 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Mark Powell » Thu, 13 Aug 1998 04:00:00



>    There are a lot of factors that can slow a transfer down.  If you are
>    transferring from a disk file, for example, you will be limited to what
>    the disk can do.  If you are doing encryption (i.e. scp or ssh) you will
>    be limited to 2-4 MBytes/sec depending on the cpu.  If the TCP window
>    is too small the datarate will be limited.  If the ethernet has CRC
>    errors on it the datarate will be limited (due to TCP backoff).  If the
>    ethernet is half duplex, the packet rate will limit the datarate due
>    to transmit collisions.

Yeah, I realise this. However, as I said, the workstation and server
are both plugged into the same 100TX switch. The workstation is running
2.2.7-STABLE and the server runs either 2.2.7-STABLE or Netware 4.11,
depending on which disk it boots from. Thus each server OS has exactly
the same hardware and network traffic.

>tick# route -n change news2 -recvpipe 65536 -sendpipe 65536
>tick:/home/dillon> dd if=/dev/zero bs=1m count=64 | rsh news2.best.com "cat >/dev/null"
>64+0 records in
>64+0 records out
>67108864 bytes transferred in 7.341502 secs (9141027 bytes/sec)

I try this and get no more than ~4.2MB/s. The workstation and the server
are both P166s. This is with everything killed on the machines except
the necessary processes. Maybe the CPUs aren't up to it under 2.2.7?

BTW 2.2.7 was cvsupped and compiled this morning.

--
Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
A.I.S., University of Salford, Salford, Manchester, UK.
Tel:    +44 161 295 5936        Fax:    +44 161 295 5888

NO SPAM please: Spell salford correctly to reply to me.

 
 
 

2.2.6 net performance and panic with 1000's of sockets open

Post by Mark Powell » Thu, 13 Aug 1998 04:00:00




>: I thought I had the extensions on, but with netperf I can't get more
>: than ~64Mbits/sec.

>: net.inet.tcp.rfc1323: 1
>: net.inet.tcp.rfc1644: 1

>I think Steinar meant ttcp the benchmark, not T/TCP the TCP protocol
>extensions. As for netperf only getting 64 Mbit/s, it would help if
>you could do a cut and paste of your command lines so we can see the
>parameters used in the netperf test.

Didn't realise there was much to netperf. Start on server:

$ netserver -P 9999

On client:

$ netperf -H <server> -p 9999

If I use:

$ route -n change <server> -recvpipe 65536 -sendpipe 65536

On the client, I've seen 73Mbit/s.
If I do a UDP_STREAM I get 95.8Mbit/s.
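I guess I could also hand netperf explicit socket buffer sizes as
test-specific options, something like

$ netperf -H <server> -p 9999 -- -s 65536 -S 65536

if I've read the netperf manual right.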

>: Yeah, the performance of the web caching software. Looks like the
>: squid stuff is pretty poorly optimised?

>Not sure if it will be at all germane, but you might look at:

>   ftp://ftp.cup.hp.com/dist/networking/briefs/

Most of it is not applicable, but I may try some squid/kernel profiling
to see what it's doing.
Cheers.

--
Mark Powell - System Administrator (UNIX) - Clifford Whitworth Building
A.I.S., University of Salford, Salford, Manchester, UK.
Tel:    +44 161 295 5936        Fax:    +44 161 295 5888

NO SPAM please: Spell salford correctly to reply to me.

 
 
 
