ksoftirqd causing severe network performance problems

Post by Brendan Keessen » Tue, 09 Sep 2003 20:45:44


More than a week ago we replaced our old Linux core routers (in a
failover setup) with new ones. The old setup used two 100 Mbit NICs and
worked very well, but we needed more than 100 Mbit of throughput, so we
replaced it with an almost identical setup based on two new servers
with two 1 Gbit NICs each. At peak time it processes about 70 Mbit/sec
of traffic; we use VLANs, iptables firewalling and DNAT for almost all
the connections, the same as in the old setup.

At the end of last week the new setup had network problems: what we saw
on the Linux router was that the kernel threads ksoftirqd_CPU0 and
ksoftirqd_CPU1 were using almost 100% of system time while the network
throughput collapsed. This happens once or twice every day; the first
occurrence seems reasonably predictable and happens when the network
traffic rises from a constant throughput of 3 Mbit/sec to 46 Mbit/sec
over 3 hours. At roughly 40 Mbit/sec the problem occurs, and a failover
to the slave router solves it. On the faulty server (previously master)
the 100% CPU usage then drops to almost idle. While the backup is
working fine, we can't use the faulty server for routing/firewalling,
because failing back to it results in 100% system time again. Rebooting
the system helps.

Because the router was a new server (Dell 2650/Dual Xeon) with a
network card we had never used before (Gigabit Broadcom 5703), we
thought that maybe the driver for the card was causing the problem.
After switching drivers and switching between kernel versions
(2.4.21/2.4.22/2.4.18, the last of which ran perfectly on our old
router), we eventually chose to replace the server with a Dell 1650,
which has two gigabit e1000 interfaces. Different kernels and e1000
drivers resulted in the same problem again. Now we are running 2.4.18
with the 4.3.2 e1000 driver. I know we are not running the newest
kernel and newest driver, but this doesn't seem to be the cause of the
problem, because we tested with other network cards, drivers and kernel
versions.

The same problem still exists on the new server with totally different          
network cards. In the kernel logfiles we don't see any messages at all          
which are related to the problem.                                              

Here is some info about the state of the server at the moment the
performance collapses and ksoftirqd_CPU0/ksoftirqd_CPU1 are using 99%
of system time:

routing cache (no. entries):                                                    

$ ip r ls cache | grep from | wc -l                                            

$ cat /proc/sys/net/ipv4/route/max_size                                        

$ cat /proc/sys/net/ipv4/route/gc_thresh                                        

We thought maybe for some reason the routing cache was thrashing, so we
experimented with raising max_size to 4 times the current value,
raising gc_thresh to 80% of that value, and setting gc_elasticity to
32. But that didn't help and the same problem occurred again.
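The tuning described above would look something like this (2.4-era /proc paths; the concrete numbers here are illustrative assumptions, since the post elides the actual values on the box):

```shell
# Sketch of the route-cache tuning described above (2.4 kernels).
# The numbers are placeholders, not the poster's real values.
echo 1048576 > /proc/sys/net/ipv4/route/max_size       # 4x the old max_size
echo 838860  > /proc/sys/net/ipv4/route/gc_thresh      # ~80% of max_size
echo 32      > /proc/sys/net/ipv4/route/gc_elasticity  # more aggressive GC
```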

info on ip conntrack (no. entries):                                            

$ cat /proc/net/ip_conntrack | wc -l                                            

The ip_conntrack module is loaded with the hashsize parameter:                  

ip_conntrack hashsize=2097152                                                  
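For scale: on 2.4 kernels, ip_conntrack_max defaults to 8 times the hash size, so that module parameter implies roughly the following connection ceiling (a rule-of-thumb sketch; verify against /proc/sys/net/ipv4/ip_conntrack_max on the live box):

```shell
# Rough ceiling implied by the hashsize parameter above
# (2.4-era default: ip_conntrack_max = 8 * hashsize).
hashsize=2097152                 # value passed when loading ip_conntrack
max=$((hashsize * 8))
echo "implied ip_conntrack_max: $max"
# Compare against the live entry count:  wc -l < /proc/net/ip_conntrack
```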

To give you more input I turned on kernel profiling, reading and
clearing the profile every 60 seconds. The kernel functions which use
the most clockticks during the problem are:

    31 handle_IRQ_event                           0.2500                        
    32 add_timer                                  0.1311                        
    34 net_rx_action                              0.0467                        
    35 __kfree_skb                                0.1136                        
    35 batch_entropy_store                        0.1944                        
    40 dev_queue_xmit                             0.0535                        
    50 ip_route_input                             0.1238                        
   676 __write_lock_failed                       21.1250                        
  2928 __read_lock_failed                       146.4000                        
  3620 default_idle                              69.6154                        

A few minutes before the problem occurred (normal state):

    37 __kfree_skb                                0.1201                        
    49 net_rx_action                              0.0673                        
    50 dev_queue_xmit                             0.0668                        
    50 handle_IRQ_event                           0.4032                        
    54 ip_route_input                             0.1337                        
    56 schedule                                   0.0422                        
    68 __write_lock_failed                        2.1250                        
    73 batch_entropy_store                        0.4056                        
   742 __read_lock_failed                        37.1000                        
  8893 default_idle                             171.0192                        
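The read-and-clear profiling loop described above is presumably built on `readprofile` (`readprofile -r` to reset, `readprofile -m /boot/System.map` to dump; the kernel must be booted with `profile=2`). The ranking itself is just a numeric sort; as a sketch, using sample lines taken from the output above instead of a live run:

```shell
# Recreate the "top clockticks" ranking from readprofile-style output.
# On a live box:  readprofile -r; sleep 60; readprofile -m /boot/System.map
printf '%s\n' \
  '31 handle_IRQ_event 0.2500' \
  '676 __write_lock_failed 21.1250' \
  '2928 __read_lock_failed 146.4000' \
  '3620 default_idle 69.6154' |
  sort -n | tail -3            # busiest functions last
```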

I also monitored the interrupts (/proc/interrupts) of eth0 and eth1,
but the interrupt rate tracks the throughput at that moment and no
strange bursts of interrupts occur.

Before and during the problem the interrupt rates are about:

eth0: 5000/s                                                                    
eth1: 4200/s                                                                    

Does anybody know why we have this problem and how to solve it? Or
could you tell me what more info is needed, and how I can get it, to
resolve the problem?

Brendan Keessen                                                                


1. 3c90x "overruns" causing severe network slowdown

I've got an old PC (fast 486/slow Pentium, unsure) that's had a new lease of life as our firewall, web server and SAMBA server. I used a 3Com 3c905C-TX card as the LAN-facing NIC, as it's
a fast 100 Mb/s card and the hub and other PCs (Windoze boxes) are all 100 Mb compatible. The interface to the rest of the teaching hospital, and then on to the internet, is via a bog-standard
NE2k-compatible NIC that works fine (although only 10 Mb/s of course, but that's fine).

The setup basically works, but saving files onto the box is crawlingly slow: rates of 500 kb/s or less via SAMBA and even slower (50 kb/s) via FTP (so I can be sure that the problem is
low-level networking, not Windoze or SAMBA).

Loading files from the server using any technology is very fast.

Looking at diagnostics the one thing that's obvious is that on "/sbin/ifconfig eth1"...


RX packets:1149326 errors:0 dropped:0 overruns:20921 frame:0

TX packets:1239052 errors:0 dropped:0 overruns:0 carrier:1

collisions:0 txqueuelen:100

RX bytes:398014948 (379.5 Mb)  TX bytes:778774691 (742.6 Mb)

Interrupt:9 Base address:0xfe80

...the "overruns" figure in the "RX" section keeps going up; it's about 2% of the total packets received. If each time a packet is lost the whole sequence of TCP packets has to
start from the beginning again (a RESET packet is sent? I'm not much of a TCP expert!), then a 100-300k file (typical for my users) might get quite difficult to
save - this is what the samba logs reflect, with several attempts to save files.
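As a sanity check on that "about 2%" figure, the ratio can be pulled straight out of the RX line quoted above (plain parameter expansion plus awk; the line is the one from the post):

```shell
# Compute the RX overrun ratio from the ifconfig line quoted above.
line='RX packets:1149326 errors:0 dropped:0 overruns:20921 frame:0'
pkts=${line#*packets:};  pkts=${pkts%% *}    # -> 1149326
over=${line#*overruns:}; over=${over%% *}    # -> 20921
awk -v p="$pkts" -v o="$over" 'BEGIN { printf "%.1f%%\n", 100*o/p }'
# prints 1.8%
```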

Have tried both Donald Becker's 3c5x9.o and 3c90x.o drivers as provided pre-compiled in my SuSE 7.2 distro; exactly the same problem with both. (I've also tried downloading the driver
from 3Com's site and compiling it - that one worked well for about 20 seconds and then disconnected from the network completely!!)

My limited understanding suggests a problem with either a hardware conflict (?), the driver, or the card. The card was fine in the previous machine it was in (a Win2k Pro box);
the hardware is PCI and "bus mastering", which I understand means that network traffic is written directly into RAM, so IRQs are needed less, and even on a reasonably busy machine there
should be no overruns??

Have searched FAQs and net to no avail.  Can anyone help?
(real email address is carl then the at sign then bookmanassociates.com, no spam please!!!)

Carl Peto
