System hang with heavy network traffic using rtl8139c

Post by Stephan Braus » Wed, 26 Jun 2002 16:40:07



Hello,

maybe I had a similar problem. In my case, the rtl8139 chip reported
a negative buffer size, and the subsequent dev_alloc_skb() call crashed my system.
The problem was caused by receive buffer overruns, which occur if the CPU is not fast
enough to fetch all the data. Please check whether you see rtl8139-related kernel messages
during the test.
I reported this problem to the realtek list some time ago but, as far as I know,
a fix has not yet been included in test version 1.18.

Here is my patch of rtl8129_rx():

                } else {
                        /* Malloc up new buffer, compatible with net-2e. */
                        /* Omit the four octet CRC from the length. */
                        struct sk_buff *skb;
                        int pkt_size = rx_size - 4;

+                       if(pkt_size<0)
+                       {
+                               if (tp->msg_level & NETIF_MSG_DRV)
+                                       printk(KERN_ERR"%s: Impossible packet length.\n",dev->name);
+                               tp->stats.rx_dropped++;
+                               rtl_hw_start(dev);
+                               break;
+                       }
+
                        skb = dev_alloc_skb(pkt_size + 2);
                        if (skb == NULL) {
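
The check added by the patch can be tried outside the driver. The following is only a minimal sketch: checked_pkt_size() is a hypothetical helper that does not exist in the driver, and it assumes rx_size is the length the chip reports, including the four-octet CRC. Any reported length below 4 makes pkt_size negative, which is the case the patch drops before dev_alloc_skb() is reached.

```c
#include <stdio.h>

#define CRC_LEN 4  /* the four octet CRC omitted from the length */

/*
 * Hypothetical helper, not part of the rtl8139 driver.
 * Returns the payload length without the CRC, or -1 when the
 * reported size is too small to be a real frame ("impossible
 * packet length" in the patch above).
 */
static int checked_pkt_size(int rx_size)
{
    int pkt_size = rx_size - CRC_LEN;

    if (pkt_size < 0)
        return -1;      /* caller should drop the frame and restart rx */
    return pkt_size;
}

/* Print the verdict for one reported size (illustration only). */
static void show(int rx_size)
{
    int n = checked_pkt_size(rx_size);

    if (n < 0)
        printf("rx_size=%d: impossible packet length, drop\n", rx_size);
    else
        printf("rx_size=%d: payload of %d bytes\n", rx_size, n);
}
```

Calling show(64) and show(3) prints one accepted and one dropped case. In the driver itself the drop path also increments rx_dropped and calls rtl_hw_start(dev), as shown in the patch.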

Additionally, I think it is a good idea to increase the receive buffer size to the maximum by changing
RX_BUF_LEN_IDX from 2 to 3.
If you read older messages on the realtek list, you can find additional driver changes that may be
helpful for you.
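
As a rough illustration of what the RX_BUF_LEN_IDX change means: the sketch below assumes the convention used in the 8139 drivers, where the receive ring is 8 KB shifted left by the index (0 = 8K, 1 = 16K, 2 = 32K, 3 = 64K). rx_buf_len() is a hypothetical helper written for this example; please check the #define in your driver source before relying on these numbers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the driver.
 * Assumed mapping: ring size = 8192 << idx, with idx in 0..3.
 */
static size_t rx_buf_len(unsigned int idx)
{
    assert(idx <= 3);            /* the chip supports four ring sizes */
    return (size_t)8192 << idx;
}
```

Under that assumption, rx_buf_len(2) is 32768 bytes and rx_buf_len(3) is 65536 bytes, so the change doubles the room available before an overrun occurs.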

Stephan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

1. System hang with heavy network traffic using rtl8139c

Hi -

We have a multi-threaded load test program that hard-locks every x86
system configuration we have tried that contains at least one rtl8139c.
The exact same OS image and test program have been shown to run on
otherwise identical configurations after simply replacing the rtl8139c
device(s) with intel 82559ER's or Digital 21143TD's.  Please read on for
more info.

I am involved in reliability testing of our linux-based network device,
in which we plan to use realtek 8139c parts as the three main network
interfaces.  The system is architecturally very much like an x86 PC and
runs a standard linux 2.4.16 kernel with some well-used community
patches and/or driver modules.

We created a multi-threaded test program that generates bidirectional
traffic over all network interfaces simultaneously as fast as possible.
 Using this test, the system will lock up hard, taking from 90 seconds
to 1.5 weeks to do so.  Most failures occur, however, within the 2-5
hour range.  The lock-up is complete, in that not even sysrq works.
 More typical network traffic patterns do not cause lockups.  In
general, the test configurations have between 4 and 8 network devices
installed in the system, with most tests running with 8 devices.  No
test has used more than 4 devices of any single chipset.

We have run our test on various hardware configurations to narrow the
problem boundaries, and the results are always the same: if the
configuration includes a RealTek 8139C part it will lock up.  If it
contains *no* 8139C it will not.

We can duplicate this behaviour on our hardware using either Geode GX1
processors or Transmeta Crusoe TM5800 processors.  We can duplicate it
on desktop Pentium MMX-based systems that contain none of our hardware.
 Realtek configurations lock up regardless of which driver we are using
(rtl8139 v1.18 / pci-scan v1.08 or 8139too v0.9.22).  With any
configuration, replacing the rtl parts with intel (82559ER) or digital
(21143TD) parts keeps the system from failing.  The reverse has also been true.
Mixed-part environments fail even with a single rtl part installed, but
will not if there are no rtl parts.

I'm hoping to get some feedback from some of you out there who have
worked with this part more than I.  I'll be happy to post specifics as
requested.  I also appreciate thoughts on how to debug this problem
given that it fully locks up the system.

I have posted this message to LKML and to the realtek ML, so reply
accordingly.

All the best,
R. Steve McKown
Titanium Mirror, Inc.


2. Recommended 100 Mbit ISA NIC

3. Computer freezes/hangs handling heavy network traffic

4. CERN httpd...

5. pppd 'hangs' sometimes on big downloads or heavy traffic - help!

6. 2.2.0-pre5 problem with ip_masq.c

7. Memory leak/kernel crash under heavy network traffic load

8. remote X with Xoftware Client ?

9. Network failure during heavy traffic

10. Machine crashes with heavy network traffic

11. Heavy network traffic causes Geode GX1 lockup

12. Urgent Question: I need to simulate heavy network traffic to test new server

13. System hang under heavy I/O