Sun GEM card looses TX on x86 32bit PCI

Post by Beezl » Tue, 12 Mar 2002 05:40:14



Hi David,

Unfortunately not. I've just applied these changes and recompiled, but
I'm suffering exactly the same problem.

This is what I have this time when the card has stopped receiving;

monkey:/home/andy# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:03:BA:04:5B:D7
          inet addr:10.0.0.12  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::203:baff:fe04:5bd7/10 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48508 errors:0 dropped:1 overruns:1 frame:68
          TX packets:49362 errors:0 dropped:0 overruns:0 carrier:1
          collisions:2 txqueuelen:100
          RX bytes:61058494 (58.2 MiB)  TX bytes:61988220 (59.1 MiB)
          Interrupt:5 Base address:0x8400

Cheers,

Beezly


>
> Let me know if this makes things any better:
>


Post by David S. Mille » Tue, 12 Mar 2002 09:50:09


What do the kernel logs say when the link is established?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 


Post by David S. Mille » Tue, 12 Mar 2002 10:50:10



   Date: 10 Mar 2002 20:36:59 +0000

   Unfortunately not. I've just applied these changes and recompiled, but
   I'm suffering exactly the same problem.

   This is what I have this time when the card has stopped receiving;

Please add this patch on top of what you have.  I still want the
kernel logs from the failure case that I asked for an hour ago
though, I really need to know if systems seeing this problem have
Pause enabled on the link or not.

--- drivers/net/sungem.c.~1~    Sun Mar 10 00:12:34 2002

                        gp->dev->name, rxmac_stat);

        if (rxmac_stat & MAC_RXSTAT_OFLW) {
-               u32 smac = readl(gp->regs + MAC_SMACHINE);
+               int limit;

-               printk(KERN_ERR "%s: RX MAC fifo overflow smac[%08x].\n",
-                      dev->name, smac);
                gp->net_stats.rx_over_errors++;
                gp->net_stats.rx_fifo_errors++;

-               if (((smac >> 24) & 0x7) == 0x7) {
-                       /* Due to a bug, the chip is hung in this case
-                        * and a full reset is necessary.
-                        */
+               /* Reset the RX MAC then re-enable it. */
+               writel(MAC_RXRST_CMD, gp->regs + MAC_RXRST);
+               for (limit = 0; limit < 5000; limit++) {
+                       if (!(readl(gp->regs + MAC_RXRST) & MAC_RXRST_CMD))
+                               break;
+                       udelay(10);
+               }
+               if (limit == 5000) {
+                       printk(KERN_ERR "%s: RX MAC will not reset, resetting whole "
+                              "chip.\n", dev->name);
                        ret = 1;
+                       goto out;
                }
+
+               writel(0, gp->regs + MAC_RXCFG);
+               for (limit = 0; limit < 5000; limit++) {
+                       if (!(readl(gp->regs + MAC_RXCFG) & MAC_RXCFG_ENAB))
+                               break;
+                       udelay(10);
+               }
+               if (limit == 5000) {
+                       printk(KERN_ERR "%s: RX MAC will not disable, resetting whole "
+                              "chip.\n", dev->name);
+                       ret = 1;
+                       goto out;
+               }
+
+               writel(gp->mac_rx_cfg | MAC_RXCFG_ENAB, gp->regs + MAC_RXCFG);
        }


        if (rxmac_stat & MAC_RXSTAT_LCE)
                gp->net_stats.rx_length_errors += 0x10000;

+out:
        /* We do not track MAC_RXSTAT_FCE and MAC_RXSTAT_VCE
         * events.

 static void gem_init_mac(struct gem *gp)
 {
        unsigned char *e = &gp->dev->dev_addr[0];
-       u32 rxcfg;

        if (gp->pdev->vendor == PCI_VENDOR_ID_SUN &&

        writel(0, gp->regs + MAC_AF21MSK);
        writel(0, gp->regs + MAC_AF0MSK);

-       rxcfg = gem_setup_multicast(gp);
+       gp->mac_rx_cfg = gem_setup_multicast(gp);

        writel(0, gp->regs + MAC_NCOLL);

         * them once a link is established.
         */
        writel(0, gp->regs + MAC_TXCFG);
-       writel(rxcfg, gp->regs + MAC_RXCFG);
+       writel(gp->mac_rx_cfg, gp->regs + MAC_RXCFG);
        writel(0, gp->regs + MAC_MCCFG);
        writel(0, gp->regs + MAC_XIFCFG);

        netif_stop_queue(dev);

        rxcfg = readl(gp->regs + MAC_RXCFG);
-       rxcfg_new = gem_setup_multicast(gp);
+       gp->mac_rx_cfg = rxcfg_new = gem_setup_multicast(gp);

        writel(rxcfg & ~MAC_RXCFG_ENAB, gp->regs + MAC_RXCFG);
        while (readl(gp->regs + MAC_RXCFG) & MAC_RXCFG_ENAB) {

Post by David S. Mille » Tue, 12 Mar 2002 11:10:09


I inadvertently left out this part of the patch, sorry.

--- drivers/net/sungem.h.~1~    Wed Jan 23 07:40:02 2002

        int                     mii_phy_addr;
        int                     gigabit_capable;

+       u32                     mac_rx_cfg;
+
        /* Autoneg & PHY control */
        int                     link_cntl;
        int                     link_advertise;

Post by Beezl » Tue, 12 Mar 2002 17:30:10


Hi David,


> What do the kernel logs say when the link is established?

I haven't had a chance to apply your most recent patch yet (I have to go
to work!), but without the patch...

Mar 10 20:26:48 monkey kernel: sungem.c:v0.96 11/17/01 David S. Miller

Mar 10 20:26:48 monkey kernel: PCI: Enabling device 00:0a.0 (0014 ->
0016)
Mar 10 20:26:48 monkey kernel: PCI: Found IRQ 5 for device 00:0a.0
Mar 10 20:26:48 monkey kernel: PCI: Sharing IRQ 5 with 00:0b.1
Mar 10 20:26:48 monkey kernel: eth0: Sun GEM (PCI) 10/100/1000BaseT
Ethernet 00:00:00:00:00:00
Mar 10 20:26:48 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:26:48 monkey kernel: eth0: Pause is disabled
Mar 10 20:26:48 monkey kernel: eth0: PCS AutoNEG complete.
Mar 10 20:26:48 monkey kernel: eth0: PCS link is now up.
Mar 10 20:26:48 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:26:48 monkey kernel: eth0: Pause is disabled
Mar 10 20:26:48 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:26:48 monkey kernel: eth0: Pause is disabled
Mar 10 20:26:48 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:26:48 monkey kernel: eth0: Pause is disabled
Mar 10 20:26:48 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.

<snip - it does this until the card decides to hang RX>

Mar 10 20:28:53 monkey kernel: eth0: Pause is disabled
Mar 10 20:28:54 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:28:54 monkey kernel: eth0: Pause is disabled
Mar 10 20:28:56 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:28:56 monkey kernel: eth0: Pause is disabled
Mar 10 20:28:57 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:28:57 monkey kernel: eth0: Pause is disabled
Mar 10 20:28:57 monkey kernel: eth0: RX MAC fifo overflow
smac[03910440].
Mar 10 20:28:58 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 10 20:28:58 monkey kernel: eth0: Pause is disabled
Mar 10 20:28:59 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.

I'll apply the patch today and send you the logs back.

Cheers,

Beezly


Post by Beezl » Tue, 12 Mar 2002 17:40:06


With both patches applied I get the same effect :(

On Mon, 2002-03-11 at 01:58, David S. Miller wrote:

> I inadvertently left out this part of the patch, sorry.

> --- drivers/net/sungem.h.~1~       Wed Jan 23 07:40:02 2002
> +++ drivers/net/sungem.h   Sun Mar 10 17:22:07 2002
> @@ -986,6 +986,8 @@
>    int                     mii_phy_addr;
>    int                     gigabit_capable;

> +  u32                     mac_rx_cfg;
> +
>    /* Autoneg & PHY control */
>    int                     link_cntl;
>    int                     link_advertise;

Here's the relevant output in /var/log/kern.log;

Mar 11 08:22:53 monkey kernel: sungem.c:v0.96 11/17/01 David S. Miller
(da...@redhat.com)
Mar 11 08:22:53 monkey kernel: PCI: Found IRQ 5 for device 00:0a.0
Mar 11 08:22:53 monkey kernel: PCI: Sharing IRQ 5 with 00:0b.1
Mar 11 08:22:53 monkey kernel: eth0: Sun GEM (PCI) 10/100/1000BaseT
Ethernet 00:00:00:00:00:00
Mar 11 08:22:56 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:22:56 monkey kernel: eth0: Pause is disabled
Mar 11 08:22:56 monkey kernel: eth0: PCS AutoNEG complete.
Mar 11 08:22:56 monkey kernel: eth0: PCS link is now up.
Mar 11 08:22:57 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:22:57 monkey kernel: eth0: Pause is disabled
Mar 11 08:22:58 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:22:58 monkey kernel: eth0: Pause is disabled
Mar 11 08:22:59 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:22:59 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:00 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:00 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:02 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:02 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:03 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:03 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:04 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:04 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:04 monkey kernel: eth0: no IPv6 routers present
Mar 11 08:23:05 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:05 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:06 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:06 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:08 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:08 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:09 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:09 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:10 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:10 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:11 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:11 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:12 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:12 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:14 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:14 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:15 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:15 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:16 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:16 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:17 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:17 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:18 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:18 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:20 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:20 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:21 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:21 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:22 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:22 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:23 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:23 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:24 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:24 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:26 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:26 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:27 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:27 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:28 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:28 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:29 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:29 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:30 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:30 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:32 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:32 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:33 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:33 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:34 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:34 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:35 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:35 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:36 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:36 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:38 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:38 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:39 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:39 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:40 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:40 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:41 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:41 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:42 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:42 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:44 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:44 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:45 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:45 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:46 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:46 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:47 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:47 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:48 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:48 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:50 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:50 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:51 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:51 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:52 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:52 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:53 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:53 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:54 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:54 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:56 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:56 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:57 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:57 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:58 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:58 monkey kernel: eth0: Pause is disabled
Mar 11 08:23:59 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:23:59 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:00 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:00 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:02 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:02 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:03 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:03 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:04 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:04 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:05 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:05 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:06 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:06 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:08 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:08 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:09 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:09 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:10 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:10 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:11 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:11 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:12 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:12 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:14 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:14 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:15 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:15 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:16 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:16 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:17 monkey kernel: eth0: Link is up at 1000 Mbps,
full-duplex.
Mar 11 08:24:17 monkey kernel: eth0: Pause is disabled
Mar 11 08:24:18 monkey ...


Post by David S. Mille » Tue, 12 Mar 2002 17:50:07



   Date: 11 Mar 2002 08:29:48 +0000

   Mar 11 08:22:57 monkey kernel: eth0: Link is up at 1000 Mbps, full-duplex.
   Mar 11 08:22:57 monkey kernel: eth0: Pause is disabled

Your switch doesn't support XON/XOFF pause? :(
That is the root cause for the RX overflows...

PLEASE STICK AROUND RIGHT NOW, I have new patches for you to test
and we can avoid the 24 hour turn around time for debugging this
if you don't disappear on me. :-)))

Post by Beezl » Tue, 12 Mar 2002 21:30:08


Hi David,

I managed to run the latest patch, but it appears to Oops when the
overflow condition occurs. Sadly I was not able to get the output of the
oops... but it was at exactly the same time that I was running my "test"
which causes the RX to halt.



Post by David S. Mille » Tue, 12 Mar 2002 21:30:10



   Date: 11 Mar 2002 12:19:24 +0000

   I managed to run the latest patch, but it appears to Oops when the
   overflow condition occurs. Sadly I was not able to get the output of the
   oops... but it was at exactly the same time that I was running my "test"
   which causes the RX to halt.

Duh, this will fix it:

--- drivers/net/sungem.c.~1~    Mon Mar 11 04:18:58 2002

        }

        /* Second, disable RX DMA. */
-       writel(0, RXDMA_CFG);
+       writel(0, gp->regs + RXDMA_CFG);
        for (limit = 0; limit < 5000; limit++) {
                if (!(readl(gp->regs + RXDMA_CFG) & RXDMA_CFG_ENABLE))
                        break;

Post by Beezl » Wed, 13 Mar 2002 03:50:07


Hi David,

Sorry it took so long for me to get back to you. Sadly it also hung with
this patch ;) I was unable to get an oops out of it (machine was
completely hosed and in X so I couldn't even note the oops on paper :(
).

Beezly



>    Date: 11 Mar 2002 12:19:24 +0000

>    I managed to run the latest patch, but it appears to Oops when the
>    overflow condition occurs. Sadly I was not able to get the output of the
>    oops... but it was at exactly the same time that I was running my "test"
>    which causes the RX to halt.

> Duh, this will fix it:

> --- drivers/net/sungem.c.~1~       Mon Mar 11 04:18:58 2002
> +++ drivers/net/sungem.c   Mon Mar 11 04:24:13 2002

>    }

>    /* Second, disable RX DMA. */
> -  writel(0, RXDMA_CFG);
> +  writel(0, gp->regs + RXDMA_CFG);
>    for (limit = 0; limit < 5000; limit++) {
>            if (!(readl(gp->regs + RXDMA_CFG) & RXDMA_CFG_ENABLE))
>                    break;


Post by David S. Mille » Wed, 13 Mar 2002 04:10:10



   Date: 11 Mar 2002 18:35:01 +0000

   Sorry it took so long for me to get back to you. Sadly it also hung with
   this patch ;) I was unable to get an oops out of it (machine was
   completely hosed and in X so I couldn't even note the oops on paper :(
   ).

So rerun the test not under X please?

Post by Beezl » Wed, 13 Mar 2002 06:30:10


David,

I've been looking some more at what my changes did.... it might be best
to completely ignore them ;) I had no clue what I was doing!

Cheers,

Beezly


> Hi David,

> It seems I fubar'd. I recompiled the module and ran it through the test
> again... no hang. It looks like I forgot to copy the new module into my
> /lib/modules/<blah>. Apologies for messing up there.

> Anyway... the new driver still drops packets after the initial RX
> overflow, so I had a poke around with it and I've seen some definite
> improvement by forcing the whole chip to reset when the RX overflows.

> My modifications to the driver are evil and I only intend them to be a
> test, but it helps to shed some extra light on what's going on.

> When the chip does a full reset I lose a whole load of packets, but I'm
> guessing this is normal :(

> Also, I can't remember where I read it, but the Extreme Summit 48 is
> supposed to support *receiving* the xon/xoff Pause stuff (I'm no expert
> in this area, so I could be talking complete twaddle!), no transmit
> capability though.

> Here's what I get out of the module when it resets (with my limit=5000
> mod);

> eth0: RX buffer overflowed - running rxmac_reset
> eth0: RX MAC resetting
> eth0: RX MAC *ONLY* reset
> eth0: RX MAC reset ok?
> eth0: RX MAC will not disable, resetting whole chip.
> eth0: PCS AutoNEG complete.
> eth0: PCS link is now up.

> Without the limit=5000, it appears that the module detects the RX
> section is "un-hung" when it isn't.

> Cheers,

> Beezly



> >    Date: 11 Mar 2002 18:35:01 +0000

> >    Sorry it took so long for me to get back to you. Sadly it also hung with
> >    this patch ;) I was unable to get an oops out of it (machine was
> >    completely hosed and in X so I couldn't even note the oops on paper :(
> >    ).

> > So rerun the test not under X please?

> ----

> --- sungem.c       Mon Mar 11 20:37:57 2002
> +++ sungem.c.testing       Mon Mar 11 20:31:12 2002

>    u64 desc_dma;
>    u32 val;

> +  printk(KERN_ERR "%s: RX MAC resetting\n", dev->name);
>    /* First, reset MAC RX. */
>    writel(gp->mac_rx_cfg & ~MAC_RXCFG_ENAB,
>           gp->regs + MAC_RXCFG);
> +  printk(KERN_ERR "%s: RX MAC *ONLY* reset\n", dev->name);
> +  
>    for (limit = 0; limit < 5000; limit++) {
> -          if (!(readl(gp->regs + MAC_RXCFG) & MAC_RXCFG_ENAB))
> +          if (!(readl(gp->regs + MAC_RXCFG) & MAC_RXCFG_ENAB)) {
> +                  printk(KERN_ERR "%s: RX MAC reset ok?\n", dev->name);
>                    break;
> +          }
>            udelay(10);
>    }
> +
> +  /* RX MAC reset doesn't appear to work so I force a whole reset */
> +  limit = 5000;
> +  
>    if (limit == 5000) {
>            printk(KERN_ERR "%s: RX MAC will not disable, resetting whole "
>                   "chip.\n", dev->name);

>                    break;
>            udelay(10);
>    }
> +
> +  limit=5000;
> +
>    if (limit == 5000) {
>            printk(KERN_ERR "%s: RX DMA will not disable, resetting whole "
>                   "chip.\n", dev->name);

>    if (rxmac_stat & MAC_RXSTAT_OFLW) {
>            gp->net_stats.rx_over_errors++;
>            gp->net_stats.rx_fifo_errors++;
> +          printk(KERN_DEBUG "%s: RX buffer overflowed - running rxmac_reset\n",
> +                  gp->dev->name);

>            ret = gem_rxmac_reset(gp);
>    }


Post by Beezl » Wed, 13 Mar 2002 08:00:16


Hi David,

Sorry about the mindless waffle earlier on - I'd just come in from work
and was suffering from brain death ;) I must remember to wait an hour or
so before touching a computer after I come in from work.

Ok, I've been fiddling around with the driver tonight and have managed
to get a little further by forcing the driver to do a full reset of the
chip when the RX buffer overflows. I achieved this by sticking a return
1; at the top of gem_rxmac_reset().

I'm guessing this isn't an "optimal" reset for the situation but so far
it's having /reasonable/ results (i.e. I don't have to bring the
interface up and down every 30 seconds!).

Here's the output of the ping;

monkey:/home/andy# ping -f -s 1472 shroom
PING shroom.beezly.org.uk (10.0.0.15) from 10.0.0.12 : 1472(1500) bytes
of
data.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
--- shroom.beezly.org.uk ping statistics ---
366892 packets transmitted, 366288 received, 0% loss, time 343836ms
rtt min/avg/max/mdev = 0.445/0.471/5.377/0.038 ms, pipe 2, ipg/ewma
0.937/0.465 ms

Hope this helps,

Beezly


> David,

> I've been looking some more at what my changes did.... it might be best
> to completely ignore them ;) I had no clue what I was doing!

> Cheers,

> Beezly


> > Hi David,

> > It seems I fubar'd. I recompiled the module and run it through the test
> > again... no hang. It looks like I forgot to copy the new module into my
> > /lib/modules/<blah>. Apologies for messing up there.

> > Anyway... the new driver still drops packets after the initial RX
> > overflow, so I had a poke around with it and I've seen some definate
> > improvement by forcing the whole chip to reset when the RX overflows.

> > My modifications to the driver are evil and I only intend them to be a
> > test, but it helps to shed some extra light on what's going on.

> > When the chip does a full reset I loose a whole load of packets, but I'm
> > guessing this is normal :(

> > Also, I can't remember where I read it, but the Extreme Summit 48 is
> > supposed to support *receiving* the xon/xoff Pause stuff (I'm no expert
> > in this area, so I could be talking complete twaddle!), no transmit
> > capability though.

> > Here's what I get out of the module when it resets (with my limit=500
0
> > mod);

> > eth0: RX buffer overflowed - running rxmac_reset
> > eth0: RX MAC resetting
> > eth0: RX MAC *ONLY* reset
> > eth0: RX MAC reset ok?
> > eth0: RX MAC will not disable, resetting whole chip.
> > eth0: PCS AutoNEG complete.
> > eth0: PCS link is now up.

> > Without the limit=5000, it appears that the module detects the RX
> > section is "un-hung" when it isn't.

> > Cheers,

> > Beezly



> > >    Date: 11 Mar 2002 18:35:01 +0000

> > >    Sorry it took so long for me to get back to you. Sadly it also hung with
> > >    this patch ;) I was unable to get an oops out of it (machine was
> > >    completely hosed and in X so I couldn't even note the oops on paper :(
> > >    ).

> > > So rerun the test not under X please?

> > ----

> > --- sungem.c  Mon Mar 11 20:37:57 2002
> > +++ sungem.c.testing  Mon Mar 11 20:31:12 2002

> >       u64 desc_dma;
> >       u32 val;

> > +     printk(KERN_ERR "%s: RX MAC resetting\n", dev->name);
> >       /* First, reset MAC RX. */
> >       writel(gp->mac_rx_cfg & ~MAC_RXCFG_ENAB,
> >              gp->regs + MAC_RXCFG);
> > +     printk(KERN_ERR "%s: RX MAC *ONLY* reset\n", dev->name);
> > +    
> >       for (limit = 0; limit < 5000; limit++) {
> > -             if (!(readl(gp->regs + MAC_RXCFG) & MAC_RXCFG_ENAB))
> > +             if (!(readl(gp->regs + MAC_RXCFG) & MAC_RXCFG_ENAB)) {
> > +                     printk(KERN_ERR "%s: RX MAC reset ok?\n", dev->name);
> >                       break;
> > +             }
> >               udelay(10);
> >       }
> > +
> > +     /* RX MAC reset doesn't appear to work so I force a whole reset */
> > +     limit = 5000;
> > +    
> >       if (limit == 5000) {
> >               printk(KERN_ERR "%s: RX MAC will not disable, resetting whole "
> >                      "chip.\n", dev->name);

> >                       break;
> >               udelay(10);
> >       }
> > +
> > +     limit=5000;
> > +
> >       if (limit == 5000) {
> >               printk(KERN_ERR "%s: RX DMA will not disable, resetting whole "
> >                      "chip.\n", dev->name);

> >       if (rxmac_stat & MAC_RXSTAT_OFLW) {
> >               gp->net_stats.rx_over_errors++;
> >               gp->net_stats.rx_fifo_errors++;
> > +             printk(KERN_DEBUG "%s: RX buffer overflowed - running rxmac_reset\n",
> > +                     gp->dev->name);

> >               ret = gem_rxmac_reset(gp);
> >       }
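
The polling loop in the patch quoted above uses the loop counter to tell success from timeout: leaving via break means limit < 5000, while running out of iterations leaves limit == 5000. Forcing limit = 5000 after the loop therefore always takes the full-reset path. A minimal user-space sketch of that pattern follows. The fake register, fake_readl() and poll_bit_clear() are hypothetical stand-ins for readl(gp->regs + MAC_RXCFG) and are not part of sungem.c.

```c
#include <stdint.h>

#define RXCFG_ENAB 0x1u   /* stand-in for MAC_RXCFG_ENAB */
#define POLL_LIMIT 5000   /* same bound as the loop in the patch */

/* Hypothetical simulated register: the enable bit clears after
 * `reads_until_clear` reads, or never if that value is negative. */
struct fake_reg {
    uint32_t value;
    int reads_until_clear;
};

static uint32_t fake_readl(struct fake_reg *r)
{
    if (r->reads_until_clear == 0)
        r->value &= ~RXCFG_ENAB;
    else if (r->reads_until_clear > 0)
        r->reads_until_clear--;
    return r->value;
}

/* Returns 0 if the bit cleared within POLL_LIMIT reads,
 * 1 if the caller should fall back to a full chip reset. */
static int poll_bit_clear(struct fake_reg *r)
{
    int limit;

    for (limit = 0; limit < POLL_LIMIT; limit++) {
        if (!(fake_readl(r) & RXCFG_ENAB))
            break;
        /* the real driver calls udelay(10) here */
    }
    return limit == POLL_LIMIT;
}
```

With a register that never clears, poll_bit_clear() returns 1, which corresponds to the "will not disable, resetting whole chip" branch in the log above.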


Sun GEM card looses TX on x86 32bit PCI

Post by David S. Mille » Thu, 14 Mar 2002 02:40:09



   Date: 11 Mar 2002 22:51:42 +0000

   Ok, I've been fiddling around with the driver tonight and have managed
   to get a little further by forcing the driver to do a full reset of the
   chip when the RX buffer overflows. I achieved this by sticking a return
   1; at the top of gem_rxmac_reset().

   I'm guessing this isn't an "optimal" reset for the situation but so far
   it's having /reasonable/ results (i.e. I don't have to bring the
   interface up and down every 30 seconds!).
 ...  
   Hope this helps,

I'll follow up on this and figure out why my RX reset code
isn't working after I finish up some 2.5.x work.

But looking quickly I think I see what is wrong.  Please give
this a try (and remember to remove your hacks before testing
this :-):

--- drivers/net/sungem.c.~1~    Mon Mar 11 04:24:13 2002

                rxd->status_word = cpu_to_le64(RXDCTRL_FRESH(gp));
        }
+       gp->rx_new = gp->rx_old = 0;

        /* Now we must reprogram the rest of RX unit. */
        desc_dma = (u64) gp->gblock_dvma;
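
One way to see why that one-line change could matter: if the chip starts filling the ring from descriptor 0 again after the reset (an assumption here, not something stated in this thread), software indices left over from before the reset point at the wrong slot. Below is a simplified user-space sketch of that situation. The ring size, the ownership bit, struct toy_ring and the helper names are all illustrative and are not the driver's real definitions.

```c
#include <stdint.h>

#define RING_SIZE 8              /* illustrative; not the driver's size */
#define DESC_OWN  (1ull << 63)   /* hypothetical "owned by hardware" bit */

struct toy_ring {
    uint64_t status[RING_SIZE];
    int rx_new;                  /* next slot software will inspect */
    int rx_old;
};

/* Hand every descriptor back to the hardware, as the RX reset does. */
static void ring_refresh(struct toy_ring *r, int reset_indices)
{
    for (int i = 0; i < RING_SIZE; i++)
        r->status[i] = DESC_OWN;
    if (reset_indices)
        r->rx_new = r->rx_old = 0;   /* the one-line fix */
}

/* Hardware model: a completed packet clears the ownership bit. */
static void hw_receive(struct toy_ring *r, int slot)
{
    r->status[slot] &= ~DESC_OWN;
}

/* Software side: is there a completed packet at rx_new? */
static int sw_packet_ready(const struct toy_ring *r)
{
    return !(r->status[r->rx_new] & DESC_OWN);
}
```

In this model, a ring refreshed without resetting the indices never sees the packet the hardware wrote into slot 0, because rx_new still points at the old position.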

 
 
 

Sun GEM card looses TX on x86 32bit PCI

Post by Beezl » Thu, 14 Mar 2002 05:40:08


Hi David,

This looks like it's working!

eth0      Link encap:Ethernet  HWaddr 00:03:BA:04:5B:D7
          inet addr:10.0.0.12  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::203:baff:fe04:5bd7/10 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:623264 errors:0 dropped:4 overruns:4 frame:4
          TX packets:501679 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:864388119 (824.3 MiB)  TX bytes:719041874 (685.7 MiB)
          Interrupt:5 Base address:0x8400

There are a few dropped packets, and the card misses a few incoming
immediately after a dropped packet although I guess this is it picking
itself up off the ground after RX has hung.

Also, regarding the missing MAC address on most architectures, I have it
on good authority that the last six digits of the MAC address are stored
in the Vital Product Data area on the PCI boards (presumably because the
first 6 are the "standard" SUN MAC prefix).

I had a quick fiddle around with pci_find_capability(<blah>,
PCI_CAP_ID_VPD), but it always returns 0 (i.e. no VPD). However, I've
also noticed that no-one appears to use that macro in any of the kernel
source. Is there another way to look at the VPD?
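
For reference, pci_find_capability() walks the capability list in PCI configuration space and returns the offset of the matching capability, or 0 when there is none. A rough user-space sketch of that walk over a 256-byte snapshot of config space is below. find_cap() and the buffer are hypothetical stand-ins, while the offsets and IDs are the standard PCI ones (status register at 0x06, capabilities pointer at 0x34, VPD capability ID 0x03).

```c
#include <stdint.h>

#define CFG_SIZE            256
#define PCI_STATUS          0x06   /* low byte of the status register */
#define PCI_STATUS_CAP_LIST 0x10   /* device implements a capability list */
#define PCI_CAPABILITY_LIST 0x34   /* offset of the first capability */
#define PCI_CAP_ID_VPD      0x03   /* Vital Product Data */

/* Hypothetical helper: returns the config-space offset of the
 * capability with the given ID, or 0 if it is not present. */
static int find_cap(const uint8_t cfg[CFG_SIZE], uint8_t cap_id)
{
    int ttl = 48;                  /* guard against looping lists */
    uint8_t pos;

    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;

    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
    while (pos >= 0x40 && ttl--) {
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xFC; /* follow the "next" pointer */
    }
    return 0;
}
```

A return of 0 here only means the device does not advertise that capability in its configuration space.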

Many Thanks,

Beezly



>    Date: 11 Mar 2002 22:51:42 +0000

>    Ok, I've been fiddling around with the driver tonight and have managed
>    to get a little further by forcing the driver to do a full reset of the
>    chip when the RX buffer overflows. I achieved this by sticking a return
>    1; at the top of gem_rxmac_reset().

>    I'm guessing this isn't an "optimal" reset for the situation but so far
>    it's having /reasonable/ results (i.e. I don't have to bring the
>    interface up and down every 30 seconds!).
>  ...  
>    Hope this helps,

> I'll follow up on this and figure out why my RX reset code
> isn't working after I finish up some 2.5.x work.

> But looking quickly I think I see what is wrong.  Please give
> this a try (and remember to remove your hacks before testing
> this :-):

> --- drivers/net/sungem.c.~1~       Mon Mar 11 04:24:13 2002
> +++ drivers/net/sungem.c   Tue Mar 12 09:30:38 2002

>            rxd->status_word = cpu_to_le64(RXDCTRL_FRESH(gp));
>    }
> +  gp->rx_new = gp->rx_old = 0;

>    /* Now we must reprogram the rest of RX unit. */
>    desc_dma = (u64) gp->gblock_dvma;


1. SUN GEM on 32bit x86 looses connectivity

Hi,

after sorting out the non-existent MAC address problem, I've hit another
road block.

The kernel is 2.4.19-pre1

The GEM card connects fine (although the multiple "Link is up" messages
might be interesting);


PCI: Found IRQ 5 for device 00:0a.0
PCI: Sharing IRQ 5 with 00:0b.1
eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:00:00:00:00:00
eth0: Link is up at 1000 Mbps, full-duplex.
eth0: PCS AutoNEG complete.
eth0: PCS link is now up.
eth0: Link is up at 1000 Mbps, full-duplex.
eth0: Link is up at 1000 Mbps, full-duplex.
eth0: Link is up at 1000 Mbps, full-duplex.
eth0: Link is up at 1000 Mbps, full-duplex.
eth0: Link is up at 1000 Mbps, full-duplex.
eth0: Link is up at 1000 Mbps, full-duplex.
eth0: Link is up at 1000 Mbps, full-duplex.

Everything appears to work fine; Ping works fine;

But after a short while (usually around a minute), the connection stops
working.

ifconfig shows this interesting output;

eth0      Link encap:Ethernet  HWaddr 00:10:5A:41:E6:14
          inet addr:10.0.0.12  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::210:5aff:fe41:e614/10 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5244 errors:0 dropped:1 overruns:1 frame:1
          TX packets:5252 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:7833582 (7.4 MiB)  TX bytes:7812928 (7.4 MiB)
          Interrupt:5 Base address:0x8400

notice the RX - dropped:1 overruns:1 frame:1

I suspect that only RX capability is lost, although I haven't been able
to check this yet. I believe this because if I ping from the box with
the GE card in to another box (which is stable), once the GEM stops
working, TX packets increases, whilst RX packets does not.
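
The check described above (the TX counter still rising while the RX counter stays flat) can be written as a comparison of two counter samples. This is only a minimal sketch. The struct and function names are hypothetical, and the counter values would have to be read from ifconfig or /proc/net/dev by other means.

```c
/* Hypothetical snapshot of the two packet counters of interest. */
struct if_counters {
    unsigned long rx_packets;
    unsigned long tx_packets;
};

/* Returns 1 when traffic was sent between the two samples but
 * nothing was received, which is the symptom described here. */
static int rx_looks_stalled(const struct if_counters *before,
                            const struct if_counters *after)
{
    unsigned long rx_delta = after->rx_packets - before->rx_packets;
    unsigned long tx_delta = after->tx_packets - before->tx_packets;

    return tx_delta > 0 && rx_delta == 0;
}
```

A quiet link with no traffic in either direction is deliberately not reported as stalled, since the test only makes sense while something such as ping is generating replies.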

I /think/ I can reproduce this problem quicker by doing a ping -f -s
1472 <somehost> from this host, although this is a purely qualitative
judgement.

The switch I am connecting to is an Extreme Summit 48 - which supports
the 802.1q VLAN protocol, if this has anything to do with the problem.

At the moment, my "workaround" is to ifdown the interface every 30
seconds, rmmod sungem and then modprobe sungem (auto-reconfiguring the
interface). This gets the interface going again, but means I lose
connectivity for about 2 seconds out of 30!

Any help gratefully appreciated,

Beezly

