Hi,
I'm working on my backup regimen, and when I boot into my rescue
environment the networking eventually dies and I get hundreds of
thousands of collisions (over 500,000) when doing network
operations. When I perform the same operation in the client's
normal installed Linux environment it works fine and I rarely get
collisions.
I have two Linux boxes. The server has the tape drive and runs a
Mandrake 2.2.17-21mdk kernel. The client machine runs a Red Hat
2.4.18-mywin4lin-6 kernel. I back up the client to the server's
tape drive while the full-blown Red Hat distribution is running,
and that works great. Next I boot into the rescue environment so
I can try the restore. The restore starts but eventually it quits
responding. If I kill the backup process at that point, the
client console works normally; it isn't locked up. At that point
the server will not respond to pings.
I tried booting into a rescue environment using both Mindi and
Linux rescue from the first Red Hat CD. In either one I can
transfer some data, but eventually the operation simply stops.
When I tried ftp in the rescue environment, instead of making
steady progress the transfer seemed to come in bursts.
I know collisions are the result of two devices trying to talk at
the same time. What I can't figure out is why this happens only
in the rescue environment. I also don't know why there is
contention at all, given that I only have two active devices. I
don't know whether the collisions are the problem per se or just
the symptom I happen to notice.
Both Ethernet NICs are 10/100 and the hub is 10/100. Both NICs
are operating in 100Mb mode. The server NIC is an Intel Pro
10/100. The client NIC is a VIA (on an Amptron MB). The hub is a
Netgear.
The machines are connected by RJ-45 network cables that are less
than 5 ft long each. There are a total of four machines on this
"network" with only these two actively being used. The others
are plugged in and running but with no or minimal network
activity.
I can rlogin to the server and that works but the response is
sluggish while doing the restore.
I tried doing a ping while transferring a file and the results
look like this:
64 bytes from 192.168.1.1: icmp_seq=35 ttl=255 time=3080 ms
64 bytes from 192.168.1.1: icmp_seq=36 ttl=255 time=2081 ms
64 bytes from 192.168.1.1: icmp_seq=37 ttl=255 time=1081 ms
64 bytes from 192.168.1.1: icmp_seq=38 ttl=255 time=81.1 ms
64 bytes from 192.168.1.1: icmp_seq=39 ttl=255 time=4.05 ms
64 bytes from 192.168.1.1: icmp_seq=40 ttl=255 time=0.203 ms
64 bytes from 192.168.1.1: icmp_seq=41 ttl=255 time=0.215 ms
64 bytes from 192.168.1.1: icmp_seq=42 ttl=255 time=2066 ms
64 bytes from 192.168.1.1: icmp_seq=43 ttl=255 time=1065 ms
64 bytes from 192.168.1.1: icmp_seq=44 ttl=255 time=67.0 ms
64 bytes from 192.168.1.1: icmp_seq=45 ttl=255 time=2.57 ms
64 bytes from 192.168.1.1: icmp_seq=46 ttl=255 time=1.53 ms
64 bytes from 192.168.1.1: icmp_seq=47 ttl=255 time=0.348 ms
64 bytes from 192.168.1.1: icmp_seq=48 ttl=255 time=2.48 ms
64 bytes from 192.168.1.1: icmp_seq=49 ttl=255 time=0.316 ms
--- 192.168.1.1 ping statistics ---
63 packets transmitted, 61 received, 3% loss, time 62285ms
rtt min/avg/max/mdev = 0.185/915.554/3180.369/1045.451 ms, pipe 4
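One way to read the ping output above: the round-trip times step down by about 1000 ms per reply (3080, 2081, 1081, 81.1), which would fit replies being held up and then arriving together. The sketch below is only an illustration of that reading, not a diagnosis. It assumes ping was sending at its default one-second interval (the post does not say), and the function names and the 50 ms grouping window are hypothetical choices made for this example.

```python
import re

# A subset of the reply lines quoted above.
PING_LINES = """\
64 bytes from 192.168.1.1: icmp_seq=35 ttl=255 time=3080 ms
64 bytes from 192.168.1.1: icmp_seq=36 ttl=255 time=2081 ms
64 bytes from 192.168.1.1: icmp_seq=37 ttl=255 time=1081 ms
64 bytes from 192.168.1.1: icmp_seq=38 ttl=255 time=81.1 ms
64 bytes from 192.168.1.1: icmp_seq=39 ttl=255 time=4.05 ms
64 bytes from 192.168.1.1: icmp_seq=40 ttl=255 time=0.203 ms
64 bytes from 192.168.1.1: icmp_seq=41 ttl=255 time=0.215 ms
64 bytes from 192.168.1.1: icmp_seq=42 ttl=255 time=2066 ms
64 bytes from 192.168.1.1: icmp_seq=43 ttl=255 time=1065 ms
64 bytes from 192.168.1.1: icmp_seq=44 ttl=255 time=67.0 ms
64 bytes from 192.168.1.1: icmp_seq=45 ttl=255 time=2.57 ms
"""


def arrival_times(text, interval_s=1.0):
    """Return (seq, estimated arrival time in seconds) per reply line.

    Assumption: request N was sent at N * interval_s seconds.
    """
    arrivals = []
    for m in re.finditer(r"icmp_seq=(\d+) .*?time=([\d.]+) ms", text):
        seq = int(m.group(1))
        rtt_s = float(m.group(2)) / 1000.0
        arrivals.append((seq, seq * interval_s + rtt_s))
    return arrivals


def stalled_groups(arrivals, window_s=0.05):
    """Group consecutive replies whose estimated arrivals are within window_s."""
    groups = []
    current = []
    for item in arrivals:
        if current and abs(item[1] - current[-1][1]) <= window_s:
            current.append(item)
        else:
            if len(current) > 1:
                groups.append([seq for seq, _ in current])
            current = [item]
    if len(current) > 1:
        groups.append([seq for seq, _ in current])
    return groups


if __name__ == "__main__":
    for group in stalled_groups(arrival_times(PING_LINES)):
        print("replies that appear to have arrived together:", group)
```

Under that one-second assumption, replies 35 to 38 and replies 42 to 44 each land within a few milliseconds of one another, so the link looks like it stalls for a few seconds at a time rather than being uniformly slow.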
I manually configure my devices on the client and here is how my
eth0 device looks on the client after I set it up:
eth0      Link encap:Ethernet  HWaddr 00:07:95:44:4D:E8
          inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:882 (882.0 b)  TX bytes:0 (0.0 b)
          Interrupt:5 Base address:0xdc00

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
Here is the eth0 from the server.
eth0      Link encap:Ethernet  HWaddr 00:D0:B7:AF:2C:14
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1020171 errors:0 dropped:0 overruns:0 frame:1
          TX packets:1088683 errors:0 dropped:0 overruns:0 carrier:0
          collisions:575159 txqueuelen:100
          Interrupt:11 Base address:0xd800

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:90251 errors:0 dropped:0 overruns:0 frame:0
          TX packets:90251 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
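To put the server's collision counter in proportion, it can be compared with the number of packets transmitted. The snippet below is a minimal sketch that pulls both counters out of the ifconfig text quoted above. The function name is hypothetical, and collisions per transmitted packet is only a rough indicator, since a single packet can collide more than once before it gets through.

```python
import re

# Counter lines copied from the server's eth0 output above.
SERVER_ETH0 = """\
RX packets:1020171 errors:0 dropped:0 overruns:0 frame:1
TX packets:1088683 errors:0 dropped:0 overruns:0 carrier:0
collisions:575159 txqueuelen:100
"""


def collision_ratio(ifconfig_text):
    """Return collisions divided by TX packets (0.0 if nothing was sent)."""
    tx = int(re.search(r"TX packets:(\d+)", ifconfig_text).group(1))
    collisions = int(re.search(r"collisions:(\d+)", ifconfig_text).group(1))
    return collisions / tx if tx else 0.0


if __name__ == "__main__":
    print(f"collisions per TX packet: {collision_ratio(SERVER_ETH0):.1%}")
```

With the figures above this works out to roughly one collision for every two packets the server has sent.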
I've tried running the diags from
http://www.scyld.com/diag/index.html and as far as I can tell
things are good.
I haven't tried putting another NIC into one of the available PCI
slots, mainly because it involves futzing with an otherwise
working hardware and kernel configuration.
I would appreciate any advice or troubleshooting suggestions.
Thanks,
Gary