nfs stalls - solved, but why?

Post by Frank Benoi » Sun, 22 Jun 2003 21:37:27



I had this problem:

-I can mount the NFS filesystem on the client.
-I can navigate (cd, ls) through the directories on the NFS
filesystem
-When I try to either read or write a file (larger than a few bytes)
on the NFS filesystem, the process (not the system) hangs indefinitely
(well, I only waited for 5 minutes...).
-The only way to resume operation on the hung console is to do,
from another virtual console, a kill -9 on the rpciod
process running on the client.

The solution or workaround was:
mount options rsize=1024,wsize=1024

But what does that mean? Why does it work now?

Frank

 
 
 

Post by Alle » Mon, 23 Jun 2003 00:47:12



> The solution or workaround was:
> mount options rsize=1024,wsize=1024

> But what does that mean? Why does it work now?

> Frank

Something about MTAs, perhaps.

 
 
 

Post by Horst Knobloc » Mon, 23 Jun 2003 01:56:42



> The solution or workaround was:
> mount options rsize=1024,wsize=1024

> But what does that mean? Why does it work now?

I guess the client and/or the server generate IP packets that
are too large and get dropped because of a smaller MTU
somewhere along the path.
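To put rough numbers on this guess: assuming NFS runs over UDP on IPv4 (an assumption, the thread does not say which transport is in use), each read or write RPC travels as one UDP datagram, and a datagram larger than the MTU is split into several IP fragments. The sketch below counts those fragments for a few rsize/wsize values. The function name and the 150-byte figure for RPC and NFS headers are illustrative assumptions, not measured values.

```python
import math

IP_HEADER = 20      # IPv4 header without options
UDP_HEADER = 8
RPC_OVERHEAD = 150  # assumption: rough size of the RPC + NFS headers


def ip_fragments(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IPv4 packets needed to carry one UDP datagram."""
    datagram = UDP_HEADER + udp_payload   # bytes handed to the IP layer
    last = mtu - IP_HEADER                # capacity of the final (or only) packet
    per_fragment = last // 8 * 8          # non-final fragments carry a multiple of 8 bytes
    if datagram <= last:
        return 1
    return 1 + math.ceil((datagram - last) / per_fragment)


for size in (1024, 4096, 8192):
    print(size, ip_fragments(size + RPC_OVERHEAD))
```

Under these assumptions and a 1500-byte MTU, a 1024-byte transfer fits in a single packet, while 4096 bytes needs 3 packets and 8192 bytes needs 6. A transfer size of 1024 therefore keeps every request and reply inside one unfragmented Ethernet frame.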

Check your MTU (Maximum Transmission Unit) sizes via ifconfig.
All boxes attached to the same (LAN) network should use
the same MTU size. If you communicate with your NFS server
via one or more routers, make sure that Path MTU discovery
works properly.
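For more than a couple of machines, the comparison can be scripted. The following is a minimal sketch that pulls the MTU of each interface out of old-style ifconfig output (the "MTU:1500" format). The function name and the sample text are made up for illustration, and newer ifconfig versions print the MTU in a different format that this pattern would not match.

```python
import re


def mtus_from_ifconfig(text: str) -> dict:
    """Map interface name -> MTU, parsed from old-style ifconfig output."""
    result = {}
    iface = None
    for line in text.splitlines():
        # An interface block starts with the name in the first column.
        if line and not line[0].isspace():
            iface = line.split()[0]
        match = re.search(r"MTU:(\d+)", line)
        if match and iface:
            result[iface] = int(match.group(1))
    return result


# Hypothetical sample in the same layout that ifconfig printed in 2003.
SAMPLE = """\
eth0      Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
"""

print(mtus_from_ifconfig(SAMPLE))
```

Running the function on the output collected from each host and comparing the values for the shared interface shows at a glance whether any machine deviates.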

For Path MTU discovery, a router that cannot forward a packet
because it is too large generates an ICMP Destination
Unreachable error ("fragmentation needed but not allowed").
You need to make sure that these ICMP error packets make it
through along the path from NFS client to server and vice versa,
so that client and server know to send smaller packets. Also
make sure that Path MTU discovery is enabled on the Linux box:
echo 0 > /proc/sys/net/ipv4/ip_no_pmtu_disc
(This should be enabled by default, but who knows.)
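The current setting can also be read back rather than overwritten. This is a small sketch that reads the flag file named above and treats a value of 0 as "discovery enabled". The function name is made up for illustration, and the check is skipped on systems where the file does not exist.

```python
from pathlib import Path

PMTU_FLAG = "/proc/sys/net/ipv4/ip_no_pmtu_disc"


def pmtu_discovery_enabled(path: str = PMTU_FLAG) -> bool:
    """True when the flag file contains 0, i.e. discovery is not disabled."""
    return Path(path).read_text().strip() == "0"


if Path(PMTU_FLAG).exists():
    print("Path MTU discovery enabled:", pmtu_discovery_enabled())
```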

My guess is that you have a firewall along the path which
drops these ICMP packets. So packets are sent that are too
large for an MTU somewhere along the path, and they are
therefore dropped.

HTH

Ciao, Horst
--
"When pings go wrong (It hurts me too)" E.Clapton/E.James/P.Tscharn

 
 
 

Post by Frank Benoi » Mon, 23 Jun 2003 02:17:14


Ethereal shows that the server really is sending the packets, and
retransmitting them, but the client does not seem to "hear" them. So your
assumption seems to be right.

I checked this, but ifconfig shows the same MTU values on both machines. The
computers are directly connected through a hub. Strange.

*** Server ***********************************
eth0      Link encap:Ethernet  HWaddr xxxxxxxxxxxxxxxxxxxx
          inet addr:192.168.0.24  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1672910 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1698915 errors:0 dropped:0 overruns:0 carrier:0
          collisions:14784 txqueuelen:100
          RX bytes:1743808716 (1.6 GiB)  TX bytes:365109935 (348.1 MiB)
          Interrupt:9 Base address:0xb400 Memory:e4800000-e4800038

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1090 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1090 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:66567 (65.0 KiB)  TX bytes:66567 (65.0 KiB)

*** client ***********************************
eth0      Link encap:Ethernet  HWaddr xxxxxxxxxxxxxxxxxxxxxx
          inet addr:192.168.0.126  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3696 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2853 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:5 Base address:0x300

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:14 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

 
 
 

Post by Frank Benoi » Mon, 23 Jun 2003 02:20:08


> Something about MTAs. perhaps.

Sorry, what is an MTA?
 
 
 

Post by Alle » Mon, 23 Jun 2003 02:43:20



>> Something about MTAs. perhaps.

> Sorry, what is a MTA?

It is an MTU at 6:00 AM.
 
 
 

Post by Horst Knobloc » Mon, 23 Jun 2003 03:41:59



> Ethereal shows that the server is really sending the packets, and repeats
> them. But the client seems not to "hear" them. So, your assumption seems
> to be right.

> I checked this, but ifconfig displayed the same values. The computers are
> directly connected by a hub. Strange.

Then it is not an MTU-related problem after all. If you have a
packet filter running on the server or the client, try switching
it off for testing purposes.

Ciao, Horst
--
"When pings go wrong (It hurts me too)" E.Clapton/E.James/P.Tscharn

 
 
 

Post by James Knot » Mon, 23 Jun 2003 06:36:43



>> Something about MTAs. perhaps.

> Sorry, what is a MTA?

Wasn't there a Kingston Trio song about the MTA?  ;-)

--

Fundamentalism is fundamentally wrong.


james.knott.

 
 
 
