NFS io errors on transfer from system running 2.4 to system running 2.5

Post by Michael Frank » Wed, 04 Jun 2003 13:20:09



Speaking of weird errors:

For the last few months I have been encountering this:

When doing rsync or cp _from_ a system running 2.4 _to_ a system running 2.5,
I get Input/output errors on random files.

- 2.5 > 2.4 is OK!
- After rebooting with the kernels swapped, the error follows the system running 2.4
- Fast machine > slow machine or slow machine > fast machine
  makes no difference
- Both systems run the same distribution
- Encountered since 2.4.20 with about 2.5.64 (my first 2.5 kernel)

Example:

/temp contains a couple of *files

System mhfl2 is running 2.5.6x to 2.5.70-mm3 and is mounted on
/mnt/mhfl2.

On system running 2.4.20 or 2.4.21-x:
  while ((1)); do cp -f /temp/* /mnt/mhfl2/temp; done

cp: cannot create regular file `/mnt/mhfl2/temp/blah': Input/output error
cp: writing `/mnt/mhfl2/temp/blah': Input/output error

Errors are random, so the affected files change on every run; sometimes there
are no errors, sometimes there are 3 errors.
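
Because the failures are sporadic, it may help to count them over a fixed number of passes instead of watching an endless loop. The following is only a sketch of that idea: the function name count_copy_errors and the pass count are hypothetical, and the paths in the usage comment simply mirror the ones in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): copy every regular file
# in SRC to DST for RUNS passes and print how many cp invocations failed.
count_copy_errors() {
    src=$1
    dst=$2
    runs=$3
    errors=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        for f in "$src"/*; do
            # Skip directories and an unmatched glob
            [ -f "$f" ] || continue
            # cp exits non-zero on any failure, including "Input/output error"
            if ! cp -f "$f" "$dst"/ 2>/dev/null; then
                errors=$((errors + 1))
            fi
        done
        i=$((i + 1))
    done
    echo "$errors"
}

# Example (assumed paths): count_copy_errors /temp /mnt/mhfl2/temp 100
```

Running the same number of passes in each direction (2.4 to 2.5 and 2.5 to 2.4) would give comparable error counts for the two cases.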

Q? Any (in)compatibility reason or should I investigate further?

Regards
Michael Frank

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://www.veryComputer.com/
Please read the FAQ at  http://www.veryComputer.com/

 
 
 


Post by Michael Frank » Wed, 04 Jun 2003 14:50:10



> > When doing rsync or cp _from_ system running 2.4 _to_ system running 2.5
> > get Input/output error errors with random files.

> Do you use soft mounts?

Yes

> If so, try hard instead. soft will fail, sooner or later.

I don't like hard mounts because they do not time out.

Also, this does not explain why 2.5 > 2.4 (and 2.4 > 2.4) is OK - _never_ had any problem  

Regards
Michael


Post by Jakob Oestergaard » Wed, 04 Jun 2003 15:00:12




> > > When doing rsync or cp _from_ system running 2.4 _to_ system running 2.5
> > > get Input/output error errors with random files.

> > Do you use soft mounts?

> Yes

Then this is why you get the error

> > If so, try hard instead. soft will fail, sooner or later.

> I don't like hard mounts because these do not timeout.

You get what you ask for, then:  timeouts

> Also, this does not explain why 2.5 > 2.4 (and 2.4 > 2.4) is OK - _never_ had any problem  

Leave it running for a million years, and I'm sure a sporadic error will
show up in those two situations as well.

You just now found a case where sporadic errors show up more often.

soft-mount = fail upon (sporadic) error
hard-mount = retry (forever or until interrupted if used with 'intr') upon error

I always use hard,intr so that I can manually interrupt* jobs,
but also know that they do not randomly fail just because a few packets
get dropped on my network.  This seems to be the common setup, as far as
I know.
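
As a sketch of the two policies described above, the helper below only prints the mount command for each; it does not run it. The function name nfs_mount_cmd is hypothetical, and the export mhfl2:/ in the usage comment is an assumption, since the thread never states the exported path. The options hard, soft and intr are the NFS mount options named in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) a mount command for the
# chosen failure policy. Arguments: policy, remote export, mount point.
nfs_mount_cmd() {
    policy=$1
    remote=$2
    mountpoint=$3
    case $policy in
        # hard: keep retrying; intr allows a signal to interrupt the wait
        hard) opts="hard,intr" ;;
        # soft: give up after the retries and return an I/O error to the caller
        soft) opts="soft" ;;
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    echo "mount -t nfs -o $opts $remote $mountpoint"
}

# Example (assumed export path): nfs_mount_cmd hard mhfl2:/ /mnt/mhfl2
```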

Cheers,

--
................................................................

:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

Post by Michael Frank » Wed, 04 Jun 2003 15:10:09



> I always use hard,intr so that I can manually interrupt* jobs,
> but also know that they do not randomly fail just because a few packets
> get dropped on my network.  This seems to be the common setup, as far as
> I know.

Thank you,

I will try hard, intr

Regards
Michael


Post by Jakob Oestergaard » Wed, 04 Jun 2003 15:30:21




> > I always use hard,intr so that I can manually interrupt* jobs,
> > but also know that they do not randomly fail just because a few packets
> > get dropped on my network.  This seems to be the common setup, as far as
> > I know.

> Thank you,

> I will try hard, intr

no prob.

Please let the list know if it solves your problem or not - I'm sure
there are people who want to know if it doesn't, and if it does then the
solution will be in the archives for the next to find.

After all, I could be mistaken...  naaahh...   ;)

Cheers,

--
................................................................

:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

Post by Andrew Rya » Wed, 04 Jun 2003 16:50:12



> Speaking of weird errors:

> For the last few months I encounter this:

> When doing rsync or cp _from_ system running 2.4 _to_ system running 2.5
> get Input/output error errors with random files.

> - Encountered since 2.4.20 with about 2.5.64 (my first 2.5 kernel)

I am having a similar problem writing to an NFS-mounted non-Linux system on
kernels past 2.4.20-pre3.  I get an input/output error while writing.  I
sent email to Trond Myklebust (who made the changes between pre3 and
pre4), and he said to switch to using the TCP protocol for mounts.  That
worked, but I should not have to do that because

1. It worked up to 2.4.20-pre3 without a problem
2. Other OSes such as FreeBSD do not have issues writing to other OSes using
UDP soft mounts.

To me, there is something wrong with the changes that went into 2.4.20-pre4;
it should work as it does in pre3 and/or other Unix OSes such as FreeBSD.
We should not have to work around the problem with hard mounts or by using TCP
instead of UDP.
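
When comparing UDP against TCP or soft against hard, it can be useful to confirm which options a mount is actually using. The sketch below reads the options field for an NFS mount point from /proc/mounts. The function name nfs_mount_opts is hypothetical, the optional second argument exists only so the function can be pointed at a saved copy of the file, and the exact option names shown there vary between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: print the mount options recorded for an NFS mount.
# /proc/mounts fields: device, mount point, fs type, options, dump, pass.
nfs_mount_opts() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '$2 == mp && $3 ~ /^nfs/ { print $4 }' "$mounts_file"
}

# Example (mount point from this thread): nfs_mount_opts /mnt/mhfl2
# A remount over TCP would then be requested with "-o tcp" in the mount options.
```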

Andy

Post by Michael Frank » Wed, 04 Jun 2003 17:20:07




> > Speaking of weird errors:

> > For the last few months I encounter this:

> > When doing rsync or cp _from_ system running 2.4 _to_ system running 2.5
> > get Input/output error errors with random files.

> > - Encountered since 2.4.20 with about 2.5.64 (my first 2.5 kernel)

> I am having a similar problem writing to NFS mounted non-linux system on
> kernels past 2.4.20-pre3.  I get an input/output error while writing.  I
> have sent email to Trond Myklebust (who made the changes between pre3 and
> pre4).  And he said to switch to using the TCP protocol for mounts.  That
> worked, but I should not have to do that because

> 1. It worked to 2.4.20pre3 without a problem
> 2. Other OSes such as FreeBSD do not have issues writing to other OSes
> using UDP soft mounts.

> To me, there is something wrong with the changes that went in in
> 2.4.20pre4, it should work as it does in pre3 and/or other unix OSes such
> as FreeBSD. We should not have to work around the problem with hard mounts
> or using TCP instead of UDP.



too.

Even if I run them both against each other, the error only happens on 2.4 and
the frequency does not increase. This is not a simple timeout problem.

I'll think it through and build some scripts so everyone can reproduce it
and test it out.

Regards
Michael


Post by Trond Myklebust » Wed, 04 Jun 2003 18:20:08


     > To me, there is something wrong with the changes that went in
     > in 2.4.20pre4, it should work as it does in pre3 and/or other
     > unix OSes such as FreeBSD. We should not have to work around
     > the problem with hard mounts or using TCP instead of UDP.

Tough. 'soft' is not a priority of mine. It is a broken feature...

Cheers,
  Trond

Post by Michael Frank » Wed, 04 Jun 2003 18:40:21



> Tough. 'soft' is not a priority of mine. It is a broken feature...

Well, a "hard" fact of life is that it can't be "soft"; at least we know where we stand...

Regards
Michael


Post by Andrew Rya » Thu, 05 Jun 2003 23:10:08



> Tough. 'soft' is not a priority of mine. It is a broken feature...

No, it is a broken feature in *LINUX* post 2.4.20pre3, it's not broken in
FreeBSD or Tru64.  Regardless of what Trond says about soft mounts they
should work in Linux just as well as they do in other OSes, such as FreeBSD.

I've tried to debug and I have seen no timeouts.  I believe something is up
with the congestion routines that were added.
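
One way to check for timeouts and retransmissions is to compare the client RPC counters before and after a failing copy. The sketch below assumes the usual layout of /proc/net/rpc/nfs, where the line starting with "rpc" carries the call count followed by the retransmission count. The function name rpc_retrans is hypothetical, and the optional argument is only there to read a saved copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: print the client RPC call and retransmission counters.
# Assumes the "rpc" line is laid out as: rpc <calls> <retrans> <authrefresh>
rpc_retrans() {
    stats=${1:-/proc/net/rpc/nfs}
    awk '$1 == "rpc" { print "calls=" $2 " retrans=" $3 }' "$stats"
}

# Example: run rpc_retrans, repeat the failing cp, then run rpc_retrans again.
```

If the retransmission counter does not move while cp still reports Input/output errors, that would be consistent with the claim that plain timeouts are not the cause.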

Yes, hard mounts work.  But so should soft ones.  Linux should not have a
broken NFS.

Andy


Post by Michael Frank » Tue, 17 Jun 2003 03:10:07





> > > I always use hard,intr so that I can manually interrupt* jobs,
> > > but also know that they do not randomly fail just because a few packets
> > > get dropped on my network.  This seems to be the common setup, as far
> > > as I know.

> > Thank you,

> > I will try hard, intr

> no prob.

> Please let the list know if it solves your problem or not - I'm sure
> there are people who want to know if it doesn't, and if it does then the
> solution will be in the archives for the next to find.

> After all, I could be mistaken...  naaahh...   ;)

I have tested mounting NFS partitions in hard,intr mode and transferred
kernel BitKeeper repos between systems running combinations of recent
2.4 and 2.5 kernels, and also did bk resync and bk resolve via the network.

It is working dependably and I won't touch soft mounting mode again ...

Regards
Michael

--
Powered by linux-2.5.70-mm3, compiled with gcc-2.95-3 because it's rock solid

My current linux related activities in rough order of priority:
- Testing of Swsusp for 2.4
- Learning 2.5 kernel debugging with kgdb - it's in the -mm tree
- Studying 2.5 serial and ide drivers, ACPI, S3

The 2.5 kernel could use your usage. More info on setting up 2.5 kernel at
http://www.veryComputer.com/

