NFS timeouts (Solaris 9 and Solaris 10)


Post by Carsten Beneck » Fri, 11 Nov 2005 19:31:31



Hi,

we have problems with NFS mounts. Large directory listings run into NFS
timeouts, while transfers of small amounts of data still work. We have
seen this problem on Solaris 9 and Solaris 10 clients, while Solaris 8
still works fine (as expected).

The server runs AIX 4.3.3.0 (oslevel)

Solaris 8 works fine:

SunOS some-client 5.8 Generic_117350-27 sun4u sparc SUNW,Sun-Blade-100

some-server.rrz.uni-hamburg.de:/local /mnt

drwxr-s---    3 root     sys           512 Mär  1  2001 /mnt/wwwcache

     4205

Solaris 9:

SunOS other-client 5.9 Generic_118558-16 sun4u sparc SUNW,Sun-Fire-V440

some-server.rrz.uni-hamburg.de:/local /mnt

drwxr-s---    3 root     sys           512 Mar  1  2001 /mnt/wwwcache

(no response)
^C

(unmount fails)

Solaris 10:

SunOS other-client 5.10 Generic_118822-19 sun4u sparc SUNW,Sun-Blade-100

some-server.rrz.uni-hamburg.de:/local /mnt

drwxr-s---    3 root     sys           512 Mar  1  2001 /mnt/wwwcache

NFS server some-server.rrz.uni-hamburg.de not responding still trying

(minutes later, still no response)
^C


(unmount succeeds)

Any comments?

Regards
   Carsten

 
 
 


Post by Oscar del Ri » Fri, 11 Nov 2005 23:44:48




> some-server.rrz.uni-hamburg.de:/local /mnt

I don't know if that's the reason but all the "-o" options should
be separated with commas

-o proto=tcp,vers=3

instead of

-o proto=tcp -o vers=3

It seems only the last "-o" option takes effect (ignoring proto=tcp)

For example (not that these are recommended options, just a test):

# mount -r -F nfs -o proto=tcp,hard,forcedirectio srv:/fs /mnt
-> remote/read only/setuid/devices/proto=tcp/hard/forcedirectio
(all options recognized)

# mount -r -F nfs -o proto=tcp -o hard -o forcedirectio srv:/fs /mnt
-> remote/read only/setuid/devices/forcedirectio
(tcp and hard options ignored, only forcedirectio took effect)

Bug or feature?
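
For reference, a minimal sketch of the comma-separated form described above. The helper name join_opts is made up for illustration, and the script only prints the mount command instead of running it; srv:/fs and /mnt are the placeholder names from the test above.

```shell
#!/bin/sh
# Join any number of mount options into one comma-separated list,
# so that a single -o flag carries all of them.
# join_opts is a hypothetical helper, not part of mount(1M).
join_opts() {
    joined=""
    for opt in "$@"; do
        if [ -z "$joined" ]; then
            joined="$opt"
        else
            joined="$joined,$opt"
        fi
    done
    printf '%s\n' "$joined"
}

# Dry run: print the command rather than executing it.
opts=$(join_opts proto=tcp vers=3 hard)
echo "mount -r -F nfs -o $opts srv:/fs /mnt"
```

After a real mount, nfsstat -m lists the options that actually took effect for each NFS mount, which is a quick way to confirm that proto and vers were not silently dropped.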

 
 
 


Post by Carsten Beneck » Sat, 12 Nov 2005 00:49:57


Hmm,

you are right, proto=tcp had been ignored, but -o proto=tcp,vers=3 does
not make a difference. I still run into timeouts.

On the other hand, -o proto=udp,vers=3 now succeeds on both Sol 9 and Sol
10...

So maybe tcp-transport is the default; but it should work with either
tcp or udp, shouldn't it?

Regards
   Carsten





 
 
 


Post by Sami Ketol » Sat, 12 Nov 2005 05:38:38



> Hmm,

> you are right, proto=tcp had been ignored, but -o proto=tcp,vers=3  does
> not make a difference. I still run into timeouts.

> On the other hand, -o proto=udp,vers=3 now succeeds on both Sol 9 and Sol
> 10...

> So maybe tcp-transport is the default; but it should work with either
> tcp or udp, shouldn't it?

Yes. It's supposed to work with both tcp and udp. Without more data it's still
impossible to say whether the problem is on the server or the client.
Please try snooping the network traffic while running ls to get a better idea
of where it goes wrong.

Sami

--
  .signature: no such file or directory
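
A rough sketch of how such a capture could be taken on the client. The interface name ce0 and the capture file path are assumptions, not values from the thread; the script only prints the two snoop commands so they can be run by hand, because the capture runs until it is interrupted.

```shell
#!/bin/sh
# Placeholders: adjust to the local setup.
SERVER=some-server.rrz.uni-hamburg.de   # server name as used in the thread
IFACE=ce0                               # assumed network interface
CAPTURE=/tmp/nfs-ls.snoop               # assumed capture file

# Step 1: capture all traffic to and from the server into a file.
# Start it, run the failing ls in another terminal, then stop it with ^C.
capture_cmd() {
    echo "snoop -d $IFACE -o $CAPTURE host $SERVER"
}

# Step 2: replay the capture with verbose summaries, NFS RPCs only.
review_cmd() {
    echo "snoop -i $CAPTURE -V rpc nfs"
}

capture_cmd
review_cmd
```

Saving to a file with -o keeps the full packets, so the same capture can be replayed later with different filters or verbosity.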

 
 
 


Post by Carsten Beneck » Sat, 12 Nov 2005 18:35:04


Hi,

here you are...

Sami Ketola wrote:
>>So maybe tcp-transport is the default; but it should work with either
>>tcp or udp, shouldn't it?

> Yes. It's supposed to work with both tcp and udp. Without more data it's still
> impossible to say if it's a problem with the server or the client.
> Please try snooping the network traffic while running ls to get a better
> idea of where it goes wrong.

> Sami

root@some-client:~# uname -a
SunOS some-client 5.9 Generic_118558-16 sun4u sparc SUNW,Sun-Fire-V440
root@some-client:~# /usr/sbin/mount -r -o proto=tcp,vers=3 some-server.rrz.uni-hamburg.de:/local /mnt
root@some-client:~# ls -ld /mnt/wwwcache
drwxr-s---    3 root     sys           512 Mar  1  2001 /mnt/wwwcache
root@some-client:~# ls -l /mnt
^C
(nothing happens)

The mount produced the following flow:

root@some-sol9-client:~# /usr/sbin/snoop "host some-server.rrz.uni-hamburg.de"
Using device /dev/ce (promiscuous mode)
        some-sol9-client -> some-server.rrz.uni-hamburg.de PORTMAP C GETPORT prog=100005 (MOUNT) vers=3 proto=UDP
some-server.rrz.uni-hamburg.de -> some-sol9-client        PORTMAP R GETPORT port=37218
        some-sol9-client -> some-server.rrz.uni-hamburg.de MOUNT3 C Null
some-server.rrz.uni-hamburg.de -> some-sol9-client        MOUNT3 R Null
        some-sol9-client -> some-server.rrz.uni-hamburg.de MOUNT3 C Mount /local
some-server.rrz.uni-hamburg.de -> some-sol9-client        MOUNT3 R Mount OK FH=0035 Auth=unix
        some-sol9-client -> some-server.rrz.uni-hamburg.de PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
some-server.rrz.uni-hamburg.de -> some-sol9-client        PORTMAP R GETPORT port=2049
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42367 Syn Seq=3864554158 Len=0 Win=49640 Options=<mss 1460,nop,nop,sackOK>
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=42367 S=2049 Syn Ack=3864554159 Seq=1637913446 Len=0 Win=59860 Options=<mss 1460>
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42367 Ack=1637913447 Seq=3864554159 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de NFS C NULL3
some-server.rrz.uni-hamburg.de -> some-sol9-client        NFS R NULL3
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42367 Ack=1637913475 Seq=3864554235 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42367 Fin Ack=1637913475 Seq=3864554235 Len=0 Win=49640
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=42367 S=2049 Ack=3864554236 Seq=1637913475 Len=0 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=42367 S=2049 Fin Ack=3864554236 Seq=1637913475 Len=0 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42367 Ack=1637913476 Seq=3864554236 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
some-server.rrz.uni-hamburg.de -> some-sol9-client        PORTMAP R GETPORT port=2049
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42368 Syn Seq=3864677436 Len=0 Win=49640 Options=<mss 1460,nop,nop,sackOK>
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=42368 S=2049 Syn Ack=3864677437 Seq=2736185636 Len=0 Win=59860 Options=<mss 1460>
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42368 Ack=2736185637 Seq=3864677437 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de NFS C NULL3
some-server.rrz.uni-hamburg.de -> some-sol9-client        NFS R NULL3
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42368 Ack=2736185665 Seq=3864677513 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42368 Fin Ack=2736185665 Seq=3864677513 Len=0 Win=49640
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=42368 S=2049 Ack=3864677514 Seq=2736185665 Len=0 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=42368 S=2049 Fin Ack=3864677514 Seq=2736185665 Len=0 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=42368 Ack=2736185666 Seq=3864677514 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Syn Seq=3864844759 Len=0 Win=49640 Options=<mss 1460,nop,nop,sackOK>
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Syn Ack=3864844760 Seq=3611463202 Len=0 Win=59860 Options=<mss 1460>
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611463203 Seq=3864844760 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de NFS C FSINFO3 FH=0035
some-server.rrz.uni-hamburg.de -> some-sol9-client        NFS R FSINFO3 OK
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611463371 Seq=3864844872 Len=0 Win=49640

The ls -ld /mnt/wwwcache produced the following communication:

        some-sol9-client -> some-server.rrz.uni-hamburg.de NFS C LOOKUP3 FH=0035 wwwcache
some-server.rrz.uni-hamburg.de -> some-sol9-client        NFS R LOOKUP3 OK FH=52E1
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611463615 Seq=3864844996 Len=0 Win=49640
        some-sol9-client -> some-server.rrz.uni-hamburg.de NFS C ACCESS3 FH=0035 (read,lookup,modify,extend,delete)
some-server.rrz.uni-hamburg.de -> some-sol9-client        NFS R ACCESS3 OK (read,lookup)
        some-sol9-client -> some-server.rrz.uni-hamburg.de NFS_ACL C GETACL3 FH=52E1 mask=10
some-server.rrz.uni-hamburg.de -> some-sol9-client        RPC R (#42) XID=1744375558 Program unavailable
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611463767 Seq=3864845228 Len=0 Win=49640

And finally the attempt at the directory listing (ls -l /mnt):

        some-sol9-client -> some-server.rrz.uni-hamburg.de NFS C READDIRPLUS3 FH=0035 Cookie=0 for 8192/32768
some-server.rrz.uni-hamburg.de -> some-sol9-client        NFS R READDIRPLUS3 OK 9+ entries (incomplete)
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611465227 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611466687 Seq=3864845364 Len=0 Win=46720
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611466687 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611468147 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611469607 Seq=3864845364 Len=0 Win=43800
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611469607 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611471067 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611472527 Seq=3864845364 Len=0 Win=40880
some-server.rrz.uni-hamburg.de -> some-sol9-client        RPC R XID=243597312
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611473987 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611475447 Seq=3864845364 Len=0 Win=37960
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611475447 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611476907 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611478367 Seq=3864845364 Len=0 Win=35040
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611478367 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611479827 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611481287 Seq=3864845364 Len=0 Win=32120
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611481287 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611482747 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611484207 Seq=3864845364 Len=0 Win=29200
some-server.rrz.uni-hamburg.de -> some-sol9-client        RPC R XID=243597312
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611485667 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611487127 Seq=3864845364 Len=0 Win=26280
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611487127 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611488587 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611490047 Seq=3864845364 Len=0 Win=48180
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611490047 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611491507 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 Ack=3611492967 Seq=3864845364 Len=0 Win=45260
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611492967 Len=1460 Win=60032
some-server.rrz.uni-hamburg.de -> some-sol9-client        TCP D=1022 S=2049 Ack=3864845364 Seq=3611494427 Len=1460 Win=60032
        some-sol9-client -> some-server.rrz.uni-hamburg.de TCP D=2049 S=1022 ...
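
A trace like this is easier to judge once it is condensed. The following is only a sketch: summarize is a hypothetical helper, and it assumes the snoop summary output has been saved as text with each packet on a single line (lines wrapped by a mail client would need to be re-joined first). It totals the TCP payload bytes per sender and reports the last advertised window.

```shell
#!/bin/sh
# Summarize TCP segments from snoop summary output read on stdin.
# Only lines that carry Len= are counted; RPC/NFS summary lines are skipped.
summarize() {
    awk '
    / -> / && /Len=[0-9]+/ {
        src = $1                      # sending host is the first field
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^Len=/) { split($i, a, "="); bytes[src] += a[2]; segs[src]++ }
            if ($i ~ /^Win=/) { split($i, w, "="); lastwin[src] = w[2] }
        }
    }
    END {
        for (s in bytes)
            printf "%s segments=%d payload=%d last_win=%d\n", s, segs[s], bytes[s], lastwin[s]
    }' | sort
}

# Example use (trace.txt is a placeholder file name):
#   summarize < trace.txt
```

Comparing the server's payload total with the 32768-byte limit in the READDIRPLUS3 request, and watching whether the client's window recovers, gives two concrete numbers to bring to a follow-up.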


 
 
 

1. Jumpstart solaris 10 b69 and solaris 10 b72 on an Ultra 30

I am trying to jumpstart a Sun Ultra 30 using Solaris 10 b69
(b72 gives the same results) from a Linux jumpstart server.

I have scripts for all this (soon to be released as open source!), but
although I can get b55, beta5 (b60) and b63 to work,
b69 and b72 do not work!

Since this is an Ultra 30, I went to

http://sunsolve.sun.com/handbook_pub/Devices/Boot_PROM/BootPROM_Sun4u...
and applied patch 105930-06 with no problems.

It still does not boot.

- bpgetfile shows all parameters can be retrieved ok.
- nfs mounts all work
- pfinstall shows the profile is ok
- I've checked the rules file and it is fine.
- The /tftpboot directory is set up correctly with the right inetboot file.

reset
boot net - install
I can see the tftp transfer clock up ok to 38600
The linux server shows the correct mount of the root= directory.

Normally the kernel banner should appear next but instead I get
loads of messages

not found: vmem_create
not found: vmem_alloc
not found: cv_signal
not found: wake_sched_sec
not found: physmax
not found: putreg
not found: fp_precise
not found: ddi_prop_lookup_init_array
not found: sema_v
not found: traceregs
not found: psignal
not found: strcmp
not found: fop_putpage
not found: strcmp
not found: global_zone
not found: fastscan

etc.. and eventually

krtld:error during initial load/link phase
panic - boot:exitto64 returned from client program

and I'm back at the ok prompt

1. Following the book "Solaris Internals: Core Kernel Architecture", running

/usr/ccs/bin/dump -Lv /platform/sun4u/kernel/sparcv9/unix

shows the line NEEDED..dtracestubs

but I cannot find dtracestubs under that /platform directory.
Is this a problem?

2. nm seems to indicate that the missing symbols are in genunix,
but I'm not sure how to debug what is going on.

3. eeprom | grep boot-file gives
boot-file: data not available so I assume a 64-bit boot should occur.
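
One way to make the nm check from point 2 repeatable is sketched below. It is only an illustration: defined_in is a made-up helper, and it assumes a symbol listing in three-column "value type name" form, as produced by nm -p, saved to a file beforehand. The genunix path in the comment is an assumption as well and may differ inside the install image.

```shell
#!/bin/sh
# usage: defined_in <symbol-listing> symbol...
# Prints "<symbol> defined" or "<symbol> missing" for each symbol.
# A symbol counts as defined when it is listed with a type other than U.
defined_in() {
    listing=$1
    shift
    for sym in "$@"; do
        if awk -v s="$sym" '$3 == s && $2 != "U" { found = 1 } END { exit !found }' "$listing"; then
            echo "$sym defined"
        else
            echo "$sym missing"
        fi
    done
}

# Example (paths are assumptions):
#   nm -p /platform/sun4u/kernel/sparcv9/genunix > /tmp/genunix.syms
#   defined_in /tmp/genunix.syms vmem_create vmem_alloc cv_signal
```

Running it against the genunix that the client actually loads over NFS, rather than the one on a working machine, would show whether the boot image itself is missing the symbols.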

On the net I've seen advice to add "set moddebug" to /etc/system, but the
Solaris Internals book indicates that /etc/system is read AFTER
krtld does an initial link.

I'm totally stumped why this will not work...Any ideas?

2. LI, what is that

3. Live upgrade failure: Solaris 10 -> Solaris 10 update 1.

4. Lilo Installing

5. lsof 4.70D for Solaris 10 [was Re: Lsof on Solaris 10 x86?]

6. Execute modprobe during boot time with startup script? Please help! Red Hat 8

7. Solaris 10 -- Where to get gcc (or other compiler) built on Solaris 10?

8. HELP AUTOMOUNT

9. Solaris 10 (build 63) upgrade fails on Solaris 8 as well as Solaris 9 sparcs

10. java calling poll() with 10 ms timeout on solaris 2.6

11. Solaris 10 NFS 4 Problem

12. Solaris Express (Solaris 10 snapshots) for SunBlade 2500?

13. Solaris 8 Application not working on Solaris 10