I have found what I believe to be a bug in the Solaris 8 NFS client.
After an upgrade from Solaris 2.6 (SunOS 5.6), one of our programs began
failing. I have been able to reproduce the failure with the following
script:
#!/bin/sh
HOST=`hostname`
FILE=foobar
#
# Remove any old files
#
rm -f $FILE # unlink("$FILE")
rm -f $FILE.busy # unlink("$FILE.busy")
#
# This is part of an exclusive open routine.
# The "ln" should only succeed if "$FILE.busy" does not already exist.
#
rm -f $HOST.$$.xopn # unlink("$HOST.$$.xopn")
exec 4>$HOST.$$.xopn # open("$HOST.$$.xopn", ...)
ln $HOST.$$.xopn $FILE.busy # link("$HOST.$$.xopn", "$FILE.busy")
rm -f $HOST.$$.xopn # unlink("$HOST.$$.xopn")
#
# Doing the move before closing the file descriptor fails.
# Doing the move after closing the file descriptor works.
#
date >&4
mv $FILE.busy $FILE # before close fails, after close works
exec 4>&- # close(4)

The result should be a file named "foobar" containing the output of the
"date" command. That is the result when the script is run on a local
file system or under Solaris 2.6 (SunOS 5.6). When it is run under
Solaris 8 on an NFS-mounted file system, the file "foobar" does not
exist; instead, a file named ".nfsXXXX" exists and contains the output
of the "date" command. If the last two lines of the script are swapped,
so that the file descriptor is closed before the "mv", the result is
correct. If the "mv" command is omitted, the file "foobar.busy" is
created correctly in all cases.
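
For comparison, here is a minimal sketch of the ordering that works, with
the close moved ahead of the "mv" and a check of the outcome added at the
end. The final "if" block and its messages are illustrative additions, not
part of the failing program; everything else follows the script above.

```shell
#!/bin/sh
# Sketch: same steps as the reproducer, but descriptor 4 is closed
# before the rename. Run it in the directory under test.
HOST=`hostname`
FILE=foobar

# Remove any old files
rm -f $FILE $FILE.busy $HOST.$$.xopn

# Exclusive open: the link is expected to fail if $FILE.busy exists
exec 4>$HOST.$$.xopn                    # open("$HOST.$$.xopn", ...)
ln $HOST.$$.xopn $FILE.busy || exit 1   # link("$HOST.$$.xopn", "$FILE.busy")
rm -f $HOST.$$.xopn                     # unlink("$HOST.$$.xopn")

date >&4
exec 4>&-                               # close(4) first ...
mv $FILE.busy $FILE                     # ... then rename

# Illustrative check (an addition): report which outcome occurred
if [ -s $FILE ]; then
    echo "ok: $FILE exists"
else
    echo "failed: $FILE is missing"
    ls -a | grep '^\.nfs'               # show any leftover .nfsXXXX files
fi
```

Swapping the "exec 4>&-" and "mv" lines back gives the failing order
described above, so the same check can be used for both cases.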

I have tried Solaris 2.6 (SunOS 5.6), Solaris 8, and Linux NFS clients
against both Solaris 2.6 and Linux NFS servers. The failure occurs only
with the Solaris 8 NFS client, regardless of the type of NFS server.

Has anyone seen this before? Is this a known bug, and is a patch
available?
--
+----------------------------------+----------------------------------+
| Daniel K. Forrest | Laboratory for Molecular and |
+----------------------------------+----------------------------------+