kernel file/record locking problem

Post by Danie » Fri, 03 Oct 1997 04:00:00



We had/have a problem with file/record locking.

Scenario:  nfs-server exports /var/mail

           nfs-client1 mounts /var/mail with these options:
              proto=tcp,vers=3,rw,intr,quota,bg,actimeo=0,nosuid

           nfs-client2 mounts /var/mail identically to nfs-client1

           All three systems are running SunOS 5.5.1.

The problem first appeared with users of popmail on nfs-client1.  Users
would get the following message:

/var/mail/.pop/.joe-user.pop': No record locks available (46)
/var/mail/.pop/.joe-user.pop': No record locks available (46)

Sendmail reported similar problems:

451 cannot lockf(/export/home/joe-user/dead.letter, fd=4, type=2,
omode=37777777777, euid=3731): No record locks available

We eventually traced this problem to a single file in /var/mail which
could not be locked.  A procmail process was attempting to deliver to
this mailbox.  When we killed the process things returned to normal
until something tried to lock that particular file again.  We moved this
file and copied its contents back to the original name and then
everything worked okay.  A truss of the procmail process showed that it
was blocked like this:

5363:   fcntl(3, F_SETLKW, 0xEFFFFC14)  (sleeping...)

An "lslk" of that file shows:

# lslk /var/mail/weird-
SRC              PID       DEV INUM    SZ TY M ST WH END LEN NAME
nfs-client2     9135 110,98000 7837 62417  r 0  0  0   0   0 /var/mail/weird-

However, process 9135 does not exist on nfs-client2 so it's a stale
lock.  

This leads us to two questions:

(1) How do we clear that stale lock?  (Without rebooting nfs-server) and
(2) why did this one lock (apparently) break the entire file/record
locking system on nfs-client[12]?  

Thanks for any information you can provide.
--
Daniel Craigmile                  |  Texas A&M University
http://www.veryComputer.com/~danielc/    |  Computing & Information Services

(409) 845-6905 (voice)            |  UNIX Group

 
 
 


Post by Tim Goodw » Fri, 03 Oct 1997 04:00:00



>Scenario:  nfs-server exports /var/mail

Don't Do That.

Tim.
--
Tim Goodwin | "Gateways are designed for the purpose of losing information;
Cygnus, UK  | some do better than others." -- Dave Crocker

 
 
 


Post by Lars Balker Rasmusse » Fri, 03 Oct 1997 04:00:00




> >Scenario:  nfs-server exports /var/mail

> Don't Do That.

That's about as religious as arguing about how many and how large
partitions you should use.  We're happily exporting our central
mail-spool to all our workstations (all Solaris 2.5.1), with actimeo=0
on mount.

The local comp-sci department has a central mail-spool exported to a lot
of machines running different OS'es, and I haven't heard of any problems
with that setup either.

Why is it a Problem?
--
Lars Balker Rasmussen, Software Engineer, Mjolner Informatics ApS

 
 
 


Post by Daniel E Whick » Fri, 03 Oct 1997 04:00:00





: > >Scenario:  nfs-server exports /var/mail
: >
: > Don't Do That.

: That's about as religious as arguing about how many and how large
: partitions you should use.  We're happily exporting our central
: mail-spool to all our workstations (all Solaris 2.5.1), with actimeo=0
: on mount.

: The local comp-sci department has a central mail-spool exported to a lot
: of machines running different OS'es, and I haven't heard of any problems
: with that setup either.

: Why is it a Problem?

Security would be what I'd be concerned about.  NFS isn't quite what I'd
call overly secure.

 
 
 


Post by Somkit Khemmanivan » Sat, 04 Oct 1997 04:00:00


Hi,


> We had/have a problem with file/record locking.

> Scenario:  nfs-server exports /var/mail

>            nfs-client1 mounts /var/mail with these options:
>               proto=tcp,vers=3,rw,intr,quota,bg,actimeo=0,nosuid

>            nfs-client2 mounts /var/mail identically to nfs-client1

>            All three systems are running SunOS 5.5.1.

> The problem first appeared with users of popmail on nfs-client1.  Users
> would get the following message:

> /var/mail/.pop/.joe-user.pop': No record locks available (46)
> /var/mail/.pop/.joe-user.pop': No record locks available (46)

> Sendmail reported similar problems:

> 451 cannot lockf(/export/home/joe-user/dead.letter, fd=4, type=2,
> omode=37777777777, euid=3731): No record locks available

> We eventually traced this problem to a single file in /var/mail which
> could not be locked.  A procmail process was attempting to deliver to
> this mailbox.  When we killed the process things returned to normal
> until something tried to lock that particular file again.  We moved this
> file and copied its contents back to the original name and then
> everything worked okay.   A truss of the procmail process showed that it
> was blocked like this:

Maybe a buggy app?


> 5363:   fcntl(3, F_SETLKW, 0xEFFFFC14)  (sleeping...)

> An "lslk" of that file shows:

> # lslk /var/mail/weird-
> SRC              PID       DEV INUM    SZ TY M ST WH END LEN NAME
> nfs-client2     9135 110,98000 7837 62417  r 0  0  0   0   0 /var/mail/weird-

> However, process 9135 does not exist on nfs-client2 so it's a stale
> lock.

> This leads us to two questions:

> (1) How do we clear that stale lock?  (Without rebooting nfs-server) and

Staleness can be handled by re-mounting the NFS client. If you can't
umount because of the lock, try unlinking the locked file and/or using
lsof to find, and then killing, the process holding the file open.

> (2) why did this one lock (apparently) break the entire file/record
> locking system on nfs-client[12]?

Here are two things to check:

1) sar -v to look at your file lock system table. Is it full?

2) Make sure NFS locking daemons are present. You may also want
to check for any relevant OS patches.

> Thanks for any information you can provide.
> --
> Daniel Craigmile                  |  Texas A&M University
> http://www.veryComputer.com/~danielc/    |  Computing & Information Services

> (409) 845-6905 (voice)            |  UNIX Group

--
____________________    
 __   __   __  
|__) (__ (-__

Somckit Khemmanivanh
Distributed Systems Interconnect

 
 
 


Post by Tim Goodw » Sat, 04 Oct 1997 04:00:00




>That's about as religious as arguing about how many and how large
>partitions you should use.

My only "religion" is reliability.

>Why is it a Problem?


discussion in comp.sys.sun.admin.

Tim.
--
Tim Goodwin | "Gateways are designed for the purpose of losing information;
Cygnus, UK  | some do better than others." -- Dave Crocker

 
 
 


Post by Dani » Sat, 04 Oct 1997 04:00:00



>> We eventually traced this problem to a single file in /var/mail which
>> could not be locked.  A procmail process was attempting to deliver to
>> this mailbox.  When we killed the process things returned to normal
>> until something tried to lock that particular file again.  We moved this
>> file and copied its contents back to the original name and then
>> everything worked okay.   A truss of the procmail process showed that it
>> was blocked like this:

>Maybe a buggy app?

      Not likely.  After we killed the procmail process we wrote our own
      short program and got the same results.  It's definitely because
      of the *file*. I'll include the C code at the end of the message.

>Staleness can be handled by re-mounting the NFS client. If you can't
>umount
>because of the lock, try unlinking the locked file and/or lsof and kill
>the process holding the file open.

      That didn't work either. We've unmounted that filesystem on
      nfs-client1 several times.  We've even rebooted it.  Still, when
      we try to open a lock on that *file* we get the same results.
      Note that the stale lock originated on the other NFS client -
      nfs-client2 and that nfs-server is apparently keeping that lock
      around.

>Here are two things to check:

>1) sar -v to look at your file lock system table. Is it full?

      sar -v reports:

[...]
16:00:00  311/16346    0 29838/29838    0 1610/1610    0    0/0

>2) Make sure NFS locking daemons are present. You may also want
>to check for any relevant OS patches.

      statd and lockd are both running, and have even been restarted...
      still no effect.

      This is the program mentioned above.

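#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv) {
  int i,file;
  char *fname;
  struct flock fl;
  struct stat st;

  for(i=1;i<argc;i++) {

    fname=argv[i];

    /* Skip names that do not exist; stat() returns 0 on success. */
    if ( stat( fname, &st ) != 0 ) {
      printf("No such file: %s\n",fname);
      continue;
    }

    /* O_CREAT requires a mode argument. */
    file = open(fname,O_WRONLY|O_CREAT|O_APPEND,0600);

    if (file<0) {
      printf("Problem opening file: %s (%s)\n",fname,strerror(errno));
      continue;
    } else {
      printf("Opened: %s\n",fname);
    }

    /* Write-lock the whole file: offset 0 from the start, length 0 = to EOF. */
    memset(&fl,0,sizeof fl);
    fl.l_type=F_WRLCK;fl.l_whence=SEEK_SET;fl.l_start=0;fl.l_len=0;
    if ( fcntl(file,F_SETLKW,&fl) >= 0 ) {
      printf("Locked.\n");
    } else {
      printf("Failed to lock: %s\n",strerror(errno));
    }

    fl.l_type=F_UNLCK;
    if ( fcntl(file,F_SETLKW,&fl) >= 0 ) {
      printf("Unlocked\n");
    } else {
      printf("Failed to un-lock: %s\n",strerror(errno));
    }

    close(file);

  }
  exit(0);
}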

--
Daniel Craigmile                  |  Texas A&M University
http://www.veryComputer.com/~danielc/    |  Computing & Information Services

(409) 845-6905 (voice)            |  UNIX Group
 
 
 


Post by mi.. » Sun, 05 Oct 1997 04:00:00



>> An "lslk" of that file shows:

>> # lslk /var/mail/weird-
>> SRC              PID       DEV INUM    SZ TY M ST WH END LEN NAME
>> nfs-client2     9135 110,98000 7837 62417  r 0  0  0   0   0 /var/mail/weird-

>> However, process 9135 does not exist on nfs-client2 so it's a stale
>> lock.

We regularly see such bogus locks.  `lslk' on the server will tell you
which client originated it.  Killing and restarting /usr/lib/nfs/lockd
on the client will make the lock go away.  It's apparently caused by a
bug that leaves the lock in effect after the process exits.

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-

 
 
 


Post by Paul Egge » Sun, 05 Oct 1997 04:00:00



>We regularly see such bogus locks.

To fix that problem, try installing a Solaris that doesn't have Sun bug
1182705 (``Signals may orphan locks on clients'').  Either upgrade to
Solaris 2.6, or install the relevant OS patch.  E.g. for Solaris 2.5.1,
install Sun patch 103640-08 or later.  While you're at it, you should
install all the other Sun-recommended patches.
 
 
 


Post by Vic Abe » Sun, 05 Oct 1997 04:00:00




>>> An "lslk" of that file shows:

>>> # lslk /var/mail/weird-
>>> SRC              PID       DEV INUM    SZ TY M ST WH END LEN NAME
>>> nfs-client2     9135 110,98000 7837 62417  r 0  0  0   0   0 /var/mail/weird-

>>> However, process 9135 does not exist on nfs-client2 so it's a stale
>>> lock.
>We regularly see such bogus locks.  `lslk' on the server will tell you
>which client originated it.  Killing and restarting /usr/lib/nfs/lockd
>on the client will make the lock go away.  It's apparently caused by a
>bug that leaves the lock in effect after the process exits.

Right.  Lslk reports what it finds in the server's lock table.  If
the necessary client to server interaction to update the server's
lock table never takes place, then lslk will report locks for which
there is no client process at the specified PID.  To clean up the
server's lock table it may even be necessary to restart the lock
daemons on both the client and the server.


 
 
 
