Frequency of NFS lock problems w/ NFS mounted mail spool

Post by Dan Moseda » Sat, 29 Aug 1992 03:59:38



For several years, I have run small networks with /var/spool/mail
NFS-mounted from a single server to a bunch of workstations.  I have
yet to have a problem doing this.  Most of these machines have been
Suns running a recent version of SunOS.

However, as I understand it, due to various bugs in various vendors'
rpc.lockds, this is running the risk of getting one's mailbox
thoroughly hosed if more than one process tries to write the mailbox
at once.

I am now working with a network of Sun SPARCs running SunOS 4.1.2 as
well as a few DECstations running Ultrix 4.2a.  We are using IDA Sendmail
5.65c.

I'd be curious to know how many folks have actually had this kind of
problem -- if it turns out to be very few, NFS seems like it would be
worth the convenience.  Specific experience about the reliability of
the SunOS and Ultrix lockd's would also be interesting.

Also, where do the Elm lockfiles fit into all this?  Would they
prevent this problem (if everyone were to use Elm, that is), or do
they merely keep multiple copies of Elm from manipulating the same
mailbox simultaneously?

All discussion, thoughts, and email are welcome...

Thanks in advance,
Dan Mosedale


 
 
 

Frequency of NFS lock problems w/ NFS mounted mail spool

Post by Per Hedela » Sun, 30 Aug 1992 00:39:49



>However, as I understand it, due to various bugs in various vendors'
>rpc.lockds, this is running the risk of getting one's mailbox
>thoroughly hosed if more than one process tries to write the mailbox
>at once.

How about, "due to various bugs in various vendors' rpc.lockds, said
vendors don't even trust them enough to use them in their own software"?
- As far as I know, neither /bin/mail nor /usr/ucb/mail in SunOS does any
lockf() etc. calls; both rely instead on "traditional" Unix locking,
checking for a /var/spool/mail/user.lock file. (I verified this for
/usr/ucb/mail by using trace - it did use flock(), though...)
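
For readers unfamiliar with the "traditional" scheme mentioned above, here is a minimal sketch in Python. It is not the code of /bin/mail or /usr/ucb/mail; the function names, retry count and delay are made up for illustration. The only idea taken from the post is that a program checks for, and creates, a user.lock file next to the mailbox.

```python
import os
import time

def acquire_dotlock(mailbox, retries=5, delay=1.0):
    """Try to create <mailbox>.lock exclusively; return True on success."""
    lock_path = mailbox + ".lock"
    for _ in range(retries):
        try:
            # O_EXCL makes the create fail if the lock file already exists.
            fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            time.sleep(delay)  # someone else holds the lock; wait and retry
            continue
        os.write(fd, str(os.getpid()).encode())  # record the owner (a convention, not required)
        os.close(fd)
        return True
    return False

def release_dotlock(mailbox):
    """Remove the lock file created by acquire_dotlock()."""
    os.unlink(mailbox + ".lock")
```

The lock is purely cooperative: it only protects the mailbox from programs that also look for the same .lock file.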

Of course the lockfile method doesn't really work reliably over NFS
either (but probably better than rpc.lockd:-) - nevertheless, for what
it's worth, I have never heard of anyone getting bitten by this, and
NFS-mounting /var/spool/mail is just about the norm around here (as is
SunOS) - of course the probability of clashes depends on how much mail
you're getting.
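
One well-known reason lock files can misbehave over NFS is that an exclusive create is not guaranteed to be atomic on older NFS versions. A commonly described workaround is to create a uniquely named file and hard-link it to the lock name, then trust the link count rather than the return value. The sketch below assumes that approach; the function name and the temp-file naming are illustrative and not taken from any mailer discussed in this thread.

```python
import os
import socket
import time

def acquire_dotlock_nfs(mailbox):
    """Make one attempt at <mailbox>.lock using link(); True if we got it."""
    lock_path = mailbox + ".lock"
    # Unique per host and process, in the same directory as the lock file.
    tmp_path = "%s.%s.%d.%d" % (lock_path, socket.gethostname(),
                                os.getpid(), int(time.time()))
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    os.close(fd)
    try:
        try:
            os.link(tmp_path, lock_path)
        except OSError:
            pass  # the reply can be unreliable over NFS; check the link count instead
        # We hold the lock only if our temp file now has exactly two names.
        return os.stat(tmp_path).st_nlink == 2
    finally:
        os.unlink(tmp_path)  # the lock file itself stays until released
```

Releasing the lock is just an unlink of the .lock file. A caller would normally wrap this in a retry loop with a timeout.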

The risk is presumably further reduced by keeping time well in sync
between server and client (we use NTP), and by mounting /var/spool/mail
with the 'noac' option. And of course *delivery* of mail should be on
the server only.

--Per Hedeland


...uunet!erix.ericsson.se!per

 
 
 

Frequency of NFS lock problems w/ NFS mounted mail spool

Post by Linda Flor » Wed, 02 Sep 1992 21:55:45


Karl> It is decidedly NOT a good idea to have more than one machine on an
Karl> NFS-mounted mail spool writing >inbound< messages to a user's mailbox.  
Karl> The various rpc.lockd bugs will bite you if you try this, and Sun has one of
Karl> the worst records in this area.  The results are hung sendmail processes and
Karl> mail that never gets delivered.

We have a network of more than a hundred Sun4s, which all mount the
mail-directory read-write and do local mail-delivery without
forwarding to the server (it would most likely break down if they
did), and we never had serious problems with this policy.

Karl> Then again, this is not a difficult problem to solve -- just have all hosts
Karl> forward mail for local delivery to the host which actually has the disk
Karl> attached to it, and do the mailbox delivery there.  This is a safe and sane
Karl> approach.  It also permits simple "domain hiding" if you wish to do it.

Good Bye

Linda Floren

 
 
 

Frequency of NFS lock problems w/ NFS mounted mail spool

Post by Jim Au » Thu, 03 Sep 1992 05:01:09



   Karl> It is decidedly NOT a good idea to have more than one machine on an
   Karl> NFS-mounted mail spool writing >inbound< messages to a user's mailbox.  
   Karl> The various rpc.lockd bugs will bite you if you try this, and Sun has one of
   Karl> the worst records in this area.  The results are hung sendmail processes and
   Karl> mail that never gets delivered.

   We have a network of more than a hundred Sun4s, which all mount the
   mail-directory read-write and do local mail-delivery without
   forwarding to the server (it would most likely break down if they
   did), and we never had serious problems with this policy.

We have over 350 client workstations (half Suns, half IBMs) mounting a
single mail directory (from a Sun) read-write, and all of them forward
mail to the server for local delivery.  This machine is also used as a
forwarding machine for mail that goes off campus.  It has not broken
down yet (but we have been seeing some high loads lately).

We have definitely seen locking problems between IBM RS6000 NFS
clients and our Sun4 NFS server.  We have received patched rpc.lockd,
and a new kernel with the NFS jumbo patch, but still we have been
having problems with a user on an IBM NFS client typing "q" to get out
of mail, and the process never returns.  Killing and restarting
rpc.lockd sometimes works to free up these processes that are waiting
for locks from the server, but sometimes only a fastboot of the server
will fix it.  It had been working fine for several months (the "q" bug
had disappeared), so we are not sure why it has come up again now.

I will say that we have been doing this for several years (since 1987) with a
network of Suns, and we never had this problem until we added IBM
RS6000 NFS clients into the mix.  It still has been a relatively
stable mail architecture for quite a few machines.

Now, however, in the face of such high loads, and the number of client
workstations increasing, we are working on moving to a solution based
on POP, which we hope will allow us to split the load across two or
more server machines (you can't split one directory across two NFS
servers easily).

If you want more info, send me email.
--

 
 
 

Frequency of NFS lock problems w/ NFS mounted mail spool

Post by Marc W » Thu, 03 Sep 1992 22:41:37



>As I recall (it has been about a year since I was through this) Elm uses
>flock() not lockf(), so NFS locking isn't a problem (nor possible for that
>matter since flock() won't work on an NFS file), what does the locking is
>the .lock file that

This is true on SunOS but not AIX v3.  AIX v3 does support flock over NFS.
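
Since the distinction matters here, a small sketch of the two interfaces as exposed by Python's fcntl module. flock() and lockf() are separate calls, and whether either one is honored across NFS depends on the operating system, as the posts above say. The helper name try_lock and its "style" argument are invented for this example.

```python
import fcntl

def try_lock(fd, style):
    """Attempt a non-blocking exclusive lock on fd; return True if acquired."""
    try:
        if style == "flock":
            # BSD-style whole-file lock, as in flock(2).
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        else:
            # POSIX-style lock built on fcntl(), as in lockf(3).
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False  # lock is held elsewhere, or locking is unsupported here
```

The descriptor must be open for writing for the exclusive lockf() case. A program that takes only one of these locks is not protected against a program that takes only the other.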

Marc

--
Marc Wiz                        Yes that really is my last name.
MaBell:                         (512)244-8780

The views expressed are my own.  Mine all mine!

 
 
 

Frequency of NFS lock problems w/ NFS mounted mail spool

Post by Stan Jan » Sun, 06 Sep 1992 03:40:34



>We have over 350 client workstations (half Suns, half IBMs) mounting a
>single mail directory (from a Sun) read-write, and all of them forward
>mail to the server for local delivery.  This machine is also used as a
>forwarding machine for mail that goes off campus.  It has not broken
>down yet (but we have been seeing some high loads lately).

Is there some reason you want your clients forwarding outgoing mail to
the server instead of transmitting it off campus themselves? As long as

go to the server, and the server will be much less of a bottleneck.

You'd probably need a way to update all the non-diskless clients'
/etc/sendmail.cf files from time to time, but with so many workstations,
I bet you've already had to find a mechanism to do that sort of thing.

  -- Stan Janet

 
 
 

1. More mail file locking questions (lockf, NFS, /var/spool/mail/*.lock)

        Here's a problem I've run into:

        The problem I've seen relates to building mh-6.8 for SVR4
        (and correspondingly LOCKF). Now, it appears that while
        INC believes the mail file (NFS mounted) should be
        locked, it isn't (or it is lockf'ed and someone else
        [/bin/mail?] isn't honoring the lock; it says something
        like "New mail has arrived.."). It appears that /bin/mail
        (on 4.1.x and Solaris 2.x) uses lock files
        (/var/spool/mail/*.lock).  

        My question is, should mh ({l}emacs, popper,imap,...)
        use this style (and abandon lockf), or is there some
        other appropriate solution to this problem?
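
One possible answer, sketched here as an assumption rather than as what mh or any other mailer actually does, is to take both kinds of lock: the .lock file for programs such as /bin/mail that look for it, and a lockf()-style lock for programs that use that instead. The name locked_mailbox is hypothetical, and a real implementation would retry instead of failing when the lock file already exists.

```python
import contextlib
import fcntl
import os

@contextlib.contextmanager
def locked_mailbox(path):
    """Hold both <path>.lock and a lockf() lock while the body runs."""
    lock_path = path + ".lock"
    # Step 1: lock file, seen by lock-file-based programs.
    # Raises FileExistsError if another program already holds it.
    os.close(os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644))
    try:
        with open(path, "a+") as mbox:
            # Step 2: kernel lock, seen by lockf()-based programs.
            fcntl.lockf(mbox.fileno(), fcntl.LOCK_EX)
            try:
                yield mbox
            finally:
                mbox.flush()  # get data to the file before unlocking
                fcntl.lockf(mbox.fileno(), fcntl.LOCK_UN)
    finally:
        os.unlink(lock_path)
```

Usage would look like "with locked_mailbox('/var/spool/mail/user') as mbox: mbox.write(...)". All cooperating programs need to take the two locks in the same order to avoid deadlocking each other.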

        Thanks,

        Dave    

        David M. Meyer                  Voice:     503/346-1747
        Senior Network Engineer         Pager:     503/342-9458
        Office of University Computing  FAX:       503/346-4397

        University of Oregon
        1225 Kincaid
        Eugene, OR 97403        
