"find" in root crontab hangs Solaris 2.2

Post by Larry Bro » Thu, 18 Nov 1993 03:25:33



Since installing Solaris 2.2 on an IPC which formerly ran 1.1, it has
been hanging nightly.  The system responds to pings, but not much
else.

After days of testing, putting periodic writes to syslog in the
crontab, etc, I found that the following line, included in the default
root crontab, is the culprit:

15 3 * * * find / -name .nfs\* -mtime +7 -exec rm -f {} \; -o -fstype nfs -prune

This line, which causes no problems on the Solaris 1.1 systems, brings
the 2.2 system to its knees.  The disk can be heard clicking away even
after the system has hung, implying that the find never finishes.

Any ideas?  Familiar problem?  Known bug?

This system is being used at the moment solely as a test platform for
moving to Solaris 2.2, so I can take it down or run tests on it as
needed.

Extra info that may be useful: Memory is 24 Megs.  The system disk is
only 200MB, so I'm mounting most of my diskspace from a SunOS 4.3
system.  I keep the Answerbook CD mounted all the time.

 
 
 

Post by Bob Dowli » Thu, 18 Nov 1993 08:28:49


|> Since installing Solaris 2.2 on an IPC which formerly ran 1.1, it has
|> been hanging nightly.  The system responds to pings, but not much
|> else.

Do you have a CD inserted?  If you do, then this is bug 1133445, whose solution
(from Sun) I post below.  Sun have taken option 3 of the set below and used it
in Solaris 2.3, which perhaps your experimental system should be running now.
--------
Bob Dowling:                    UNIX Support,
                                University of Cambridge Computing Service,

+44 223 334728                  Cambridge, UK.  CB2 3QG.

 --- Response from Sun when I asked.  Pick one of the three. ---

1. umount the cdrom every night (not really suitable!)

2. Change the find command to be as follows:

find / -name .nfs\* -mtime +7 -exec rm -f {} \; -o -fstype nfs -prune -o -fstype hsfs -prune

3. Follow the directions in SRDB 5853 (included below)
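
A note on option 2, as a minimal sketch: find ANDs adjacent primaries and
treats -o as the separator between alternatives, so each filesystem type to
be skipped needs its own -o branch.  Because -fstype results depend on what
is actually mounted, the example below uses directory names (nfsdir and
cdrom, both hypothetical) as stand-ins for the two filesystem types, and it
prints matches instead of removing them.

```shell
#!/bin/sh
# Stand-in demo: the directory names "nfsdir" and "cdrom" play the role
# of "-fstype nfs" and "-fstype hsfs". Both names are assumptions made
# for this illustration only.
list_outside_pruned() {
    # $1 = root of the tree to search.
    # Branch 1: report regular files named .nfs*.
    # Branch 2 and 3: each directory to skip has its own -o alternative.
    # Written back to back without -o, the two skip tests would be ANDed
    # and the second one could never match.
    find "$1" -type f -name '.nfs*' -print \
        -o -name nfsdir -prune \
        -o -name cdrom -prune
}
```

Called as list_outside_pruned /some/tree, this should list .nfs* files
everywhere except beneath the two pruned directories.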

_______________________________________________________________________________

SRDB ID            : 5853

SYNOPSIS           : root's find in crontab hangs system overnight

DETAIL DESCRIPTION :

The following line appears in /var/spool/cron/crontabs/root:

15 3 * * * find / -name .nfs\* -mtime +7 -exec rm -f {} \; -o -fstype nfs -prune

This runs a find command daily at 3:15 am  that looks for .nfs* files
and removes them if they're more than a week old.  These files are created
by client renames if a remove is attempted on an open file.
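
The behaviour those renames emulate can be seen on a local filesystem.  The
sketch below (read_after_unlink is a hypothetical helper name, not part of
the SRDB) holds a file open, removes it, and still reads its contents; on an
NFS client the same sequence is what leaves a .nfs* file behind until the
descriptor is closed.

```shell
#!/bin/sh
# Local illustration of "remove while open". Nothing here is NFS-specific;
# it only shows the Unix semantics that the .nfs* rename preserves.
read_after_unlink() {
    # $1 = path of a scratch file to create and remove
    printf 'still here\n' > "$1"
    exec 3< "$1"      # hold the file open on descriptor 3
    rm -f "$1"        # the name disappears from the directory
    cat <&3           # the data is still readable through the descriptor
    exec 3<&-         # closing the descriptor lets the file go away
}
```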

If an hsfs cdrom is mounted on the system, this job will cause the machine
to freeze soon after it starts at 3:15 AM. The result will be a hung
machine when people attempt to login.

SOLUTION SUMMARY   :

This find is unnecessarily wasteful of system resources for several reasons:

o It does not distinguish between servers and non-servers.  The find
  is wasted on non-servers.

o It searches filesystems that are not exported.

o The "-fstype nfs" primitive forces the find command to use the statvfs()
  call on every file.  Normally just lstat is called.

o The -prune facility does not work for autofs mounts.  The find touches
  direct autofs mountpoints and forces their mounts.

o It is run daily, even though it removes only those .nfsxxx files that
  are a week old.

This process could be vastly improved with the addition of a simple
wrapper script around the find (e.g., /usr/lib/fs/nfs/nfs_find.sh):

        #!/bin/sh

        # Check shared NFS filesystems for .nfs* files that
        # are more than a week old.
        #
        # These files are created by NFS clients when an open file
        # is removed. To preserve some semblance of Unix semantics
        # the client renames the file to a unique name so that the
        # file appears to have been removed from the directory, but
        # is still usable by the process that has the file open.

        if [ ! -f /etc/dfs/sharetab ]; then exit ; fi

        for dir in `awk '$3 == "nfs" {print $1}' /etc/dfs/sharetab`
        do
                find $dir -name .nfs\* -mtime +7 -mount -exec rm -f {} \;
        done

This script invokes the find command only on exported filesystems.
Machines that do not export filesystems (non-servers) will not run
the find command at all.

Without the -fstype primitive the find runs 25% faster.  The modified
find also does not force direct autofs mounts to be mounted.
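
Before installing, it may be worth seeing what such a script would remove.
The following is a dry-run sketch of the same logic, not part of the SRDB:
it prints candidates instead of deleting them.  The SHARETAB variable and
the function name list_stale_nfs_files are assumptions added here so the
logic can be pointed at a test file; the real script reads
/etc/dfs/sharetab directly.

```shell
#!/bin/sh
# Dry-run variant of the wrapper: lists .nfs* files older than a week
# under NFS-shared directories and removes nothing.
# SHARETAB is a hypothetical override, defaulting to the real table.
SHARETAB=${SHARETAB:-/etc/dfs/sharetab}

list_stale_nfs_files() {
    # No share table means nothing is exported, so there is nothing to do.
    [ -f "$SHARETAB" ] || return 0

    # Field 3 of a sharetab entry is the filesystem type, field 1 the path.
    awk '$3 == "nfs" {print $1}' "$SHARETAB" |
    while read -r dir
    do
        # -mount keeps find on the shared filesystem itself;
        # -print stands in for the original "-exec rm -f {} \;".
        find "$dir" -mount -name '.nfs*' -mtime +7 -print
    done
}
```

Running list_stale_nfs_files by hand for a few nights and checking the
output is one way to gain confidence before switching back to -exec rm.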

To install:
become root
save the script as /usr/lib/fs/nfs/nfs_find.sh
chmod +x /usr/lib/fs/nfs/nfs_find.sh

EDITOR=/usr/ucb/vi;export EDITOR
crontab -e

Find the line in the crontab file that is running find at 3:15 am.

Change it to:

15 3 * * * /usr/lib/fs/nfs/nfs_find.sh >/dev/null 2>&1

This should do it!

Script and solution taken from Bugid 1113177.

PRODUCT            : System_Crash

SUNOS RELEASE      : 2.2

UNBUNDLED RELEASE  : n/a

HARDWARE RELEASE   : Sun4

ISO-9001 STATUS    : Uncontrolled
_______________________________________________________________________________

 
 
 

Post by Jon Hamilt » Thu, 18 Nov 1993 09:48:56



>Since installing Solaris 2.2 on an IPC which formerly ran 1.1, it has
>been hanging nightly.  The system responds to pings, but not much
>else.
>After days of testing, putting periodic writes to syslog in the
>crontab, etc, I found that the following line, included in the default
>root crontab, is the culprit:
>15 3 * * * find / -name .nfs\* -mtime +7 -exec rm -f {} \; -o -fstype nfs -prune
>This line which causes no problems on the Solaris 1.1 systems, brings
>the 2.2 system to its knees.  The disk can be heard clicking away even
>after the system has hung, implying that the find never finishes.
>Any ideas?  Familiar problem?  Known bug?

Familiar, yes.  My particular solution was to use GNU find instead of the
one shipped with the system, and everything has been fine since.

>system.  I keep the Answerbook CD mounted all the time.

This has also been known to cause problems due to a memory leak in the HSFS
driver.  Have you applied the relevant patch?  Sorry, I can't quote you
the patch number; I have punted most of them since moving to 2.3.

--
+----------------------------------------------------------------+

|   CS Solaris Systems Support Group, Iowa State University      |
+----------------------------------------------------------------+