I have a mod_perl script that opens a log file once per child process --
that is, the file remains open from request to request for the life of
the child. The handle is set to flush after each write ($| = 1 in
Perl).
When a child wants to write, it takes a flock LOCK_EX, prints one line,
and then releases it with flock LOCK_UN. (Perl also flushes the handle
before flock LOCK_EX and LOCK_UN.)
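
A minimal sketch of the write path described above, assuming the handle
is opened in append mode. The path, the APP_LOG_PATH variable, and the
log_line name are placeholders for illustration, not the real code.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use IO::Handle;

# Hypothetical log location; the real path is not shown in this post.
my $log_path = $ENV{APP_LOG_PATH} || '/tmp/app_example.log';

# Opened once per child and kept open from request to request.
# Append mode is an assumption.
open(my $log, '>>', $log_path) or die "Cannot open $log_path: $!";
$log->autoflush(1);    # same effect as $| = 1 with this handle selected

# Write one line under an exclusive lock, then release the lock.
sub log_line {
    my ($fh, $msg) = @_;
    flock($fh, LOCK_EX)  or die "Cannot lock: $!";
    print {$fh} "$msg\n" or die "Cannot write: $!";
    flock($fh, LOCK_UN)  or die "Cannot unlock: $!";
    return 1;
}

log_line($log, "request handled by pid $$");
```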
Every night I archive the log file. I have another program that opens
the same file, grabs a LOCK_EX, waits two seconds, then copies the file.
As a check, I then verify that the original log file and the copy are the
same size (I'm still holding the LOCK_EX at that point).
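
The archive step could be sketched roughly as follows. This is an
illustration under stated assumptions, not the actual program: the
archive_log name and the temporary demo files are made up, the file is
opened read/write because Perl's fcntl-based flock emulation needs write
intent for LOCK_EX, and the copy is read through the locked handle
rather than by reopening the file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Copy qw(copy);
use File::Temp qw(tempdir);

# Lock the log, wait, copy it, and return both sizes while still locked.
sub archive_log {
    my ($src, $dst, $wait) = @_;
    $wait = 2 unless defined $wait;    # the two-second pause from the post

    # '+<' opens read/write without truncating the file.
    open(my $fh, '+<', $src) or die "Cannot open $src: $!";
    flock($fh, LOCK_EX)      or die "Cannot lock $src: $!";
    sleep $wait;

    # Copy via the locked handle. With fcntl-style locks, closing a second
    # descriptor on the same file can release the lock, so the file is
    # not reopened here.
    copy($fh, $dst) or die "Copy failed: $!";

    my $orig_size = -s $fh;     # size of the original, lock still held
    my $copy_size = -s $dst;    # size of the finished copy

    flock($fh, LOCK_UN) or die "Cannot unlock $src: $!";
    close($fh);
    return ($orig_size, $copy_size);
}

# Demo on a throwaway file so the sketch runs on its own.
my $dir = tempdir(CLEANUP => 1);
my $src = "$dir/app.log";
open(my $out, '>', $src) or die "Cannot create $src: $!";
print {$out} "line one\nline two\n";
close($out) or die "Cannot close $src: $!";

my ($orig_size, $copy_size) = archive_log($src, "$dir/app.log.bak");
print $orig_size == $copy_size
    ? "sizes match\n"
    : "original is larger than the copy\n";
```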
For two months this has been working fine, but last night the program
reported that the original was larger than the copy -- so it seems as if
the file was written to after I had a LOCK_EX. The mod_perl application
is the ONLY program writing to this file, and the locking does work
properly.
Is it possible that Solaris is buffering the writes to disk longer than
the two seconds I wait after grabbing the LOCK_EX?
The only thing I can think of is that the mod_perl application wrote a
line and did a flock LOCK_UN, then my archive program grabbed the
LOCK_EX, waited two seconds, and copied the file, and only after that
did Solaris flush the write to the log file. Is this possible?
--
pls note the one line sig, not counting this one.