file(1) vs utime(3)

Post by Fuat C. Bar » Thu, 22 Aug 1991 10:26:52



We were investigating why our incremental backups had so much data on
them, and we traced it to the fact that among other things, our daily
system audits run /usr/bin/file on a lot of files.  This changes the
inode modify time since file(1) calls utime(3) to put back the last
access time.  Any idea why the file program needs to do this?  Isn't
running file(1) considered an access of the file?  The Encore dump(1)
writes out all files that have had their inode change times updated.

                                                --Fuat

P.S.  This is on Sun-4s running SunOS 4.1.1 as well as an Encore
Multimax running Umax 4.3 (BSD 4.3 derivative).



    UUCP: ...!rutgers!columbia!cunixf!fuat      712 Watson Labs, 612 W115th St.
   Phone: (212) 854-5128  Fax: (212) 662-6442   New York, NY 10025
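
The effect described in this post can be reproduced with a short script. This is only a sketch, assuming a POSIX filesystem with sub-second timestamps; the helper names `peek_and_restore` and `demo` are made up for illustration and are not taken from any file(1) source. It reads the start of a file, puts the saved access and modification times back, and reports the inode change time before and after.

```python
import os
import tempfile
import time


def peek_and_restore(path, nbytes=512):
    """Read the first bytes of `path`, then put atime/mtime back.

    Hypothetical helper. Returns (ctime_before_ns, ctime_after_ns)
    so the side effect on the inode change time is visible.
    """
    before = os.stat(path)
    with open(path, "rb") as fh:
        fh.read(nbytes)              # the read needed to classify the file
    time.sleep(0.05)                 # step past coarse timestamp granularity
    # Restore the saved access and modification times.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    after = os.stat(path)
    return before.st_ctime_ns, after.st_ctime_ns


def demo():
    """Create a scratch file and run peek_and_restore on it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.sh")
        with open(path, "w") as fh:
            fh.write("#!/bin/sh\necho hello\n")
        time.sleep(0.05)
        return peek_and_restore(path)


if __name__ == "__main__":
    c0, c1 = demo()
    print("ctime moved forward:", c1 > c0)
```

The access time ends up where it started, but the change time moves forward, which is what a dump program keyed on change time will notice.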

 
 
 

file(1) vs utime(3)

Post by Boyd Rober » Thu, 22 Aug 1991 20:58:30



> We were investigating why our incremental backups had so much data on
> them, and we traced it to the fact that among other things, our daily
> system audits run /usr/bin/file on a lot of files.  This changes the
> inode modify time since file(1) calls utime(3) to put back the last
> access time.  Any idea why the file program needs to do this?  Isn't
> running file(1) considered an access of the file?  The Encore dump(1)
> writes out all files that have had their inode change times updated.

Be reasonable.  Reading the file will change its access time and that
will change the inode change time.  Now, that may well cause dump to
dump the file.  No one would write file(1) (or any other utility for that
matter) to utime the file!  That's insane (except for touch(1), but that's
its job).


``When the going gets weird, the weird turn pro...''

 
 
 

file(1) vs utime(3)

Post by Fuat C. Bar » Thu, 22 Aug 1991 22:06:18



>Be reasonable.  Reading the file will change its access time and that
>will change the inode change time.  Now, that may well cause dump to
>dump the file.  No one would write file(1) (or any other utility for that
>matter) to utime the file!  That's insane (except for touch(1), but that's
>its job).

I wasn't guessing when I said file(1) called utime(3).  I read the
source code...

                                                --Fuat



    UUCP: ...!rutgers!columbia!cunixf!fuat      712 Watson Labs, 612 W115th St.
   Phone: (212) 854-5128  Fax: (212) 662-6442   New York, NY 10025

 
 
 

file(1) vs utime(3)

Post by Bob Wakehou » Fri, 23 Aug 1991 01:35:09



> This changes the inode modify time since file(1) calls utime(3) to
> put back the last access time.  Any idea why the file program needs
> to do this?  Isn't running file(1) considered an access of the file?

Running file(1) is not considered an access of the file, probably on
the same concept that ls(1) is not considered an access of the file.
It actually makes some sense to me.  The file(1) command is not intended
to access data -- it just has to in order to do its job.  We wouldn't want access
times modified with every ls(1), probably, and I think that has some
relevance with file(1), as well.

Unfortunately, file(1) does have to access the data.  It does its best
to hide that fact, but utime(3) cannot reset inode-changed time, as it
does access time.

The fact that file(1) uses utime(3) causes an interesting problem with
NFS access between Sun Sparcstation and Tektronix XD88 (SYSV derivative)
workstations.  When a utime(3) process on Sun applies to a file on XD88,
the XD88 misapplies the utime(3) command and sets the file's access
permissions to 777 -- a feature that is not always desirable.  I'd like
to hear from anyone who has seen similar problems and looked into them
any farther than I have so far (not far at all).

Bob Wakehouse

Tektronix, Inc.
Beaverton, Oregon

 
 
 

file(1) vs utime(3)

Post by Fuat C. Bar » Fri, 23 Aug 1991 02:20:52



>Running file(1) is not considered an access of the file, probably on
>the same concept that ls(1) is not considered an access of the file.

Actually, if you run ls(1) on a directory, the directory's last access
time will be updated.  ls(1) doesn't try to fake it out by calling
utime(3) to reset it...  ls(1) on a file though does not change the
file access time, since it is only reporting info from a stat(2) for
the specified file.

>It actually makes some sense to me.  The file(1) command is not intended
>to access data -- it just must to do its job.  We wouldn't want access
>times modified with every ls(1), probably, and I think that has some
>relevance with file(1), as well.

I sort of see your point, but don't quite agree with what file(1)
thinks it needs to do.  If a program needs to read some number of
bytes from a file to do its job, then it is accessing the file, and
thus the file's last access time should reflect that and not be
tampered with.  Especially when such tampering causes the inode change
time to be updated instead...

I have been informed that BSD 4.3-Reno's file(1) no longer fiddles
with the file access time.

                                                        --Fuat



    UUCP: ...!rutgers!columbia!cunixf!fuat      712 Watson Labs, 612 W115th St.
   Phone: (212) 854-5128  Fax: (212) 662-6442   New York, NY 10025
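
The point about stat(2) in this post can be checked directly. A minimal sketch, assuming a POSIX system; `atime_around_stat` and `demo` are names invented for this example. It records a file's access time, performs a metadata-only lookup, and records the access time again.

```python
import os
import tempfile
import time


def atime_around_stat(path):
    """Return the file's atime (ns) before and after an extra stat() call."""
    first = os.stat(path).st_atime_ns
    time.sleep(0.05)
    os.stat(path)                    # metadata lookup only; no data is read
    second = os.stat(path).st_atime_ns
    return first, second


def demo():
    """Create a scratch file and compare its atime around a stat()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "note.txt")
        with open(path, "w") as fh:
            fh.write("some data\n")
        return atime_around_stat(path)


if __name__ == "__main__":
    a, b = demo()
    print("atime unchanged by stat:", a == b)
```

Listing a directory is a different case because the directory itself is read. Whether its access time visibly changes on a given system also depends on mount options such as noatime, so the sketch does not try to assert that part.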

 
 
 

file(1) vs utime(3)

Post by Boyd Rober » Fri, 23 Aug 1991 02:30:06



> Running file(1) is not considered an access of the file, probably on
> the same concept that ls(1) is not considered an access of the file.
> It actually makes some sense to me.  The file(1) command is not intended
> to access data -- it just must to do its job.  We wouldn't want access
> times modified with every ls(1), probably, and I think that has some
> relevance with file(1), as well.

So, some read(2)'s are less [sic] equal than others?  And part of file(1)'s
job is not to read the file and determine its file type?

Since when did ls(1) read the data in the file?  The `access time' lives
in the inode and refers to the `access time' of the data _in the file_
and _not_ the inode.  There is no `inode access time'.  Holy chicken and egg!


``When the going gets weird, the weird turn pro...''

 
 
 

file(1) vs utime(3)

Post by Guy Harr » Fri, 23 Aug 1991 05:50:14


>Isn't running file(1) considered an access of the file?

The person at AT&T who made "file" do that presumably thought that
running "file" *shouldn't* be considered an access of the file.  There
then presumably exists at least one person who thinks so; there are
others who don't think so.
 
 
 

file(1) vs utime(3)

Post by Guy Harr » Sat, 24 Aug 1991 07:12:21


>I have been informed that BSD 4.3-Reno's file(1) no longer fiddles
>with the file access time.

BSD's "file" command never fiddled with the file's access time, ever,
except by reading the file; using "utime()" to fiddle with the access
time is a System V-ism.
 
 
 

file(1) vs utime(3)

Post by Doug Gw » Sat, 24 Aug 1991 17:14:18



>Running file(1) is not considered an access of the file, probably on
>the same concept that ls(1) is not considered an access of the file.
>It actually makes some sense to me.  The file(1) command is not intended
>to access data -- it just must to do its job.  We wouldn't want access
>times modified with every ls(1), probably, and I think that has some
>relevance with file(1), as well.
>Unfortunately, file(1) does have to access the data.  It does its best
>to hide that fact, but utime(3) cannot reset inode-changed time, as it
>does access time.

What in the WORLD are you talking about?  The "ls" utility does NOT read
any files other than directories, therefore there is no "access" to be
undone.

The "file" utility does indeed restore atime and mtime (only atime was
changed).  It does not affect the ctime of the inode, so there is no
need to "reset" it.  I would argue that "file" should not be doing this
fakery of the atime.  If the file is accessed, then its "time of last
access" ought to reflect that.

>The fact that file(1) uses utime(3) causes an interesting problem with
>NFS access between Sun Sparcstation and Tektronix XD88 (SYSV derivative)

Damn near any interesting operation on a file runs afoul of some NFS
implementation or another.  NFS should be junked.
 
 
 

file(1) vs utime(3)

Post by Doug Gw » Sat, 24 Aug 1991 17:48:26



>The "file" utility does indeed restore atime and mtime (only atime was
>changed).  It does not affect the ctime of the inode, so there is no
>need to "reset" it.

Oops!  As somebody else pointed out, the ctime was not changed UNTIL
"file" tried to restore the "atime", which causes a "ctime" change.
One more argument in favor of "file" not trying to be so darn clever.
 
 
 

file(1) vs utime(3)

Post by Boyd Rober » Sat, 24 Aug 1991 21:20:10



> [...] using "utime()" to fiddle with the access
> time is a System V-ism.

Correct.

I read that line and wept.  That sort of stuff really makes you wonder.


``When the going gets weird, the weird turn pro...''

 
 
 

file(1) vs utime(3)

Post by Bob Wakehou » Sun, 25 Aug 1991 02:46:24


[In regard to this comment from me:]

<< Running file(1) is not considered an access of the file, probably on
<< the same concept that ls(1) is not considered an access of the file.

> What in the WORLD are you talking about?  The "ls" utility does NOT read
> any files other than directories, therefore there is no "access" to be
> undone.

Poor syntax on my part.  I did not, and did not mean to, claim that "ls"
accesses files.

In a sense, both "ls" and "file" exist to provide information about the
file.  It can be argued that "file" is primarily intended to identify the
"contents", while "ls" is primarily intended to identify the "container",
yet the intent of both is to provide information about the file or contents,
not to affect the file or contents.  It is incidental that "file" must
access data to determine its type (as opposed to, say, having a "type"
entry in the inode).

A perspective difference, I guess.  I suppose "file" can be thought of
either as "tell me what kind of file that is" or as "read that file and
tell me what kind of data you see".

I'm not wholeheartedly endorsing anything.  I just don't think "file's"
behavior is necessarily so terribly irrational, nor that access times
are reliably sacred even without "file's" meddling.

Bob Wakehouse

Beaverton, Oregon

 
 
 

file(1) vs utime(3)

Post by Leslie Mikese » Sun, 25 Aug 1991 00:05:22



>>The "file" utility does indeed restore atime and mtime (only atime was
>>changed).  It does not affect the ctime of the inode, so there is no
>>need to "reset" it.
>Oops!  As somebody else pointed out, the ctime was not changed UNTIL
>"find" tried to restore the "atime", which causes a "ctime" change.
>One more argument in favor of "file" not trying to be so darn clever.

Yes, changing ctime (which you must do as a side effect of pretending
you didn't read a file by diddling its atime) is a truly horrible
thing to do.  Anyone trying to do incremental backups should be basing
them on ctime since otherwise you don't catch files that have been
renamed.  You probably don't want to add your whole disk to the next
incremental just because someone is browsing around with "file".

BTW, this is a generic problem of unix, since making backups does
the same thing.  Either you lose track of "real" accesses by letting
cpio modify the atime with a normal read, or you lose the ability
to make usable incrementals by using the -a option to reset the atime.
Going across a read-only network mount seems like the only way to
currently get this right.

Les Mikesell
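
The argument for keying incrementals on ctime can be illustrated with a small selection routine. This is a sketch under stated assumptions, not how dump or cpio actually choose files: `changed_since` and `demo` are hypothetical names, and a chmod stands in for the rename mentioned above because it is a metadata-only change that reliably updates ctime on POSIX systems.

```python
import os
import tempfile
import time


def changed_since(root, since_ns, field="st_ctime_ns"):
    """List paths under `root` whose chosen timestamp is newer than `since_ns`."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if getattr(st, field) > since_ns:
                hits.append(path)
    return sorted(hits)


def demo():
    """Compare mtime-based and ctime-based selection after a chmod."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as fh:
            fh.write(b"payload")
        time.sleep(0.05)
        last_backup_ns = time.time_ns()   # pretend a backup ran at this moment
        time.sleep(0.05)
        os.chmod(path, 0o400)             # metadata-only change, data untouched
        by_mtime = changed_since(tmp, last_backup_ns, "st_mtime_ns")
        by_ctime = changed_since(tmp, last_backup_ns, "st_ctime_ns")
        return len(by_mtime), len(by_ctime)


if __name__ == "__main__":
    print(demo())
```

The same comparison shows the cost: anything that bumps ctime, including a utime call made only to hide a read, puts the file into the next ctime-based incremental.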

 
 
 

file(1) vs utime(3)

Post by Geoff Cla » Sun, 25 Aug 1991 01:54:16



>  Reading the file will change its access time and that
>will change the inode change time.

Actually, reading only updates the access time, not the inode change time.

This reminds me of something that has been niggling me for a long time.
Why is it that read() updates only st_atime, but write() updates both
st_mtime and st_ctime?  Seems inconsistent to me, and I can't see any
reason for write() to update st_ctime, but I suppose there must have been
one.
--

UniSoft Limited, London, England.   Tel: +44 71 729 3773   Fax: +44 71 729 3273
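
The read/write asymmetry described in this post can be observed with a few stat snapshots. A minimal sketch, assuming a POSIX filesystem; `stamps`, `changed` and `demo` are names invented here. Whether the read visibly updates atime depends on mount options such as noatime, so the sketch only reports which fields moved.

```python
import os
import tempfile
import time


def stamps(path):
    """Snapshot the three timestamps of `path` in nanoseconds."""
    st = os.stat(path)
    return {"atime": st.st_atime_ns,
            "mtime": st.st_mtime_ns,
            "ctime": st.st_ctime_ns}


def changed(before, after):
    """Names of the timestamps that differ between two snapshots."""
    return {k for k in before if before[k] != after[k]}


def demo():
    """Return (fields changed by a read, fields changed by a write)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "f.txt")
        with open(path, "w") as fh:
            fh.write("first\n")
        time.sleep(0.05)
        s0 = stamps(path)
        with open(path) as fh:
            fh.read()                 # read: at most atime should move
        s1 = stamps(path)
        time.sleep(0.05)
        with open(path, "a") as fh:
            fh.write("second\n")      # write: mtime and ctime should move
        s2 = stamps(path)
        return changed(s0, s1), changed(s1, s2)


if __name__ == "__main__":
    after_read, after_write = demo()
    print("read changed:", sorted(after_read))
    print("write changed:", sorted(after_write))
```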

 
 
 

file(1) vs utime(3)

Post by Rahul Dhe » Sun, 25 Aug 1991 05:54:16



(Leslie Mikesell) writes:
>BTW, this is a generic problem of unix, since making backups does
>the same thing.  Either you lose track of "real" accesses by letting
>cpio modify the atime with a normal read, or you lose the ability
>to make usable incrementals by using the -a option to reset the atime.

(Aside:  It's much more efficient to do backups by reading the raw
device, and it doesn't change inode timestamps.)

The real question is this:  If we agree that certain accesses aren't
really accesses, then are we not opening a Pandora's box?

Hypothetical conversation follows.  Any resemblance to any operating
system, living or dead, may or may not be coincidental.

   "Why does file(1) restore access times?"

   "Because file doesn't really access the file.  Well, I mean, it does,
    but it's not supposed to, and the user doesn't want the access time of
    the file to change."

   "So you think access times should be updated only when you make an
    access that you are supposed to, but not when you make an access that
    you aren't supposed to?"

   "Yes, but not quite.  Let's rephrase that.  Access times should not be
    updated when you make an access that you wish you didn't have to."

   "Hmmm...that is rather subjective, isn't it?  I wish I didn't have to
    access any file at all, since I get all of them over NFS.  So when I
    access them should the access times not be updated?"

   "No, no, you are misunderstanding.  There are some things, like
    deciding what the contents of a file are -- as the file(1) utility
    does -- that should not really require you to access the file.  There
    are other things, like compiling C source, which really do require you
    to access the file."

   "So you think you shouldn't have to look inside a file to figure out
    what it contains?"

   "Er...no."

   "Well, I don't think I should have to look inside a file to compile
    it.  When I get something off Usenet, all I want to do is compile it
    and make it run.  I don't care what the contents of the files are."

   "Well, I suppose in that case there should be a switch that tells the
    C compiler to reset access times."

   "Wouldn't that cause my C sources to be backed up just because I
    recompiled them?"

   "Maybe we need a fourth time field for backups.  Then, when you reset
    the access time field, it would update the file status change time
    field, but not the backup time field."

   "What would update the backup time field?"

   "It would be updated by the same things that updated the status change
    time field, *except* the utime system call."

   "Great idea!"

   "Wait a minute, I have a better idea.  Why don't we change the utime
    system call so it never updates the status change time?"

   "Another great idea!  Both ought to be in System V Release 5.  Which
    one will be used can depend on an option in /etc/fstab, and we can
    provide a compatibility library so old stuff will still work."

   "And then we can fix file(1) so it doesn't change access times
    any more!"

   "No, no!  It still needs to reset access times.  What will happen
    in SVR5 is that the resetting of the access time won't update the
    backup time field."

   "Do you think SVR5 will also provide a switch for the C compiler
    so it won't change access times?"

   "Why not?  We could make it the default behavior."
--

UUCP:  oliveb!cirrusl!dhesi
"You're even nuttier than we've come to expect of you." -- Doug Gwyn

 
 
 

1. utime vs utimes

My application is sort of a backup program. It stores a file's contents
in a database along with its metadata (owner/mode/mtime/etc) on a server
machine. At some later time, if needed, the file is restored to its
original state.

I've realized the current implementation is throwing away any sub-second
resolution in the last-modified time. Some platforms store both seconds
and microseconds (or nanoseconds in the case of Solaris), but the
st_mtime member retrieved by stat() only has the seconds part.

So I'm looking for two things: (1) the most portable/standard way of
getting the micro/nano second part from the stat structure and (2) the
best way to set that metadata back onto the newly restored file.

For the 'get' part, it looks like there's no standard API. Looks like a
bunch of ugly ifdefs are required. Does anyone have a sample piece of
code that handles the major platforms or just shows the way?

The 'set' part raises a question about the intention of SUS. The utime()
function is in the standard but is explicitly limited to seconds (SUSv3:
"The times in the structure utimbuf are measured in seconds since the
Epoch."). The utimes() function is also in SUS and allows microsecond
resolution ("The times in the timeval structure are measured in seconds
and microseconds since the Epoch"), but this interface is marked LEGACY
with a usage note indicating "For applications portability, the utime()
function should be used instead of utimes()". I don't get it - why would
the more capable interface be retired in favor of a dumber one? Is there
a non-legacy interface that supersedes utimes() and supports setting
sub-second timestamps?

--
Thanks,
M.Biswas
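
For what it is worth, later revisions of the standard than the SUSv3 text quoted above did add nanosecond interfaces: as far as I know, POSIX.1-2008 introduced futimens() and utimensat() and the struct timespec members (such as st_mtim) in struct stat. The sketch below shows the same save-and-restore idea through Python's wrappers, which expose integer nanoseconds; `save_times`, `restore_times` and `demo` are hypothetical names, and the precision actually kept still depends on the filesystem.

```python
import os
import tempfile
import time


def save_times(path):
    """Capture access and modification times as integer nanoseconds."""
    st = os.stat(path)
    return st.st_atime_ns, st.st_mtime_ns


def restore_times(path, saved):
    """Apply previously saved (atime_ns, mtime_ns) without float rounding."""
    os.utime(path, ns=saved)


def demo():
    """Save times from one file and restore them onto a fresh copy."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.dat")
        restored = os.path.join(tmp, "restored.dat")
        with open(original, "wb") as fh:
            fh.write(b"contents")
        saved = save_times(original)
        time.sleep(0.05)              # make the copy's own mtime differ
        with open(restored, "wb") as fh:
            fh.write(b"contents")
        restore_times(restored, saved)
        return saved[1], os.stat(restored).st_mtime_ns


if __name__ == "__main__":
    want, got = demo()
    print("mtime restored exactly:", want == got)
```

Keeping the value as an integer count of nanoseconds avoids the rounding that a floating-point seconds value would introduce when stored in a database and written back.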
