>The idea is that it makes only a single write to the file system, rather
>than a second one to update the inode. The timestamp etc. is only updated
>every x number of writes.
>I have implemented it on Sequent platforms but never noticed a measurable
>difference, probably because my systems are I/O bound on reads rather than
>writes.
There's more to it than that. At least, once upon a time there was.
I am feeling rather energetic this evening, and was able to locate the
following email on the subject. The author would seem to be in a position
to know what he is talking about.
Karl
-----------------------
Date: Sun, 15 Oct 1995 16:48:21 +0000 (GMT)
From: uunet!sequent.com!mikeg (Mike Glendinning)
Subject: Re: II_DIRECT_IO
To: uunet!MATH.AMS.ORG!info-ingres
Organization: Sequent Computer Systems, Inc.
Newsgroups: comp.databases.ingres
Status: OR
A few weeks ago, somebody was asking about II_DIRECT_IO. I've long lost
the original post, but here's the answer anyway!
II_DIRECT_IO is an environment variable that is only available on Sequent
Dynix/PTX platforms (our variant of Unix). It determines whether Ingres
uses normal file system I/O calls (read, write) for database operations or
the Sequent specific direct I/O library (DIO_Read, DIO_Write).
If the variable II_DIRECT_IO is set to 'y' (a lowercase 'y' - nothing
else will do!) then the direct I/O calls will be used. Otherwise, the
standard Unix system calls are used.
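
As an illustration of how strict that test is, here is a minimal sketch in Python. It is not Ingres code: the function name direct_io_enabled and the optional env argument are hypothetical, and the only thing taken from the text above is that the value must be exactly a lowercase 'y'.

```python
import os


def direct_io_enabled(env=None):
    """Return True only when II_DIRECT_IO is exactly the lowercase string 'y'.

    Hypothetical helper: it mirrors the rule described above, not the
    actual check performed inside Ingres.
    """
    env = os.environ if env is None else env
    # 'Y', 'yes', ' y' and an unset variable all fall through to regular I/O.
    return env.get("II_DIRECT_IO") == "y"
```

Passing a dictionary instead of the real environment makes it easy to try out values such as 'Y' or 'yes', both of which would leave the standard system calls in use.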
You can determine whether direct I/O is in use by looking at the II_RCP.LOG
file. There will be a line "IIdio: direct i/o" or "IIdio: regular file i/o"
as appropriate.
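
A quick way to automate that check is sketched below in Python. The two marker strings are the ones quoted above; the function name dio_mode_from_log is hypothetical, and the location of II_RCP.LOG is left to the caller because it depends on the installation.

```python
def dio_mode_from_log(lines):
    """Scan log lines and report which I/O mode was logged last.

    Returns "direct", "regular", or None when neither marker is present.
    Hypothetical helper; only the two marker strings come from the post.
    """
    mode = None
    for line in lines:
        if "IIdio: direct i/o" in line:
            mode = "direct"
        elif "IIdio: regular file i/o" in line:
            mode = "regular"
    return mode
```

It accepts any iterable of strings, so an open file object for II_RCP.LOG can be passed directly. The last marker wins, on the assumption that the log may hold entries from more than one startup.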
The Sequent direct I/O calls effectively bypass the Unix buffer cache
mechanism and read/write directly to disk through the SCSI disk device
driver. The interface also supports true asynchronous I/O, at least on
raw disk partitions. There are a number of issues.
Without direct I/O, Ingres must use the O_SYNC flag to force Unix 'write'
operations to be synchronous (that is, the data is forced to disk, and
not just written to the buffer cache). A performance issue here is that
with the Berkeley Fast File System (which most Unix vendors use), the O_SYNC
flag also forces an update of the I-Node (the on-disk data structure describing
the file). Thus, each 'write' to disk becomes two writes! Direct I/O avoids
this synchronous I-Node update by deferring it, just as with regular (non
O_SYNC) writes. Sun's versions of Unix (SunOS, Solaris) have another way
around this problem, involving the "sticky" bit on the file, which Ingres
uses also.
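
For readers unfamiliar with O_SYNC, the following Python sketch shows the non-direct path only: a file opened with O_SYNC, so that each write returns only after the data has been pushed to disk. The function name write_page_sync and its arguments are hypothetical, and this is not how Ingres itself is written. The Sequent DIO_Write call is not shown because its signature is not given here.

```python
import os


def write_page_sync(path, offset, page):
    """Write `page` (bytes) at `offset` using a synchronous file descriptor.

    With O_SYNC the write() call blocks until the data is on disk; on the
    file systems described above this may also force an inode update.
    Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.write(fd, page)
    finally:
        os.close(fd)
```

Timing a loop of such writes against the same loop without os.O_SYNC gives a rough feel for the cost of synchronous writes on a given file system, although the result will vary widely between systems.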
Another issue is that the regular system calls only allow a single 'write'
operation to be issued concurrently against the same file (although we do
allow multiple readers). Direct I/O avoids this restriction, on the
assumption that the database system is in control and knows what it is doing!
Consequently, there is no contention for the I-Node lock when multiple
processes attempt to write to the same file.
Actually, for various reasons, Ingres always uses ordinary 'read' calls
in conjunction with direct I/O writes. That is, 'read' is always used
instead of 'DIO_Read'. The direct I/O library keeps the buffer cache
in synchronisation with direct I/O operations, so this is safe.
If you perform a single user test with and without direct I/O, you may
well find that direct I/O is slightly slower (uses more CPU). I think
this is due to the fact that Ingres does not guarantee alignment of the
I/O buffers on a 16-byte boundary (this is a requirement for the SCSI
disk driver that is exposed by the direct I/O interface). The bottom
layer of Ingres' DMF must therefore copy the data to an aligned buffer
before issuing the I/O.
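
The extra copy can be pictured with the Python sketch below, which over-allocates a buffer, finds the first 16-byte-aligned address inside it and copies the data there. It only illustrates the general technique: the name copy_to_aligned is hypothetical, and nothing here is taken from the DMF source.

```python
import ctypes

ALIGNMENT = 16  # boundary mentioned above for the SCSI disk driver


def copy_to_aligned(data, alignment=ALIGNMENT):
    """Copy `data` (bytes) into a buffer whose address is a multiple of `alignment`.

    Over-allocates by alignment - 1 bytes, then skips forward to the first
    aligned address. The returned ctypes array keeps the backing buffer alive.
    """
    backing = ctypes.create_string_buffer(len(data) + alignment - 1)
    skip = (-ctypes.addressof(backing)) % alignment
    aligned = (ctypes.c_char * len(data)).from_buffer(backing, skip)
    ctypes.memmove(aligned, data, len(data))  # this is the extra copy
    return aligned
```

The memmove call is the cost being discussed: it is paid on every I/O whose buffer does not already start on an aligned address, which is why allocating aligned buffers in the first place would avoid it.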
This would be worth somebody in CA looking at and fixing if necessary!
The last thing that 6.4 needs is *another* copy of the data on its route
to/from disk!
For a busy, typical Sequent system with many servers, slaves and users,
direct I/O will almost certainly be preferable, and I recommend systems
are set up that way.
Oracle uses the same techniques as Ingres, although there are also options
to force direct I/O reads, and for raw partitions to use the "real"
asynchronous I/O facilities of the direct I/O interface.
If you have Dynix/PTX 4.1.x, the direct I/O library specification is now
public and you can see it by typing "man DIO". Previously it was only
available to our database software partners.
----------------------------------