mmap and external file accesses

Post by Conrad Sabatie » Wed, 08 May 2002 13:13:22



I'm about to take my first stab at using mmap() and friends in a project
I'm working on.  One thing I'm wondering about, though:  what happens if
some other process tries to write to a file that is currently mmapped?  
Will the image in memory be automatically updated?  Will the external
writes go unnoticed by the process using mmap?  What happens if I try to
do a subsequent msync() after the external process has written something?
How does one avoid any possible calamities?

Thanks for any insights.
--

 
 
 

Post by David Schwart » Wed, 08 May 2002 15:06:23



> I'm about to take my first stab at using mmap() and friends in a project
> I'm working on.  One thing I'm wondering about, though:  what happens if
> some other process tries to write to a file that is currently mmapped?

        Then the file is modified.

> Will the image in memory be automatically updated?

        There is no image in memory. The file itself is mapped into memory.

> Will the external
> writes go unnoticed by the process using mmap?

        I'm not sure I understand what you mean. The file will be modified. The
file itself (its contents) is mapped into the process's memory. The
'mmap' function actually maps a file's contents into the memory space of
a process (assuming a shared mapping).

> What happens if I try to
> do a subsequent msync() after the external process has written something?

        Read the man page for 'msync' closely. It has nothing to do with the
mapping itself. It flushes changes made to the cached copy of the file
to disk.
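
A minimal sketch of that use of msync(), assuming a POSIX system. The
helper name update_and_flush() is hypothetical and only for illustration;
note that it resizes the file to the length of the text before mapping it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store `text` at the start of `path` through a
 * shared mapping, then ask the kernel to write the dirty pages to disk.
 * Returns 0 on success, -1 on error. */
int update_and_flush(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    size_t len = strlen(text);
    if (len == 0)
        return -1;                  /* mmap() rejects zero-length mappings */

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Set the file size to len so the whole mapped range is backed by
     * the file (this truncates or extends an existing file). */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, text, len);           /* modify the file through memory */

    /* MS_SYNC blocks until the changed pages have been written out. */
    int rc = msync(p, len, MS_SYNC);

    munmap(p, len);
    return rc;
}
```

Calling update_and_flush("/tmp/demo.txt", "hello") should leave a 5-byte
file containing "hello"; the path is just an example.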

> How does one avoid any possible calamities?

        The same way you do for the write and read functions. The 'mmap' system
call just makes file access easier; it doesn't really change the
semantics.

        DS

 
 
 

Post by Conrad Sabatie » Wed, 08 May 2002 21:13:45





>> I'm about to take my first stab at using mmap() and friends in a project
>> I'm working on.  One thing I'm wondering about, though:  what happens if
>> some other process tries to write to a file that is currently mmapped?

>    Then the file is modified.

>> Will the image in memory be automatically updated?

>    There is no image in memory. The file itself is mapped into memory.

>> Will the external
>> writes go unnoticed by the process using mmap?

>    I'm not sure I understand what you mean. The file will be modified. The
>file itself (its contents) are mapped into the process' memory. The
>'mmap' function actually maps a file's contents into the memory space of
>a process (assuming a share mapping).

Hmmm, OK.  I think I had the wrong mental picture of how mmap works.  I
was assuming that the data in memory would only reflect the file's
contents at the time it was first mmapped (plus any later changes made by
the program using mmap).

So you're saying then that mmap dynamically updates the in-memory
version of the file, regardless of which process is writing to the file,
correct?

Interesting.
--

 
 
 

Post by Martin Jos » Wed, 08 May 2002 22:20:33



> So you're saying then that mmap dynamically updates in the in-memory
> version of the file, regardless of which process is writing to the file,
> correct?

> Interesting.

Beware!
This depends on the OS you are using.
OSes with a shared virtual-memory/buffer cache will behave like this.
FreeBSD has done this for a long time; Linux has too, though not for as
long as FreeBSD (IIRC).
HP-UX, OTOH, doesn't do it at all up to 11.00. There is now a patch for
11.00 that makes the changes visible. (I don't know about 11i.)
You see: "It depends". Probably the best way to go is to have a good
look at the man page.

HTH

Martin

 
 
 

Post by Andrew Giert » Wed, 08 May 2002 20:55:10


 Conrad> I'm about to take my first stab at using mmap() and friends
 Conrad> in a project I'm working on.  One thing I'm wondering about,
 Conrad> though: what happens if some other process tries to write to
 Conrad> a file that is currently mmapped?  Will the image in memory
 Conrad> be automatically updated?

That depends on the system.

There are basically two cases. The first is those systems which have
what's generally known as "unified" VM systems; these systems use the
same physical memory for mapped file pages that they use for normal
filesystem access to the same file. Therefore changes made via the
mapped region are immediately visible to read(), and changes made via
write() are immediately visible through the mapped region. msync() is
not needed on such systems unless you need to know that data has been
physically written to disk (like fsync() for files).

(It's also possible to get this level of consistency without actually
having a fully unified VM, but I'm not aware of any significant
systems that do so.)

However, this behaviour is _NOT_ guaranteed by standards, and a number
of systems don't support it. The standards require an appropriate
intervening msync() call before updates made through the mapped region
are guaranteed to be visible via read(), and before updates made via
write() are visible in the mapped region. This is often a source of
problems when porting code written originally on a unified-VM system
to a non-unified one.
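
A rough sketch of the portable pattern described above, assuming a POSIX
system. The helper name roundtrip() is hypothetical, and the fsync() plus
MS_INVALIDATE combination for the write() direction is my reading of the
msync() specification rather than something verified on a non-unified
system.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: check both directions of visibility, with an
 * msync() in between each time. Returns 0 if both updates were seen,
 * -1 on error or if an update was not visible. */
int roundtrip(const char *path)
{
    assert(path != NULL);

    const size_t len = 16;
    char buf[4];
    int ok = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Direction 1: store through the mapping, msync(), then read. */
    memcpy(p, "map!", 4);
    if (msync(p, len, MS_SYNC) < 0)
        goto out;
    if (pread(fd, buf, 4, 0) != 4 || memcmp(buf, "map!", 4) != 0)
        goto out;

    /* Direction 2: write through the descriptor, push it to storage,
     * then invalidate cached mapped pages before loading from memory. */
    if (pwrite(fd, "file", 4, 4) != 4)
        goto out;
    if (fsync(fd) < 0)
        goto out;
    if (msync(p, len, MS_SYNC | MS_INVALIDATE) < 0)
        goto out;
    if (memcmp(p + 4, "file", 4) != 0)
        goto out;

    ok = 0;
out:
    munmap(p, len);
    close(fd);
    return ok;
}
```

On a unified-VM system the msync() calls here change nothing you can
observe; they only matter for portability and for durability on disk.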

--
Andrew.

comp.unix.programmer FAQ: see <URL: http://www.erlenstar.demon.co.uk/unix/>
                           or <URL: http://www.whitefang.com/unix/>

 
 
 

Post by Nils O. Sel?sd » Wed, 08 May 2002 23:47:56




>> So you're saying then that mmap dynamically updates in the in-memory
>> version of the file, regardless of which process is writing to the file,
>> correct?

>> Interesting.

> Beware !
> This depends on the OS you are using.
> OSes using shared virtual-memory/buffer cache will behave like this.
> FreeBSD does this since long ago, Linux since not so long ago as
> FreeBSD (IIRC)

And it also depends on how you mmap it, e.g. shared or private.
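
To illustrate the difference, here is a small sketch assuming a POSIX
system. The helper first_byte_after_store() is a hypothetical name: it
stores a byte through a mapping created with the given flag and then
reads the file back through the descriptor.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` holding the byte 'a', store 'X'
 * through a mapping made with map_flags (MAP_SHARED or MAP_PRIVATE),
 * then return the file's first byte as seen by pread().
 * Returns -1 on error. */
int first_byte_after_store(const char *path, int map_flags)
{
    assert(path != NULL);

    int result = -1;
    char c;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (pwrite(fd, "a", 1, 0) != 1) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, map_flags, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 'X';   /* shared: changes the file; private: changes a copy */

    /* For the shared case, msync() so the result does not rely on a
     * unified buffer cache. */
    if ((map_flags & MAP_SHARED) && msync(p, 1, MS_SYNC) < 0)
        goto out;

    if (pread(fd, &c, 1, 0) == 1)
        result = (unsigned char)c;
out:
    munmap(p, 1);
    close(fd);
    return result;
}
```

With MAP_SHARED the store reaches the file, so the helper returns 'X';
with MAP_PRIVATE the store goes to a private copy-on-write page and the
file still holds 'a'.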
 
 
 

Post by Conrad Sabatie » Thu, 09 May 2002 11:27:02






>>> So you're saying then that mmap dynamically updates in the in-memory
>>> version of the file, regardless of which process is writing to the file,
>>> correct?

>>> Interesting.

>> Beware !
>> This depends on the OS you are using.
>> OSes using shared virtual-memory/buffer cache will behave like this.
>> FreeBSD does this since long ago, Linux since not so long ago as
>> FreeBSD (IIRC)
>And it also depends on how you mmap it, e.g. shared or private..

Right.  I was planning on using the MAP_SHARED flag, as my whole purpose
in wanting to use mmap for this particular project is to make it easier
for multiple forked copies of the same process to manage a certain file.

Fortunately, I'm doing this under FreeBSD, too.  :-)  While portability is
certainly something I may be interested in later on, for now, I'm just
looking for the best performance on my own system.

I'll definitely write some test code first to make sure I have a handle on
this stuff before incorporating it into my program.
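
A minimal sketch of what such test code might look like, assuming a POSIX
system. The helper name child_write_visible() and the value 42 are made up
for the illustration; a real program with several writers would also need
locking, just as it would with read() and write().

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical test helper: map an int-sized file with MAP_SHARED, fork,
 * let the child store 42 through the inherited mapping, and return the
 * value the parent sees after the child has exited. Returns -1 on error. */
int child_write_visible(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)sizeof(int)) < 0) {
        close(fd);
        return -1;
    }

    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (counter == MAP_FAILED)
        return -1;

    *counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(counter, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        *counter = 42;              /* child writes through the mapping */
        _exit(0);
    }

    int status;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid)
        seen = *counter;            /* parent reads the same shared page */

    munmap(counter, sizeof(int));
    return seen;
}
```

Here waitpid() is the only synchronization, which is enough for a single
writer that has already exited. Concurrent forked copies would need
something like fcntl() record locks or a semaphore around their updates.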

Thanks for all the feedback, folks!
--

 
 
 

Post by Kurtis D. Rade » Thu, 09 May 2002 13:02:31




> (It's also possible to get this level of consistency without actually having
> a fully unified VM, but I'm not aware of any significant systems that do
> so.)

Historical footnote: the DYNIX/ptx operating system from Sequent Computer
Systems (now part of IBM) works that way. I'd certainly consider DYNIX/ptx
significant inasmuch as quite a few of the largest databases in the world
ran on that OS, although I'll grant you the number of systems running it
never approached the numbers for AIX, SunOS or HP-UX. And historical insofar
as the OS is no longer under active development (end of service support is
currently scheduled for the end of 2006).
 
 
 

1. mmap for fast file access?

Hi,

I have an application that involves parsing a large number of text
files (say 1000 of say 60-100Mb each) and I want to do this as quickly
as possible. At the moment I just read the files line by line into a buffer
(using the C++ getline function). I'd like to speed this up by reading
in the files using memory mapping instead, and the Compaq realtime
programmers guide is pretty clear on how to do this using mmap, but:

1. How much speedup am I likely to get by using memory mapping?

2. When reading a file sequentially, is it faster to mmap it into
memory page by page as needed, or mmap it all at once (presumably the latter,
if you  have enough memory)?

3. Would it be faster to use a shared memory approach instead, e.g. shmat?

The machine I'm likely to run this on uses Tru64 UNIX 5.1.

Thanks in advance,

Tony Cox

--
 -------------------------------------------------------------
  Dr. Anthony J. Cox              -.   .-. .-.   .-. .-.   .-
  Informatics Division,           ||\ /|||X|||\ /|||X|||\ /||  
  The Sanger Centre,              |||X|||/ \|||X|||/ \|||X|||
  Hinxton, Cambridge CB10 1SD, UK `-' `-'   `-' `-'   `-' `-'
   -.   .-. .-.   .-. .-.   .-
   ||\ /|||X|||\ /|||X|||\ /||    ac2 at sanger dot ac dot uk
   |||X|||/ \|||X|||/ \|||X|||    
   `-' `-'   `-' `-'   `-' `-'
 -------------------------------------------------------------
