> > If, however, you are doing something like this
> > start app
> > (performance OK)
> > cp .....
> > (performance reduced)
> Yes, it's mainly this third scenario but:
> start app
> (Performance OK)
> cp .... &
> (Performance very poor (but this seems reasonable))
Here you'll be reading stuff from disk and probably paging some of
your application out.
> ... cp finished
> (Performance still reduced !)
With lots of RAM, cp will finish with most of the data it wrote still
sitting dirty in memory - as your application runs and touches its pages,
that dirty data has to be written out before your app can be paged back
in, hence the slowdown after cp has finished.
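You can actually watch that backlog of dirty (written-but-not-yet-flushed)
data on Linux via /proc/meminfo - run this before, during, and after the
cp and you should see Dirty grow during the copy and drain afterwards
(field names are from /proc/meminfo; exact fields vary by kernel version):

```shell
# Show how much page cache data is dirty (pending write-out) and
# currently being written back.  Values are in kB.
grep -E '^(Dirty|Writeback):' /proc/meminfo
```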
> I had no clue for this strange behaviour, so I wrote a program
> which creates a large file (kill.me). Then it makes random reads and
> writes to that file. The following happens:
> start app
> (Perf. OK)
> start disktest
> (Perf. very poor)
> ...disktest finished
> (Perf. reduced)
OK - up to this point you're in the same situation as with the cp.
> del kill.me
> (Perf. OK !!)
Now you've just marked all the dirty pages belonging to kill.me as not
dirty anymore. When you need them they can just be zeroed out and
given to you (or filled from disk) without needing to write their
contents out to disk first.
> If only 64MB are installed, the performance reaches normal level shortly
> after disktest exits.
I'd still like to see some absolute figures - I'm pretty sure that
with 64MB performance will be uniformly lower than with 512MB.
> > If you are in this scenario then look at what the mlock call does -
> > you can use it to "fix" some pages in memory and they won't get paged
> > out when you do the cp.
> I can't try mlock --- the app. came without source :-|
Ahh, I see <insert lecture about why source availability is a "good thing">
OK, then either add a sync after the copy - this will write all the dirty
pages out and they'll then be reclaimable again. Performance will still be
low during the cp+sync, and you'll still take a hit as the system pages the
app back in, but it should all be over fairly quickly.
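The copy-plus-sync step might look like this (file names are illustrative,
and dd just stands in for whatever large file you're really copying):

```shell
# Copy, then force every dirty page out to disk right away, so the
# page cache is clean (and instantly reclaimable) once this returns.
dd if=/dev/zero of=/tmp/bigfile bs=1M count=8 2>/dev/null
cp /tmp/bigfile /tmp/bigfile.bak
sync                     # flush all dirty pages now, not lazily later
rm -f /tmp/bigfile /tmp/bigfile.bak
```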
If the application is dynamically linked you may be able to use LD_PRELOAD
to wrap the brk/sbrk and mmap calls and lock the process's memory in RAM.
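A minimal sketch of that idea, assuming Linux/glibc: rather than wrapping
brk/sbrk/mmap one by one, it's simpler to call mlockall with MCL_FUTURE
from a constructor, which locks everything the process maps from then on
(subject to RLIMIT_MEMLOCK; file names below are illustrative):

```c
/* lockall.c - build and use like so:
 *   gcc -shared -fPIC -o lockall.so lockall.c
 *   LD_PRELOAD=./lockall.so ./your_app
 */
#include <stdio.h>
#include <sys/mman.h>

/* Runs before the application's main().  MCL_FUTURE covers memory the
 * app later obtains via brk/sbrk or mmap, so no per-call wrappers are
 * needed.  Fails harmlessly (with a message) if RLIMIT_MEMLOCK is low. */
__attribute__((constructor))
void lock_everything(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("lockall: mlockall");
    else
        fputs("lockall: process memory pinned in RAM\n", stderr);
}
```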
The unified cache is good when you don't know in advance whether programs
will use lots of memory or do lots of block I/O, but it hurts in cases such
as yours where you want to attach a specific priority to the use of memory.
Maybe the answer is a memory priority value similar to process "niceness",
which would allow you to run your process at a higher memory priority than
the cp.