How to increase write speed to local hard drive?

Post by Daniel Roc » Tue, 27 Jun 2006 06:41:44





>>If you want to forcibly turn write cache off you have to put:

>>set ata:ata_write_cache = -1

>>in /etc/system

> Which has no effect in Solaris 8 or 9 or 10 (except perhaps some
> updates) or on SPARC.

Hmm, the following file:

server:/mnt/platform/i86pc/kernel/drv# what ata
ata:
        SunOS 5.10 Generic_118844-26 November 2005

defines the variable and the function. Maybe they are not connected in the
source tree, but they definitely are there:

server:/mnt/platform/i86pc/kernel/drv# nm ata | grep write_cache
[164]   |     29471|     233|FUNC |LOCL |0    |1      |ata_set_write_cache
[444]   |       624|       4|OBJT |GLOB |0    |4      |ata_write_cache

This is extracted from the x86.miniroot (S10U1).

--
Daniel

 
 
 

Post by tunl » Tue, 27 Jun 2006 07:32:27



> With 8 MB it seems that you are talking about the hard drive cache. Data
> stays in this cache no longer than a few ms, until one rotation is done.
> IDE drives normally use it only for some cylinder read-ahead and
> write-behind caching.

> SCSI disks are more stupid here (this was called clever in the past, when
> operating systems were stupid). But normally the time is very low,
> measured in milliseconds.

Sigh...

Read-ahead is fine; there have never been any problems with that.
Write-behind caching is the potential black hole.
During the milliseconds when data is in the cache, it is at risk,
and if you have a busy system SOME DATA IS almost ALWAYS
in the CACHE.
Hence it does not matter that a single record spends just a few ms in the
cache.

THE POINT IS that WHENEVER you have a power failure you WILL lose
the data that's in the cache at that time.
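
A rough back-of-the-envelope sketch of this point, in Python. It applies Little's law (amount in flight = arrival rate x time in the cache); the write rate and residence time are assumed, illustrative numbers, not measurements from any particular drive:

```python
def data_at_risk_mb(write_rate_mb_s, residence_ms, cache_mb=8.0):
    """Average amount of not-yet-written data held in the drive cache.

    Little's law: amount in flight = arrival rate * time spent in the cache,
    capped at the size of the cache itself.
    """
    in_flight = write_rate_mb_s * (residence_ms / 1000.0)
    return min(cache_mb, in_flight)


# Assumed, illustrative numbers: a busy system writing 20 MB/s, with each
# write spending about 10 ms in an 8 MB drive cache.
at_risk = data_at_risk_mb(20.0, 10.0)
print(f"about {at_risk:.2f} MB is sitting in the cache at any instant")
```

However short the stay of a single record, a steady write load keeps the cache non-empty, so a power failure at a random moment finds something in it.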

If we are talking about a logging file system like ext3, this is the
difference between a log-replay restart and a FULL fsck restart, not
counting the data that has vanished into the great bit bucket.

This is why I say that data security is sacrificed for speed in cheap
PC solutions.

Unless this cache is battery-backed, of course, as in BIG SAN arrays.
But we are discussing $50 IDE disks, and they don't have battery-backed
write caches. And they are very seldom on a UPS power supply,
because the UPS costs more than the PC.

   //Lars

 
 
 

Post by llotha » Tue, 27 Jun 2006 14:21:41



> But we are discussing $50 IDE disks, and they don't have battery-backed
> write caches. And they are very seldom on a UPS power supply,
> because the UPS costs more than the PC.

???
You should change your hardware shop.
I have a small UPS with a battery that keeps my headless U10 server up
for one hour. This costs 60 euros. It has a serial port interface that
allows a clean shutdown.

On my workstation, with its power-hungry CPU, I have the same UPS
model. It lasts only a few minutes, but I do a suspend-to-disk
immediately on power failure, so it only needs to stay up for a few seconds.

If you have sensitive data you should always have at least such a
low-cost solution.

 
 
 

Post by llotha » Tue, 27 Jun 2006 14:28:27


> And milliseconds count, in fact: it means that every time the system
> does a write it needs to wait until the write is safely on disk if it
> wants to be safe, which perhaps takes several ms, which might easily be
> more than the time to actually write the data.
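
To put rough numbers on the quoted claim, here is a small Python sketch comparing the rotational wait with the time needed to transfer the data. The 7200 rpm spindle speed, the 8 KB write size and the 50 MB/s media rate are assumptions chosen for illustration, and seek time is ignored entirely:

```python
def rotation_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def transfer_ms(size_kb, rate_mb_s):
    """Time to move size_kb kilobytes at rate_mb_s megabytes per second."""
    return size_kb / 1024.0 / rate_mb_s * 1000.0


# Assumed drive: 7200 rpm, 50 MB/s sustained media rate, 8 KB write.
full_turn = rotation_ms(7200)           # one rotation
avg_wait = full_turn / 2.0              # on average, half a rotation
write_time = transfer_ms(8, 50.0)       # time spent actually writing

print(f"rotation {full_turn:.2f} ms, average wait {avg_wait:.2f} ms, "
      f"transfer {write_time:.3f} ms")
```

Under these assumptions the wait for the platter is well over ten times the transfer time, which is why a synchronous write with the cache off is so much slower than the raw write itself.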

If PCs were designed well it wouldn't matter, because the capacitors
in the power supply can hold the power for a few hundred
milliseconds. That is enough time for the disk to write the data, but
the OS does not know that it must immediately stop queuing new data.

> SCSI actually is much smarter, because the tagged-queuing stuff lets
> the system have several writes in flight at once to the disk, with the
> disk notifying when they complete, so it can often hide the latency,
> without needing to rely on an unreliable write-cache (it's more-or-less
> the same trick that processors do to hide latency).  I think that SATA
> disks (and maybe just plain ATA ones) can do this now too.

I was thinking about one of the SCSI features that allows the
controller to map bad blocks to different locations on the disk without
notifying the OS about the change (because in the mid-90s the OS didn't do
anything useful with this information).
So even when the OS thinks that blocks are close together, a SCSI disk
might have to seek a long way to reach the block (killing the elevator
algorithm in the OS). I hope that this SCSI feature is now disabled by
default.

In this scenario we have hundreds of milliseconds in the worst case.

 
 
 

Post by Tim Bradsha » Tue, 27 Jun 2006 15:56:48



> If PCs were designed well it wouldn't matter, because the capacitors
> in the power supply can hold the power for a few hundred
> milliseconds. That is enough time for the disk to write the data, but
> the OS does not know that it must immediately stop queuing new data.

Well, as you say, the issue is that systems aren't well designed, or
cheap ones aren't! So to be safe on cheap systems you have to be
cautious, and Solaris is a bit more interested in being safe than Linux
typically is.  It's kind of the definition of a non-cheap system that
it deals with issues like this properly, so disks can have write caches
which can be used, memory has ECC &c &c...

> I was thinking about one of the SCSI features that allows the
> controller to map bad blocks to different locations on the disk without
> notifying the OS about the change (because in the mid-90s the OS didn't do
> anything useful with this information).
> So even when the OS thinks that blocks are close together, a SCSI disk
> might have to seek a long way to reach the block (killing the elevator
> algorithm in the OS). I hope that this SCSI feature is now disabled by
> default.

I think this is actually OK - if a small number of sectors are remapped
then the bad case will happen only very rarely, which is OK.  If a
*large* number are remapped, then chances are very high the disk is
about to die anyway :-)

--tim

 
 
 

Post by Tim Bradsha » Tue, 27 Jun 2006 15:59:37



> If we are talking about a logging file system like ext3, this is the
> difference between a log-replay restart and a FULL fsck restart, not
> counting the data that has vanished into the great bit bucket.

That should never be the case for a proper logging FS - writes should
always happen in a good order so that, so long as the disk commits the
writes in the order they're issued, the FS is in a good state, however
many might be lost after the last one committed.  Of course disks might
commit out of order I guess (and not tell you).
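
The ordering argument can be illustrated with a toy model in Python. This is only a sketch: the "file system" here is a hypothetical one whose single invariant is that a directory entry never points at a block that has not been written, and it is not how ext3, UFS logging or ZFS actually lay out their data:

```python
def is_consistent(disk):
    """Every directory entry must reference a block that is on disk."""
    return all(("block", name) in disk
               for kind, name in disk if kind == "entry")


def crash_after(writes, n):
    """Disk contents if power fails after the first n committed writes."""
    return set(writes[:n])


# Writes are issued in a safe order: data block first, then its entry.
issued = [("block", "a"), ("entry", "a"), ("block", "b"), ("entry", "b")]

# In-order commit: every prefix is consistent, wherever the power fails.
in_order_ok = all(is_consistent(crash_after(issued, n))
                  for n in range(len(issued) + 1))

# Out-of-order commit: the drive writes entry "a" before block "a".
reordered = [issued[1], issued[0], issued[2], issued[3]]
reordered_ok = all(is_consistent(crash_after(reordered, n))
                   for n in range(len(reordered) + 1))

print(in_order_ok, reordered_ok)  # True False
```

Losing the tail of the write stream only loses recent data; it is silent reordering inside the drive that can leave a dangling entry.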

--tim

 
 
 

Post by Casper H.S. Di » Tue, 27 Jun 2006 18:09:02




>> ZFS will now enable write caches if it owns the whole disk and it
>> will flush the cache at the appropriate moments.
>Hm, let's assume Solaris supports ZFS boot in the near future, i.e. you
>can install right to a ZFS /, /usr, /var etc. What about swap? You need
>a partition/slice for swap, so ZFS on such a system will never own the
>whole disk. So no whole disk = no write cache = lower performance?

On single-disk systems, yes; or you could enable the write cache yourself.
Surely, losing swap space is fairly uninteresting when the system goes
down.

>Or are there plans to integrate swap handling into ZFS somehow?
>Encrypted swap with zfs-crypto sounds nice, don't you think? ;)

ZFS already supports swap using zvols.  (zfs create ... -V size volume)

(Swapping the ZFS files is a bad idea because of the never overwrite)

Casper

 
 
 

Post by Casper H.S. Di » Tue, 27 Jun 2006 18:12:52





>>>If you want to forcibly turn write cache off you have to put:

>>>set ata:ata_write_cache = -1

>>>in /etc/system

>> Which has no effect in Solaris 8 or 9 or 10 (except perhaps some
>> updates) or on SPARC.
>Hmm, the following file:
>server:/mnt/platform/i86pc/kernel/drv# what ata
>ata:
>        SunOS 5.10 Generic_118844-26 November 2005
>defines the variable and the function. Maybe they are not connected in the
>source tree, but they definitely are there:
>server:/mnt/platform/i86pc/kernel/drv# nm ata | grep write_cache
>[164]   |     29471|     233|FUNC |LOCL |0    |1      |ata_set_write_cache
>[444]   |       624|       4|OBJT |GLOB |0    |4      |ata_write_cache
>This is extracted from the x86.miniroot (S10U1).

It's not in S10 FCS; it was added later (not sure when it was backported).

The reason for this change was fairly simple; generally, all IDE
disks come with write caches enabled so for the most part this
change is a no-op.  It is mostly relevant for a series of IDE disks
Sun shipped with the write cache disabled.

Now, why we thought this was a good change to make is beyond me.

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

Post by Rainer Ort » Tue, 27 Jun 2006 19:08:19



> ZFS already supports swap using zvols.  (zfs create ... -V size volume)

True, but you'll need the fix for CR 6405330, recently integrated into
Nevada.

        Rainer

--
-----------------------------------------------------------------------------
Rainer Orth, Faculty of Technology, Bielefeld University

 
 
 

Post by Stefan Krüge » Wed, 28 Jun 2006 02:22:37




>> Or are there plans to integrate swap handling into ZFS somehow?
>> Encrypted swap with zfs-crypto sounds nice, don't you think? ;)

> ZFS already supports swap using zvols.  (zfs create ... -V size volume)

And you can enable/add this with the following?

swap -a /dev/zvol/dsk/pool/swap

Is this done automagically after a reboot?

> (Swapping the ZFS files is a bad idea because of the never overwrite)

What does that mean? Is swap space created with, for example,

zfs create -V 1g pool/swap
swap -a /dev/zvol/dsk/pool/swap

a bad idea?

 
 
 

Post by Casper H.S. Di » Thu, 29 Jun 2006 18:08:54




>> Now, why we thought this was a good change to make is beyond me.
>It's not bad as long as an fsync inserts a write barrier and
>flushes the drive cache.
>But I expect that does not happen, so your point stands.

There is code which does that and it is certainly used by ZFS.
I'm not so sure about the other filesystems.
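
From the application side, the request being discussed looks roughly like the Python sketch below. Note the assumption: os.fsync() only asks the kernel to commit the file. Whether that request also flushes the drive's own write cache depends on the OS, file system and driver, which is exactly the open question here. The file name is a throwaway example:

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes and ask the OS to commit them before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to push the file to the device


# Usage: write one record into a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "record.dat")
    durable_write(target, b"important record\n")
    with open(target, "rb") as f:
        readback = f.read()
```

If the drive acknowledges the write from its volatile cache and the file system does not issue a cache flush, the call returns before the data is truly safe.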

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.