ext2 and fragmentation

ext2 and fragmentation

Post by Peter Kovac » Mon, 19 Oct 1998 04:00:00



I remember having heard somewhere that the ext2 file system uses an
allocation algorithm wich tries to avoid fragmention. Is it true?? If so,
how does it come that the file system checker (fck2ext ?? or whatever is the
standard utility) reports a fragmentation ratio of above 20%?

Thanks
Peter

 
 
 

ext2 and fragmentation

Post by Paul Flinder » Mon, 19 Oct 1998 04:00:00



> I remember having heard somewhere that the ext2 file system uses an
> allocation algorithm wich tries to avoid fragmention. Is it true?? If so,
> how does it come that the file system checker (fck2ext ?? or whatever is the
> standard utility) reports a fragmentation ratio of above 20%?

I believe that ext2fs uses an allocation strategy similar to the BSD fast
filing system (from which quite a few *nix file systems draw their
inspiration). The following description is from memory of how the BSD
scheme works - chances are it's wrong (I haven't read Leffler in years) and
it may not accurately describe ext2fs even if it were correct. Hopefully it
will be close enough to give you a rough idea (and also I expect someone
will be along to correct me if I'm too wide of the mark).

The file system is split into a number of block groups. Each block group
has space of some inodes and some file data blocks.

New directories are allocated in the block group with most space. Files are
initially allocated in the same block group as their directory and in the
biggest chunk of free space in the block. Blocks are allocated in that
block group until some maximum and then, if the file grows larger than that
figure blocks are allocated from another block group. The idea is to avoid
a single file "hogging" all of a block group.

This scheme resists fragmentation because it chooses the largest free block
wheninitially placing the file but it also tends to keep a directory, it's
files and their inodes close together on the disk (within a single block
group). In turn this means that when fragmentation does occur (which it
will, eventually as there's no guarentee that the free space that was used
for the first few blocks of the file will be able to hold the whole file,
especially if the disk is nearly full) the blocks used by a given file will
tend to be close together unless the file is large. In that case the file
will be spread out over the disk but in clumps - you will be able to read a
lot of data before moving the disk head very far so the cost of the big
seeks adds little to the time taken to read all of the file's data.

To summarise ext2fs resists fragmentation but can't prevent it entirely.
Perhaps more importantly the layout of files on disk is such that even when
it does occur the effects of fragmentation are reduced.

As for your 20% fragmentation either
  You have a few large files (which will always be spread across block
  groups and therefore always fragmented)

  You've been running the filesystem nearly full with a high rate of
  creation and deletion of files

however it shouldn't be hurting the disk performance as much as the same
degree of fragmentation on a (V)FAT file system.

If you want there is a defragger for ext2fs (or you could try the time
honoured method of copying all the files to another partition and then back
again - you can use cp -a or tar for this).

--
Paul

 
 
 

ext2 and fragmentation

Post by Daniel Kiracof » Mon, 19 Oct 1998 04:00:00


Quote:> > I remember having heard somewhere that the ext2 file system uses an
> > allocation algorithm wich tries to avoid fragmention. Is it true?? If so,
> > how does it come that the file system checker (fck2ext ?? or whatever is the
> > standard utility) reports a fragmentation ratio of above 20%?

> I believe that ext2fs uses an allocation strategy similar to the BSD fast
> filing system (from which quite a few *nix file systems draw their
> inspiration). The following description is from memory of how the BSD
> scheme works - chances are it's wrong (I haven't read Leffler in years) and
> it may not accurately describe ext2fs even if it were correct. Hopefully it
> will be close enough to give you a rough idea (and also I expect someone
> will be along to correct me if I'm too wide of the mark).

> The file system is split into a number of block groups. Each block group
> has space of some inodes and some file data blocks.

> New directories are allocated in the block group with most space. Files are
> initially allocated in the same block group as their directory and in the
> biggest chunk of free space in the block. Blocks are allocated in that
> block group until some maximum and then, if the file grows larger than that
> figure blocks are allocated from another block group. The idea is to avoid
> a single file "hogging" all of a block group.

 You forgot the most important part: preallocation.  When a block is
allocated (within the same block group if possible) the next 8 blocks
are also reserved (if they're free). This way data blocks are clustered
together in areas even small than a block group (which is pretty
large).  
When the file is closed, any reserved but unallocated blocks are
released. Preallocation fails when the filesystem is very very full.

--
/* Daniel */
WWW: http://users.gurulink.com/drk

"Fear is only afraid of the absence of itself" - Mediocrates

 
 
 

ext2 and fragmentation

Post by Kristian Koehnto » Mon, 19 Oct 1998 04:00:00



>New directories are allocated in the block group with most space. Files are
>initially allocated in the same block group as their directory and in the
>biggest chunk of free space in the block. Blocks are allocated in that

^^^^^^^^
  first

Quote:>block group until some maximum and then, if the file grows larger than that
>figure blocks are allocated from another block group.

  no maximum in ext2 and no "other" block group, but "next" instead.

Quote:>In that case the file
>will be spread out over the disk but in clumps - you will be able to read a
>lot of data before moving the disk head very far so the cost of the big
>seeks adds little to the time taken to read all of the file's data.

That is the central idea and it is true for ext2 as well as for
BSD. While fsck reports all "fragments", fragments above a
certain size are of little interest for the overall throughput
and the individual readers performance. Depending on your
hardware, the limit should be somewhere between 16 and 32 blocks
a piece.

Quote:>To summarise ext2fs resists fragmentation but can't prevent it entirely.

You cannot avoid fragmentation. No filesystem can do this. So
you could at least try and go for large fragments, which is what
both ext2 and BSD do.

Kristian
--
"Ich war schon immer der Meinung, dass es voellig korrekt ist, Metronet einen
 Provider zu nennen. Damit bringt man naemlich zum Ausdruck, dass Metronet zu
 einem Internet Service Provider zwei Dinge fehlen:
 das Internet und der Service." -- Adrian Suter