Performance on large directory


Post by Dan » Wed, 23 Apr 2003 10:01:30



I am looking for documentation on the performance impact of having a
large directory (30,000+ files, each between 50k and 150k in size) on
FreeBSD 4.7.

Is it true that a hash directory performs much better in this case? How much
better?
File access in this directory will be read access only most of the
time.

Thanks

Dan

 
 
 


Post by Simon Barne » Wed, 23 Apr 2003 20:26:28


Hi Dan,

> I  am looking for some documentations about performance impact on having
> large directory (a directory with 30,000+ files & each file with size from
> 50k - 150k) on a FreeBSD 4.7

> Is that true a hash directory in this case performs much better? How much
> better?
> File access in this directory will be "ready access" only in most of the
> time.

FreeBSD's UFS filesystem has an extra option called "dirhash", which
builds in-memory hash tables so that directory entries can be looked up
more quickly.

It has to be compiled into the kernel:
  options         UFS_DIRHASH

There is also a paper on the impact of the dir hash mechanism. It can be
found here: http://www.cnri.dit.ie/Downloads/fsopt.pdf
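
Dirhash works inside the kernel, so the application does not change. The
other common reading of "hash directory" is an application-level layout that
spreads files over subdirectories keyed by a hash of the file name. A minimal
sketch of that idea follows. The function names, the use of SHA-256 and the
one-level, two-character fan-out are all my own assumptions for illustration,
not something FreeBSD provides.

```python
import hashlib
from pathlib import Path


def hashed_path(root: str, filename: str, levels: int = 1, width: int = 2) -> Path:
    """Map a file name to root/<hash prefix>/filename (hypothetical layout)."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Take `levels` consecutive slices of `width` hex characters as subdirectory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, filename)


def store(root: str, filename: str, data: bytes) -> Path:
    """Write data under the hashed location, creating subdirectories as needed."""
    target = hashed_path(root, filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

With the defaults there are 256 possible subdirectories, so 30,000 files would
average roughly 120 entries per directory. Whether that beats a single
directory with dirhash enabled is something to measure rather than assume.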

Cheers,
 Simon

 
 
 


Post by Helmut Schellon » Wed, 23 Apr 2003 21:35:18



> Hi Dan,

>>I  am looking for some documentations about performance impact on having
>>large directory (a directory with 30,000+ files & each file with size from
>>50k - 150k) on a FreeBSD 4.7

>>Is that true a hash directory in this case performs much better? How much
>>better?

Directory operations per second:

  OpenUnix8   2781000
  FreeBSD     3793964
  Linux       1859000

--
With kind regards


http://www.wikiservice.at/dse/wiki.cgi?FreeBSD

 
 
 


Post by Vladimir V Egori » Wed, 23 Apr 2003 23:20:10




>> Hi Dan,

>>>I  am looking for some documentations about performance impact on having
>>>large directory (a directory with 30,000+ files & each file with size from
>>>50k - 150k) on a FreeBSD 4.7

>>>Is that true a hash directory in this case performs much better? How much
>>>better?

> Directory Operations per second            2781000 3793964 1859000
>                                      (OpenUnix8, FreeBSD, Linux)

You didn't mention which filesystems were compared.

--
Vladimir

 
 
 


Post by Helmut Schellon » Thu, 24 Apr 2003 04:58:56




...
>>>>Is that true a hash directory in this case performs much better? How much
>>>>better?

>>Directory Operations per second         2781000 3793964 1859000
>>                                     (OpenUnix8, FreeBSD, Linux)

> You didn't mention which filesystems are compared.

The default on each system: HTFS, UFS+S, EXT2

http://www.wikiservice.at/dse/wiki.cgi?FreeBSD/BenchMarks

--
With kind regards


http://www.wikiservice.at/dse/wiki.cgi?FreeBSD

 
 
 


Post by Vladimir V Egori » Thu, 24 Apr 2003 05:04:29





> ...
>>>>>Is that true a hash directory in this case performs much better? How much
>>>>>better?

>>>Directory Operations per second             2781000 3793964 1859000
>>>                                     (OpenUnix8, FreeBSD, Linux)

>> You didn't mention which filesystems are compared.

> Default: HTFS, UFS+S, EXT2

> http://www.wikiservice.at/dse/wiki.cgi?FreeBSD/BenchMarks

Thank you.

--
Vladimir

 
 
 


Post by Peter W » Thu, 24 Apr 2003 05:49:32




    Helmut> ...
    >>>>> Is that true a hash directory in this case performs much better? How much
    >>>>> better?
    >>>
    >>> Directory Operations per second                2781000 3793964 1859000
    >>> (OpenUnix8, FreeBSD, Linux)
    >> You didn't mention which filesystems are compared.

    Helmut> Default: HTFS, UFS+S, EXT2
                                  ^^^^<-- Any benchmarking with EXT3?

--
Peter Wu
Powered by Microsoft Windows XP [Version 5.1.2600]

 
 
 


Post by Ivan Vora » Thu, 24 Apr 2003 06:42:47



>> Directory Operations per second 2781000 3793964 1859000
>>                               (OpenUnix8, FreeBSD, Linux)

Which benchmark program did you use? (Just curious...)

--
You can accomplish anything you set your mind to. The impossible just takes
a little longer.

 
 
 


Post by Helmut Schellon » Thu, 24 Apr 2003 07:30:44






>     Helmut> ...
>     >>>>> Is that true a hash directory in this case performs much better? How much
>     >>>>> better?

>     >>> Directory Operations per second           2781000 3793964 1859000
>     >>> (OpenUnix8, FreeBSD, Linux)
>     >> You didn't mention which filesystems are compared.

>     Helmut> Default: HTFS, UFS+S, EXT2
>                                   ^^^^<-- Any benchmarking with EXT3?

First, a correction: that should be VXFS, not HTFS.

EXT3: No.
The hardware configuration has changed and the old test disk is no
longer available, so a comparison is not possible.

--
With kind regards


http://www.wikiservice.at/dse/wiki.cgi?FreeBSD

 
 
 


Post by Helmut Schellon » Thu, 24 Apr 2003 07:33:34




>>>Directory Operations per second 2781000 3793964 1859000
>>>                              (OpenUnix8, FreeBSD, Linux)

> Which benchmark program did you use? (Just curious...)

The AIM benchmark from Caldera, released under the GPL.
See the URL below.
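
For anyone who wants a quick number for their own directory size without
setting up AIM, a rough sketch follows. This is not the AIM suite and it does
not reproduce the figures above. The function name is hypothetical, and since
the files are created just before they are looked up, it mostly measures
lookups with warm caches.

```python
import os
import tempfile
import time


def time_lookups(n_files: int) -> float:
    """Create n_files empty files in one directory; return seconds per os.stat() call."""
    with tempfile.TemporaryDirectory() as d:
        names = [os.path.join(d, f"file{i:06d}") for i in range(n_files)]
        for name in names:
            # Create an empty file; only the directory entry matters here.
            with open(name, "wb"):
                pass
        start = time.perf_counter()
        for name in names:
            os.stat(name)  # forces a name lookup in the directory
        elapsed = time.perf_counter() - start
    return elapsed / n_files
```

Comparing time_lookups(1000) with time_lookups(30000) on the target
filesystem, with and without UFS_DIRHASH, gives a first impression of how the
per-file cost grows with directory size.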

--
With kind regards


http://www.wikiservice.at/dse/wiki.cgi?FreeBSD

 
 
 


Post by Bill Vermilli » Thu, 24 Apr 2003 11:27:22




>I am looking for some documentations about performance impact on
>having large directory (a directory with 30,000+ files & each
>file with size from 50k - 150k) on a FreeBSD 4.7
>Is that true a hash directory in this case performs much better?
>How much better? File access in this directory will be "ready
>access" only in most of the time.

Besides the other comments, you might want to mount that file
system with the noatime flag.  If it's read-only and has lots of
users, do you really need to track access times?  It's just another
operation you don't need in most instances.

Bill

--