Centuries ago, Nostradamus foresaw a time when Lincoln Yeoh would say:
>I've heard that ext2 fs becomes less efficient if there are tons of files
>in a directory.
>OK what if I have lots of files. How should they be split?
>Or by 200s? 500s? or 1000s?
>Basically how much time does it take to change one directory level, vs scan
>through 100 files. How flat should the "pyramid" be.
>I'll probably consider other file systems in the future (they have to be
>fast, cheap, reliable, robust and SMP safe). But meanwhile I'm sticking
I think I'd go with the 100 option.
It has the merit that you can go into the directory, type "ls," and
get a list of files/directories that is not so large that it has to occupy
more than a screenful.
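A minimal sketch of that split, in Python. The `shard_path` helper and the two-level d##/f### naming are illustrative choices modeled on the /opt/d00/f210 example later in this post, not a prescribed layout:

```python
# One possible "100 files per directory" scheme: file number n lands in
# directory d(n // 100), with zero-padded names throughout.

def shard_path(n, per_dir=100):
    """Return a two-level relative path for file number n."""
    return "d%02d/f%03d" % (n // per_dir, n)

print(shard_path(210))   # -> d02/f210
print(shard_path(5))     # -> d00/f005
```

With 100 entries per directory, two levels of this cover 10,000 files while keeping every `ls` to a screenful or so.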
A couple other thoughts:
a) Use leading zeros so that these encoded filenames are of uniform
length.
For instance, /opt/d00, /opt/d01, ... /opt/d98, /opt/d99
Uniform lengths means that you can do matches via more specific
expressions that can be safer and possibly faster.
"ls /opt/d[0-9] /opt/d[0-9][0-9]"
is not as good as
"ls /opt/d[0-9][0-9]"
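A quick sketch of why the uniform-length names match more cleanly, using Python's `fnmatch` (the same pattern language the shell applies to the `ls` arguments above); the name lists are made up for the demo:

```python
from fnmatch import fnmatch

padded   = ["d%02d" % i for i in range(100)]   # d00 .. d99
unpadded = ["d%d"   % i for i in range(100)]   # d0, d1, .. d99

# With leading zeros, one two-character pattern covers every name...
assert all(fnmatch(name, "d[0-9][0-9]") for name in padded)

# ...without them, the same pattern misses the single-digit names,
# forcing the looser two-pattern form.
assert not fnmatch("d7", "d[0-9][0-9]")
```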
b) If this stuff is cryptic, there's no merit in having long filenames.
/opt/d00/f210 is more compact than /opt/d00/file210, and is no less
meaningful.
c) Be prepared to do a benchmark based on using 100 files/directory
as well as 1000 files/directory. That's likely the most relevant
comparison. You're not likely to see *great* benefit in moving from
100 files/directory to some "perfect sweet spot" of 345/directory.
d) Consider using hexadecimal values in the encoding, or, if you want
"several hundred" files per directory, the option of transforming to
"base 36," where you combine the 10 digits 0..9 with the 26 letters a..z
to give you [10+26] * [10+26] = 1296 combinations as the limit in two
characters.
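A sketch of that base-36 encoding; the function name and range check are my own additions:

```python
import string

# 0-9 then a-z: 36 symbols in all.
DIGITS = string.digits + string.ascii_lowercase

def base36_2(n):
    """Encode 0 <= n < 1296 as a two-character base-36 directory name."""
    if not 0 <= n < 36 * 36:
        raise ValueError("out of range for two base-36 characters")
    return DIGITS[n // 36] + DIGITS[n % 36]

print(base36_2(0))     # -> 00
print(base36_2(1295))  # -> zz
```

Two hex characters would cap you at 256 directories; base 36 stretches the same two-character names to 1296 while staying safe for case-insensitive filesystems (which is why A-Z is left out).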
Small filenames are going to be more efficient to work with both in your
code and within the kernel's support for the filesystem.
--Kill Running Inferiors--