limit to number of files in a directory?

Post by Morga » Thu, 22 Jan 1998 04:00:00



MS-DOS and Windows have practical limits (maybe absolute limits too) on the
number of files in a single subdirectory. In DOS, if you have more than a
few thousand files, file access crawls.

Does Unix have similar limitations?

The real-world problem I'm trying to solve is that I have about 10,000 small
files, between 2 and 20K in size. Each file contains ASCII text data.

I have a Perl script which searches through these files for particular
keywords.

I'm wondering if it would be better to have all these files in a single
subdirectory, or multiple subdirectories.

The files must remain separated - I can't combine them into a single file.

I'm using Solaris 2.5 and BSDI.

 
 
 

limit to number of files in a directory?

Post by Alexander Johannese » Thu, 22 Jan 1998 04:00:00


[snipped a bit]

Quote:> I'm wondering if it would be better to have all these files in a single
> subdirectory, or multiple subdirectories.
> The files must remain separated - I can't combine them into a single
> file.

How are these files named? One of my systems has got more than
50,000 files, each put in various directories by some
sorting criteria. Can your files _be_ separated? What is in
them?

In DOS, the system will slow down _dramatically_ if you've
got more than 200 files in 1 sub-directory. I don't think
UNIX has the same thing, although, in the end, any system
would crawl under millions of files ... :)

Alexander

--
________________________________________________________

  Life is not a mystery to solve, but a puzzle to play
________________________________________________________


         http://home.powertech.no/alexjo/

 
 
 

limit to number of files in a directory?

Post by Chris Waters » Thu, 22 Jan 1998 04:00:00




> [snipped a bit]
> > I'm wondering if it would be better to have all these files in a single
> > subdirectory, or multiple subdirectories.
> > The files must remain separated - I can't combine them into a single
> > file.

[...]

Quote:> In DOS, the system will slow down _dramatically_ if you've
> got more than 200 files in 1 sub-directory. I don't think
> UNIX has the same thing, although, in the end, any system
> would crawl under millions of files ... :)

I've never seen the effect you mention in DOS.  There is an absolute
limit on how many files/subdirs you can have in the *root* of a FAT
system, and I think it's about 256.  But that's all beside the point.

It helps to think of a directory as a file.  Which it is in Unix
(older versions even let you open and read directories *as* files),
and which it is for all practical purposes even in DOS.  And the
information in the directory-file is unordered.  Which means that you
will have to perform a linear search to match a filename in the
directory when you open a (regular) file.  Thus, your performance
issues will involve seek time and search time.  And they can be very
real -- I know of an ISP which had major problems when their user base
expanded, and the search times for /var/spool/mail became too slow to
keep up with the volume of incoming mail.  Just doing a simple
"ls /var/spool/mail/username" took over a minute!

It's probably safer to use multiple subdirectories, if you suspect
that there may be filename lookup time issues.  It's unlikely to hurt,
and may well help.  It uses a little more disk space, but it's
probably going to be worth it.
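
Splitting things up also doesn't have to complicate the keyword search
much.  A rough Perl sketch (untested; the "data" directory and the keyword
"foo" are just placeholders) that recurses with the standard File::Find
module:

  #!/usr/bin/perl -w
  # Rough sketch: recursive keyword search over a tree of small files.
  # File::Find ships with Perl; "data" and "foo" are placeholders.
  use strict;
  use File::Find;

  my $keyword = 'foo';

  sub wanted {
      my $file = $_;                        # basename; find() chdirs for us
      return unless -f $file;               # only look inside regular files
      open(FH, "<$file") or return;
      while (my $line = <FH>) {
          if ($line =~ /\Q$keyword\E/) {    # \Q quotes regex metacharacters
              print "$File::Find::name\n";  # full path of the matching file
              last;
          }
      }
      close(FH);
  }

  find(\&wanted, 'data');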
--
Chris Waters             | The real problem with the year 2000 is



 
 
 

limit to number of files in a directory?

Post by David Williams » Fri, 23 Jan 1998 04:00:00




Quote:>MS-DOS and Windows have practical limits (maybe absolute limits too) on the
>number of files in a single subdirectory. In DOS, if you have more than a
>few thousand files, file access crawls.

>Does Unix have similar limitations?

>The real-world problem I'm trying to solve is that I have about 10,000 small
>files, between 2 and 20K in size. Each file contains ASCII text data.

  Performance does crawl eventually as indirect pointers and even double
  indirect pointers start getting used. On a directory with 40,000 files
  an ls -l > /tmp/myfile takes >1 minute!!

  Try what terminfo does to get around this:-

  ../a/a1
  ../a/a2
  ../b/b2
  ../c/c2

   etc.
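
  A rough Perl sketch of that kind of split (untested; "data" is a
  placeholder for the real directory, and the first character of each
  filename picks the bucket):

  #!/usr/bin/perl -w
  # Rough sketch: move each file into a one-character subdirectory,
  # terminfo-style.  "data" is a placeholder name.
  use strict;

  my $dir = 'data';
  opendir(DIR, $dir) or die "can't open $dir: $!";
  my @files = grep { -f "$dir/$_" } readdir(DIR);
  closedir(DIR);

  foreach my $name (@files) {
      my $bucket = lc substr($name, 0, 1);    # first character picks the subdir
      my $subdir = "$dir/$bucket";
      mkdir($subdir, 0755) unless -d $subdir;
      rename("$dir/$name", "$subdir/$name")   # rename() works within one filesystem
          or warn "couldn't move $name: $!\n";
  }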

Quote:>I have a Perl script which searches through these files for particular
>keywords.

>I'm wondering if it would be better to have all these files in a single
>subdirectory, or multiple subdirectories.

>The files must remain separated - I can't combine them into a single file.

>I'm using Solaris 2.5 and BSDI.

--
David Williams

Maintainer of the Informix FAQ
 Primary site (Beta Version)  http://www.smooth1.demon.co.uk
 Official site                http://www.iiug.org/techinfo/faq/faq_top.html

I see you standin', Standin' on your own, It's such a lonely place for you, For
you to be If you need a shoulder, Or if you need a friend, I'll be here
standing, Until the bitter end...
So don't chastise me Or think I, I mean you harm...
All I ever wanted Was for you To know that I care

 
 
 

limit to number of files in a directory?

Post by Doug Siebert » Fri, 23 Jan 1998 04:00:00



>  Performance does crawl eventually as indirect pointers and even double
>  indirect pointers start getting used. On a directory with 40,000 files
>  an ls -l > /tmp/myfile takes >1 minute!!

That crawls because the "-l" argument to ls causes a stat() of each and
every file in that directory.  No matter how you slice it, 40,000 stat()
system calls is going to be painful.  Also remember that by default ls will
sort the directory itself, which will get a little slower as you increase
the size (be glad they aren't using a bubble sort ;) ).  If you did "ls -f",
I'll bet it'd be pretty snappy: no sorting or stat()ing required.

The big slowdown for Unix (most Unixes, at least) is because the directory
is a linear flat file.  If you compare a program trying to do an open() of
a nonexistent file in a directory with 40K files versus 40 files, you'll
find there is a huge increase in the time required (do it in a loop so you
can see it better).  A few Unixes use a B-tree for directory layout; this
helps immensely, and the penalties for it are quite small (but doing an
"ls -l" is still going to be painful, as you still do 40,000 stat() calls
in the big directory).
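
A rough way to see this for yourself, sketched in Perl (untested; the
directory names below are placeholders for a small and a big test
directory):

  #!/usr/bin/perl -w
  # Rough sketch: time repeated failed opens (pure name lookups) in a
  # small directory versus a big one.  Directory names are placeholders.
  use strict;

  foreach my $dir ('smalldir', 'bigdir') {
      my $start = time;
      for (my $i = 0; $i < 10_000; $i++) {
          # the open is expected to fail; only the name lookup matters
          open(FH, "<$dir/no-such-file") and close(FH);
      }
      printf("%-10s %d seconds for 10000 failed opens\n", $dir, time - $start);
  }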

Anyone know offhand which major Unixes use B-trees for the directory (and in
which filesystems, since most have multiple filesystem types available now)?

--
Douglas Siebert                Director of Computing Facilities

If you let the system beat you long enough, eventually it'll get tired.

 
 
 

limit to number of files in a directory?

Post by Elias Martenson » Fri, 23 Jan 1998 04:00:00



> Anyone know offhand which major Unixes use B-trees for the directory (and in
> which filesystems, since most have multiple filesystem types available now)?

Veritas has a product called "Veritas File System", or "VxFS" for
short, that hashes directory entries. Really fast.

--
Elias Martenson
elias.martenson (atsign) sweden.sun.com

 
 
 

limit to number of files in a directory?

Post by Doug Reilan » Fri, 23 Jan 1998 04:00:00



> MS-DOS and Windows have practical limits (maybe absolute limits too) on the
> number of files in a single subdirectory. In DOS, if you have more than a
> few thousand files, file access crawls.

> Does Unix have similar limitations?

> The real-world problem I'm trying to solve is that I have about 10,000 small
> files, between 2 and 20K in size. Each file contains ASCII text data.

> I have a Perl script which searches through these files for particular
> keywords.

> I'm wondering if it would be better to have all these files in a single
> subdirectory, or multiple subdirectories.

> The files must remain separated - I can't combine them into a single file.

> I'm using Solaris 2.5 and BSDI.

This really depends on the filesystem. I have seen different behavior
using different filesystem types (ufs, vxfs, s5) in the same OS.
However, it does not take a lot of effort to think of scenarios where
having everything dumped into one directory would be less efficient. I
have no knowledge of Solaris 2.5 and BSDI, but if I were implementing
this, I would use different subdirs.

Good Luck,
Doug

 
 
 

limit to number of files in a directory?

Post by Richard Tob » Sat, 24 Jan 1998 04:00:00



>The big slowdown for Unix (most Unixes, at least) is because the directory
>is a linear flat file.

4.3BSD introduced an optimisation for the common case of accessing the
files in a directory sequentially: instead of starting from the top,
lookup picks up from where it left off last time.  Of course, it doesn't
help unless you're going through the directory in order.

-- Richard
--
Because of all the junk e-mail I receive, all e-mail from .com sites is
automatically sent to a file which I only rarely check.  If you want to mail
me from a .com site, please ensure my surname appears in the headers.

 
 
 

limit to number of files in a directory?

Post by Villy Kru » Sat, 24 Jan 1998 04:00:00



>MS-DOS and Windows have practical limits (maybe absolute limits too) on the
>number of files in a single subdirectory. In DOS, if you have more than a
>few thousand files, file access crawls.
>Does Unix have similar limitations?
>The real-world problem I'm trying to solve is that I have about 10,000 small
>files, between 2 and 20K in size. Each file contains ASCII text data.
>I have a Perl script which searches through these files for particular
>keywords.
>I'm wondering if it would be better to have all these files in a single
>subdirectory, or multiple subdirectories.
>The files must remain separated - I can't combine them into a single file.
>I'm using Solaris 2.5 and BSDI.

The limit is the number of inodes you allocated space for when you created
the file system.  This number is the total number of files on the entire
file system.  The directories themselves have no limit as such; it just
gets slow.

This, of course, is for traditional Unix.  Others might have implemented
something different.

Villy

 
 
 

limit to number of files in a directory?

Post by Bill Vermilli » Mon, 26 Jan 1998 04:00:00





>[snipped a bit]
>> I'm wondering if it would be better to have all these files in a single
>> subdirectory, or multiple subdirectories.
>> The files must remain separated - I can't combine them into a single
>file.
>How are these files named? One of my systems has got more than
>50.000 files, each put in various directories by some
>sort-criterias. Can your files _be_ seperated? What is in
>them?
>In DOS, the system will slow down _dramatically_ if you've
>got more than 200 files in 1 sub-directory. I don't think
>UNIX has the same thing, although, in the end, any system
>would crawl under millions of files ... :)

Not necessarily true.   IRIX 6.X has been tested with 16 million
files in one directory with no significant performance hits.

The older Unix file systems will start showing performance hits
from a few hundred to a few thousand files, depending on the FS
implementation.   Those FSes are becoming more obsolete with each
passing day.

--

(Remove the anti-spam section from the address on a mail reply)

 
 
 

limit to number of files in a directory?

Post by Bill Vermilli » Mon, 26 Jan 1998 04:00:00




>The limit is the number of inodes you allocated space for when you created
>the file system.  This number is the total number of files on the entire
>file system.  The directories themselves has no limit as such; it just
>gets slow.

Some of the more modern Unix implementations are dynamic with
regard to inodes.   You are not limited to the number set when the
file system was created.

--

(Remove the anti-spam section from the address on a mail reply)

 
 
 

limit to number of files in a directory?

Post by Kuntal M. Daftar » Tue, 27 Jan 1998 04:00:00



> >The limit is the number of inodes you allocated space for when you created
> >the file system.  This number is the total number of files on the entire
> >file system.  The directories themselves has no limit as such; it just
> >gets slow.

        Does anyone know an approximate figure for when Solaris 2.5.1 on a
        SPARC Ultra 1 starts crawling?
 
 
 

limit to number of files in a directory?

Post by Robert S. Sciuk » Thu, 29 Jan 1998 04:00:00




>    does anyone know an approx figure when solaris2.5.1 on sparc ultra 1
>    starts crawling?

I believe that Unix directory entries are searched sequentially during
lookups ... big directories mean long search times, slow opens.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Robert S. Sciuk         1032 Howard Rd.                 Ph:905 632-2466
Control-Q Research      Burlington, Ont. Canada         Fx:905 632-7417

 
 
 

limit to number of files in a directory?

Post by Jon LaBad » Fri, 30 Jan 1998 04:00:00




|>

|> >
|> >      does anyone know an approx figure when solaris2.5.1 on sparc ultra 1
|> >      starts crawling?
|>
|> I believe that Unix inodes are referenced sequentially in directory
|> searches ... big directories mean long search times, slow opens.

Back in the bad old days when directory entries were fixed size
(16 bytes) and the file system was s5 rather than ufs, the rule of
thumb was: for fastest access, keep the directory in one block.
But if you stayed within the 10 direct blocks of the directory
inode, access was not too bad (given linear search).  When indirect
blocks were needed for the directory, things went bad.

Given a 2K block size this works out to:

  128 entries in the first directory block (2048 bytes / 16 bytes per entry)
  1280 entries in the direct blocks (10 blocks x 128 entries)

No idea how this relates to the current variable directory entry
size and ufs.

jl
--


 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)

 
 
 

limit to number of files in a directory?

Post by Jon Andrews » Thu, 05 Feb 1998 04:00:00




:
: > >file system.  The directories themselves has no limit as such; it just
: > >gets slow.
:
:       does anyone know an approx figure when solaris2.5.1 on sparc ultra 1
:       starts crawling?
:
I ran some tests on a Pyramid NILE with 2 GB of RAM and 20 CPUs; I seem to
remember file lookups through the directory started to become unacceptable
above 10,000 files.
--
Regards
-Jon-

  Jon Andrews                                   phone: +44 171 888 4189
  DTS Unix Systems Engineering                  fax:   +44 171 888 3924
  Credit Suisse First Boston                                      


 
 
 
