file descriptors left open in fork()

file descriptors left open in fork()

Post by phil-news-nos.. » Sun, 25 Feb 2001 09:58:34



If a process has many file descriptors open, and forks a child
process for some long term purpose, the child needs to close
those file descriptors it is not using.  If it does not, those
file descriptors effectively stay open when the parent closes
them for real purposes.  For pipes to other child processes,
for example, the pipe will not deliver an end-of-file.  For network
connections, the connection will not actually be torn down on this
machine's side.
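
To make the pipe case concrete, here is a minimal sketch (the
program is invented purely for illustration):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* A pipe reader only sees EOF when *every* copy of the write end
 * is closed.  A forked child that silently inherits the write end
 * keeps the pipe alive after the parent closes it. */
int main(void)
{
    int p[2];
    char buf[16];
    long n;

    if (pipe(p) == -1) {
        perror("pipe");
        exit(1);
    }

    if (fork() == 0) {   /* long-lived child: does nothing, but */
        sleep(10);       /* silently holds both pipe fds open   */
        _exit(0);
    }

    close(p[1]);         /* parent closes its write end... */

    /* ...yet read() blocks for ~10 seconds instead of returning
     * 0 (EOF) at once, because the child still holds p[1] open. */
    n = (long) read(p[0], buf, sizeof buf);
    printf("read returned %ld\n", n);
    return 0;
}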

Obviously, what needs to happen right after fork(), in the child
process, is to close all file descriptors that the child does not
need to have open.  But what file descriptors should be closed?

It might seem that the obvious answer is to keep track of all
that are open, and use that list to close them all.  But this
might not always be possible.  Many library functions that do
things involving an open file descriptor held open between calls
could keep the fd number hidden.  Then the main program cannot
really keep track of all the file descriptors that are open for
the child to close.

Another approach is to simply attempt the close() call on every
possible file descriptor that could exist, skipping the ones the
child actually needs.  But to do this correctly, there needs to
be an accurate and portable way to know the whole range.  This
also might use a lot of resources, especially if the range is large.
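
In code, that sweep might look something like this (a sketch; the
function name and keep-list interface are invented):

#include <sys/resource.h>
#include <unistd.h>

/* Close every fd up to the current limit, skipping a keep-list.
 * Portable, but costs one close() call per possible descriptor. */
void close_most_fds(const int *keep, int nkeep)
{
    struct rlimit rl;
    int fd, i, max = 1024;    /* fallback if getrlimit() fails */

    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY)
        max = (int) rl.rlim_cur;

    for (fd = 0; fd < max; fd++) {
        for (i = 0; i < nkeep; i++)
            if (keep[i] == fd)
                break;
        if (i == nkeep)
            close(fd);        /* harmless EBADF for never-opened fds */
    }
}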

Quote:> One thing to watch out for is really high limits.  In the next Solaris
> release, we will have a routine that uses /proc/self/fd to close
> all open fds.  This was done because we want to really increase the limit;
> calling close()  thousands of times is not going to be efficient.
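
That /proc-based technique looks roughly like the following sketch
(assuming a Linux- or Solaris-style /proc; the function name is
invented):

#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Close every open fd >= lowfd by reading /proc/self/fd, instead
 * of sweeping the whole possible range.  The directory stream's
 * own fd must be skipped while iterating. */
void close_open_fds(int lowfd)
{
    DIR *d = opendir("/proc/self/fd");
    struct dirent *e;

    if (d == NULL)
        return;    /* no /proc here; fall back to a range sweep */

    while ((e = readdir(d)) != NULL) {
        int fd;
        if (e->d_name[0] < '0' || e->d_name[0] > '9')
            continue;                 /* skip "." and ".." */
        fd = atoi(e->d_name);
        if (fd >= lowfd && fd != dirfd(d))
            close(fd);
    }
    closedir(d);
}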

So I can see a critical issue brewing.  I might be tempted to
suggest a new kernel syscall:

close_all_except( ..., -1 );

which would close all file descriptors not listed in the given
list of file descriptors.  Another approach might be to create
a "close on fork" concept, but that would only be useful if
that is the default (else you end up having to set the flag
on all those unknown fds).  I don't think it would be easy to
get a new syscall like that in place: even if everyone agreed on
it today, it would still be years before we could count on it
being available in most places.  The real solution will have to
be something in the here and now.

I've already seen discussions on how to find the range of all
file descriptors, and can see more than one reason to need to
do this.  For example a program I recently wrote will output
a stream of web log data into multitudes of separate files based
on many factors.  It has to open many files, and in many cases
the files are written to very often.  So it tries to keep the
file descriptors open.  It also tries to close them after some
amount of time, and that amount of time diminishes as the number
that are open approaches the maximum number.  So it needs to
know the maximum number to do this calculation.  As it turns
out, I made this number itself dynamic: it starts at a configured
value, is adjusted very slowly upwards, and is adjusted back down
when an open fails for lack of resources (not just running out of
file descriptors, but also issues like running out of kernel memory).
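
The bookkeeping amounts to something like this sketch (names and
tuning constants are invented):

#include <errno.h>

#define BASE_IDLE_SECS 300        /* invented tuning constant */

/* Soft cap on open fds: starts at a configured value, creeps
 * upward slowly, and backs off when an open fails for lack of
 * resources (fds or kernel memory). */
static int fd_soft_max = 256;

/* Idle fds are closed sooner as the open count nears the cap. */
int idle_timeout_secs(int nopen)
{
    if (nopen >= fd_soft_max)
        return 0;
    return BASE_IDLE_SECS * (fd_soft_max - nopen) / fd_soft_max;
}

void fd_cap_tick(void)            /* call occasionally, e.g. once a minute */
{
    fd_soft_max++;
}

void fd_cap_on_open_failure(int err)
{
    if ((err == EMFILE || err == ENFILE || err == ENOMEM) && fd_soft_max > 16)
        fd_soft_max -= fd_soft_max / 8;     /* back off harder than we grow */
}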

But at the moment I'm looking at the issue of "left open" file
descriptors in forked child processes, and trying to examine
what would be the best approach to solve it.  In particular,
I want to address it not just in terms of writing the
parent/child forking code, but also in terms of writing library
code that does the forking for the caller, or library code
that has to open (and maybe hide) file descriptors for other
purposes as well as for forking purposes (e.g. the library
makes a pipe and forks a child on behalf of the main program,
when the programmer shouldn't even need to be aware that the
library has an open fd).

Suggestions?
Experiences?
Comments?

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 

file descriptors left open in fork()

Post by Derek M. Fly » Sun, 25 Feb 2001 15:28:08



> If a process has many file descriptors open, and forks a child
> process for some long term purpose, the child needs to close
> those file descriptors it is not using.  If it does not, those
> file descriptors effectively stay open when the parent closes
> them for real purposes.  For pipes to other child processes,
> for example, the pipe will not deliver an end-of-file.  For network
> connections, the connection will not actually be torn down on this
> machine's side.

> Obviously, what needs to happen right after fork(), in the child
> process, is to close all file descriptors that the child does not
> need to have open.  But what file descriptors should be closed?

> It might seem that the obvious answer is to keep track of all
> that are open, and use that list to close them all.  But this
> might not always be possible.  Many library functions that do
> things involving an open file descriptor held open between calls
> could keep the fd number hidden.  Then the main program cannot
> really keep track of all the file descriptors that are open for
> the child to close.
...
> So I can see a critical issue brewing.  I might be tempted to
> suggest a new kernel syscall:

> close_all_except( ..., -1 );

This just doesn't work.  You cannot go around closing file descriptors just
because they aren't ones that you recognize.  You will shoot yourself in the
foot.  Example: What do you do about file descriptors used for dynamic
linking?

The reason why you can't find a good way of dealing with this problem is
simple: You're not supposed to care about things like this.  Just make
sure you have resource limits that allow for at least as many file
descriptors as you need, and close them when you're done with them.  Your
libraries should do the same (and at least set the close-on-exec flag so
your children don't have to deal with fd leaks).
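
For instance, a library can wrap its opens like this (a sketch;
the wrapper name is invented).  Note that close-on-exec only
protects children that go on to exec():

#include <fcntl.h>
#include <unistd.h>

/* Open a file and mark it close-on-exec, so at least fork()+exec()
 * children don't inherit it.  (A plain fork() still does.) */
int open_cloexec(const char *path, int oflag)
{
    int fd = open(path, oflag);
    int flags;

    if (fd != -1) {
        flags = fcntl(fd, F_GETFD);
        if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
            close(fd);
            return -1;
        }
    }
    return fd;
}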

 
 
 

file descriptors left open in fork()

Post by phil-news-nos.. » Mon, 26 Feb 2001 08:21:13



|> So I can see a critical issue brewing.  I might be tempted to
|> suggest a new kernel syscall:
|>
|> close_all_except( ..., -1 );
|
| This just doesn't work.  You cannot go around closing file descriptors just
| because they aren't ones that you recognize.  You will shoot yourself in the
| foot.  Example: What do you do about file descriptors used for dynamic
| linking?

I haven't encountered a system like that, yet.  But I guess there could
be one.

| The reason why you can't find a good way of dealing with this problem is
| simple: You're not supposed to care about things like this.  Just make
| sure you have resource limits that allow for at least as many file
| descriptors as you need, and close them when you're done with them.  Your
| libraries should do the same (and at least set the close-on-exec flag so
| your children don't have to deal with fd leaks).

This is not an issue about running out of file descriptors.  It is
an issue of contention as a result of sharing file descriptors in
a child process.

Consider that the main process is a special monitoring daemon that
has an HTTP interface.  It receives connections from web browsers
and performs some kinds of services based on data it keeps in real
time.  Periodically it needs to fork a child process to make backups
of the data (because the file I/O can block).  That child will then
inherit every open file descriptor.  When the parent closes one to
conclude that HTTP transaction, the fact that the child still has an
open file descriptor means it won't really close.

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 

file descriptors left open in fork()

Post by Derek M. Fly » Mon, 26 Feb 2001 09:35:57




> |> So I can see a critical issue brewing.  I might be tempted to
> |> suggest a new kernel syscall:
> |>
> |> close_all_except( ..., -1 );
> |
> | This just doesn't work.  You cannot go around closing file descriptors just
> | because they aren't ones that you recognize.  You will shoot yourself in the
> | foot.  Example: What do you do about file descriptors used for dynamic
> | linking?

> I haven't encountered a system like that, yet.  But I guess there could
> be one.

Every system that I have seen closes the file after mmapping it, but it's
just an example of a place where the system might sneak file descriptor
usage in behind your back.

Quote:> | The reason why you can't find a good way of dealing with this problem is
> | simple: You're not supposed to care about things like this.
...
> This is not an issue about running out of file descriptors.  It is
> an issue of contention as a result of sharing file descriptors in
> a child process.
...
> When the parent closes one to
> conclude that HTTP transaction, the fact that the child still has an
> open file descriptor means it won't really close.

Regardless, closing a file when you don't understand why it's open is a bad
idea.  I would use an fd_set for each of these child roles, and as you open
files in the parent, do an FD_SET for that file descriptor if a particular
child role should close it.  Then in the child, loop from 0 to
getrlimit (RLIMIT_NOFILE,...) - 1 and if FD_ISSET for that file descriptor
in the fd_set for this particular child role, close it.  If your system has
a crappy select implementation (one that can't handle very many files),
write your own macros --- they just manipulate a bitmask in an array of longs.
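
Something like this sketch (the role name is invented; note that an
fd_set cannot track fds at or above FD_SETSIZE):

#include <sys/select.h>
#include <sys/resource.h>
#include <unistd.h>

/* One set per child role; FD_ZERO it once at startup. */
static fd_set backup_child_closes;

/* Parent: call after each open() whose fd the backup child should
 * close (and FD_CLR it again when the parent closes the fd). */
void mark_for_backup_child(int fd)
{
    FD_SET(fd, &backup_child_closes);
}

/* Child: call immediately after fork(). */
void backup_child_cleanup(void)
{
    struct rlimit rl;
    int fd, max = FD_SETSIZE;     /* fd_set can't track beyond this */

    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_cur < (rlim_t) max)
        max = (int) rl.rlim_cur;

    for (fd = 0; fd < max; fd++)
        if (FD_ISSET(fd, &backup_child_closes))
            close(fd);
}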
 
 
 

file descriptors left open in fork()

Post by phil-news-nos.. » Mon, 26 Feb 2001 14:24:01




|>
|> |> So I can see a critical issue brewing.  I might be tempted to
|> |> suggest a new kernel syscall:
|> |>
|> |> close_all_except( ..., -1 );
|> |
|> | This just doesn't work.  You cannot go around closing file descriptors just
|> | because they aren't ones that you recognize.  You will shoot yourself in the
|> | foot.  Example: What do you do about file descriptors used for dynamic
|> | linking?
|>
|> I haven't encountered a system like that, yet.  But I guess there could
|> be one.
|
| Every system that I have seen closes the file after mmapping it, but it's
| just an example of a place where the system might sneak file descriptor
| usage in behind your back.

The child process doesn't need to have the file descriptor open.
In this case it probably won't hurt to leave it open, either.
Of course, if some library opened it for some reason and uses it
to perform functions that get called, this can be a problem.  But
in the case I am programming right now, the child process is not
calling anything but POSIX syscalls and my own library.  So maybe
a general solution is bad, but perhaps in my case there is no problem?

|> | The reason why you can't find a good way of dealing with this problem is
|> | simple: You're not supposed to care about things like this.
| ...
|> This is not an issue about running out of file descriptors.  It is
|> an issue of contention as a result of sharing file descriptors in
|> a child process.
| ...
|> When the parent closes one to
|> conclude that HTTP transaction, the fact that the child still has an
|> open file descriptor means it won't really close.
|
| Regardless, closing a file when you don't understand why it's open is a bad
| idea.  I would use an fd_set for each of these child roles, and as you open
| files in the parent, do an FD_SET for that file descriptor if a particular
| child role should close it.  Then in the child, loop from 0 to
| getrlimit (RLIMIT_NOFILE,...) - 1 and if FD_ISSET for that file descriptor
| in the fd_set for this particular child role, close it.  If your system has
| a crappy select implementation (one that can't handle very many files),
| write your own macros --- they just manipulate a bitmask in an array of longs.

I happen to be using poll().  But I take this to mean you're saying
that I should just keep track of what file descriptors the main program
opens, let the child close those, and leave alone any that library
functions open.  And when this results in hung sessions due to extra
reference counts on open files held open by long-running child processes
that did not even know the file descriptor existed, I should just tell
users it is something they should not worry about.

It sure seems to me that this is a fundamental flaw in the Unix/POSIX API.

Even if there was a feature like being able to set close-on-fork on a
file descriptor, a library implementor wouldn't know whether the caller
is going to fork a child to actually use the object which has the hidden
open file descriptor, or is going to fork a child that has nothing to do
with it and can very easily cause interference by virtue of leaving the
file descriptor open.

The fundamental flaw I'm referring to is that there is a conflict between
the notion of data hiding in an object oriented design, and forking many
processes which by inheriting the hidden resources, have an impact.  Fork
works fine for memory for the most part because there is copy-on-write to
deal with collisions of usage (the stack is probably the first that will
happen).  But a collision in a resource like a file descriptor can be
more severe.  It can do things like leave files open forever, or leave
network connections hung, with no _clean_ way to deal with it.

A library to abstract network connections so that the caller does not have
to worry about the details basically seems to have a fundamental problem
of conflict with multiple processes.  Mixing two or more libraries could be
quite hopeless.

Now I do see one possibility for a library *I* write.  I could add a
function call to the library to specifically be used in forked child
processes to do cleanups appropriate to a child process which is not
going to use resources managed by the library.  If the child is going
to use some, but not all, of the resources, a function call to indicate
that would be appropriate.  Likewise, function calls for the parent to
indicate what is left to the child is no longer handled by the parent.

I think that makes the library API uglier, but the problem is a lot of
libraries out there now don't have any such means at all.
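
The shape of such an API might be (a sketch; all names are invented
for illustration):

/* Opaque handle; the fd inside stays hidden. */
typedef struct mylib_conn mylib_conn_t;

/* In a child that uses none of the library's objects: close every
 * fd the library holds, without any "real" shutdown (no flushes or
 * FIN handshakes that rightfully belong to the parent). */
void mylib_child_forget_all(void);

/* In a child that keeps some objects: forget everything else. */
void mylib_child_keep_only(mylib_conn_t **keep, int nkeep);

/* In the parent, after handing an object off to a child: drop the
 * parent's reference without disturbing the child's copy. */
void mylib_parent_release(mylib_conn_t *conn);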

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 

file descriptors left open in fork()

Post by Derek M. Fly » Mon, 26 Feb 2001 14:48:52



> | Regardless, closing a file when you don't understand why it's open is a bad
> | idea.  I would use an fd_set for each of these child roles, and as you open
> | files in the parent, do an FD_SET for that file descriptor if a particular
> | child role should close it.  Then in the child, loop from 0 to
> | getrlimit (RLIMIT_NOFILE,...) - 1 and if FD_ISSET for that file descriptor
> | in the fd_set for this particular child role, close it.  If your system has
> | a crappy select implementation (one that can't handle very many files),
> | write your own macros --- they just manipulate a bitmask in an array of
> | longs.

> I happen to be using poll().  But I take this to mean you're saying
> that I should just keep track of what file descriptors the main program
> opens, let the child close those, and leave alone any that library
> functions open.  And when this results in hung sessions due to extra
> reference counts on open files held open by long-running child processes
> that did not even know the file descriptor existed, I should just tell
> users it is something they should not worry about.

You don't need to be using select(), you could just use its convenience
macros to manage an internal list of which file descriptors should be closed
by a child.  You could easily do it with an array (as you suggest), but the
bitmask in an array of longs that select manipulates via an fd_set is extremely
efficient.

Quote:> It sure seems to me that this is a fundamental flaw in the Unix/POSIX API.

The program can and should manage these particulars itself.  If you do so,
you can easily handle the problem.  It's the same with dynamically allocated
memory.  You don't try to free everything from the heap --- though it might
seem perfectly logical after forking if you don't need any of the parent's
dynamic memory --- other modules (your libraries or other parts of the system)
may use some of it.  For the same reason, you shouldn't try to close all the
open files.  Other modules may be managing those resources on their own.
 
 
 

file descriptors left open in fork()

Post by Casper H.S. Dik - Network Security Engine » Mon, 26 Feb 2001 19:48:22


[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]


>I haven't encountered a system like that, yet.  But I guess there could
>be one.

Solaris at various times used a cached /dev/zero fd both for mapping
thread stacks and even one for the runtime linker.
The runtime linker was mostly fine, but the thread library did have
problems with people closing fds.  We since added MAP_ANON and no
longer require open("/dev/zero").  The caching of fds was gotten
rid of before that.
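
The two mapping styles look roughly like this sketch (MAP_ANON is
spelled MAP_ANONYMOUS on some systems):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len zero-filled bytes: MAP_ANON needs no fd at all, while
 * the old /dev/zero idiom opens an fd (which a library might then
 * cache behind the application's back). */
void *map_zeroed(size_t len)
{
#ifdef MAP_ANON
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANON, -1, 0);
#else
    int fd = open("/dev/zero", O_RDWR);
    void *p = MAP_FAILED;

    if (fd != -1) {
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        close(fd);    /* the mapping survives the close() */
    }
    return p;
#endif
}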

There are valid reasons to close all fds; e.g., if you really don't
want to inherit any (you're a daemon and don't care).

In most cases, though, the "close all" stuff performed by shells
and such at startup serves no purpose.  (Other than causing more bugs.)

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

file descriptors left open in fork()

Post by phil-news-nos.. » Tue, 27 Feb 2001 11:38:49



| Solaris at various times used a cached /dev/zero fd both for mapping
| thread stacks and even one for the runtime linker.
| The runtime linker was mostly fine, but the thread library did have
| problems with people closing fds.  We since added MAP_ANON and no
| longer require open("/dev/zero").  The caching of fds was gotten
| rid of before that.
|
| There are valid reasons to close all fds; e.g., if you really don't
| want to inherit any (you're a daemon and don't care).
|
| In most cases, though, the "close all" stuff performed by shells
| and such at startup serves no purpose.  (Other than causing more bugs.)

So the dilemma is that, in a forked child, closing fds can cause
problems and leaving them open can cause problems.  This seems
to tell me that hiding fds in libraries and objects is a bad
idea, because processes need to know what is safe to close and/or
what needs to be left open.

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 

file descriptors left open in fork()

Post by phil-news-nos.. » Tue, 27 Feb 2001 11:42:52



|> | Regardless, closing a file when you don't understand why it's open is a bad
|> | idea.  I would use an fd_set for each of these child roles, and as you open
|> | files in the parent, do an FD_SET for that file descriptor if a particular
|> | child role should close it.  Then in the child, loop from 0 to
|> | getrlimit (RLIMIT_NOFILE,...) - 1 and if FD_ISSET for that file descriptor
|> | in the fd_set for this particular child role, close it.  If your system has
|> | a crappy select implementation (one that can't handle very many files),
|> | write your own macros --- they just manipulate a bitmask in an array of
|> | longs.
|>
|> I happen to be using poll().  But I take this to mean you're saying
|> that I should just keep track of what file descriptors the main program
|> opens, let the child close those, and leave alone any that library
|> functions open.  And when this results in hung sessions due to extra
|> reference counts on open files held open by long-running child processes
|> that did not even know the file descriptor existed, I should just tell
|> users it is something they should not worry about.

| You don't need to be using select(), you could just use its convenience
| macros to manage an internal list of which file descriptors should be closed
| by a child.  You could easily do it with an array (as you suggest), but the
| bitmask in an array of longs that select manipulates via an fd_set is extremely
| efficient.

Keeping a list of open fds is not hard to do at all.  Finding out what fds
to put in the list is, when a library hides them.

|> It sure seems to me that this is a fundamental flaw in the Unix/POSIX API.
|
| The program can and should manage these particulars itself.  If you do so,
| you can easily handle the problem.  Its the same with dynamically allocated
| memory.  You don't try to free everything from the heap --- though it might
| seem perfectly logical after forking if you don't need any of the parents
| dynamic memory --- other modules (your libraries or other parts of the system)
| may use some of it.  For the same reason, you shouldn't try to close all the
| open files.  Other modules may be managing those resources on their own.

But this also rules out libraries hiding open fds.  Or more generally,
any time a hidden resource can have an impact in a forked process that
inherits it, this makes a mess, especially when the inheritance itself
causes problems, as it does in some cases with fds, where close() in
the parent will no longer really close the object.

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 
