fork and vfork

Post by Ron Wei » Sat, 09 Dec 1995 04:00:00



I'm working on a large system running under HP-UX 10.xx (on 9000/700
series machines). We need to spawn processes from a main application. This
main application is huge (image size currently approx. 40MB and growing;
I don't know offhand the split between text, BSS, etc.). To spawn the
processes we must do the standard "fork"-"exec" call sequence. The
fork() system call creates "an exact copy of the calling process"
(quote from the HP-UX 10.01 fork(2) man entry).
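
For reference, the fork-then-exec sequence described above can be sketched in C roughly as follows. This is only a minimal sketch: the function name fork_exec_wait is made up for illustration, and the exit code 127 for a failed exec is a shell convention rather than anything the system requires.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (searched on PATH) as a child process
 * and return its exit status, or -1 if something went wrong. */
int fork_exec_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed, no child was created */

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if execvp failed */
    }

    /* Parent: wait for the child to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argument vector such as {"sh", "-c", "exit 3", NULL}, the helper should return the child's exit code.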

Can anyone tell me what exactly is copied by the fork() call? I know that
there must be something like a "process control block" created for the
child (it will mostly be a copy of the parent's), but is the program text
and data actually copied? In a virtual memory system like HP-UX 10.xx is
any significant amount of memory actually allocated and copied (apart
from, say, the one page containing the program counter at the time of the
fork)? My concern here is that there may be a large CPU cycle and/or
memory overhead in forking a process this large.

The man entry for vfork(2) indicates that this call *may* be more
efficient, but the implementation is free to treat this call as being
identical to fork. In the case of HP-UX 10.xx (xx=01 or 1) is vfork "a
higher performance version of fork()"?


(not the address on this posting's header) as well as a posting (in case
others are interested).

Thanks very much,

Ron Weiss

 
 
 

Post by Ross Hayd » Sun, 10 Dec 1995 04:00:00




>Can anyone tell me what exactly is copied by the fork() call? I know that
>fork)? My concern here is that there may be a large CPU cycle and/or
>memory overhead in forking a process this large.

Modern UNIX systems (HP-UX included) implement fork() using a
copy-on-write scheme for the address space: after the call to fork(),
the data and stack pages of the parent are marked read-only/copy-on-write.
The parent and child share the same physical pages until one of them
issues a write.  When a write is attempted to any of these pages, an
exception (a page fault) occurs and the kernel gives the writing process
its own writable copy of the page.
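
The copying itself happens inside the kernel and is not directly visible to a program, but the effect it has to preserve is easy to check. A small sketch (the function name parent_value_after_child_write is hypothetical): the child writes to a variable, and the parent's copy is unaffected.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the value the parent sees after the child
 * has overwritten its own copy of the same variable, or -1 on error. */
int parent_value_after_child_write(void)
{
    int *value = malloc(sizeof *value);
    if (value == NULL)
        return -1;
    *value = 1;

    pid_t pid = fork();
    if (pid < 0) {
        free(value);
        return -1;
    }

    if (pid == 0) {
        /* Child: on a copy-on-write system this store is what would
         * trigger a private copy of the page holding *value. */
        *value = 2;
        _exit(*value == 2 ? 0 : 1);
    }

    /* Parent: wait for the child, then read our own copy. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(value);
        return -1;
    }
    int seen = *value;
    free(value);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return seen;                /* still 1: the child's write was private */
}
```

This shows only the semantics that fork() guarantees. Whether a given kernel gets there by copy-on-write or by copying everything up front cannot be told from the result.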

What this means is that the overhead associated with copying an entire
address space at fork() time is lifted. (fork() is still an expensive
system call because of context switches, but that expense no longer
includes copying the contents of every page in the process's address space.)

>The man entry for vfork(2) indicates that this call *may* be more
>efficient, but the implementation is free to treat this call as being
>identical to fork. In the case of HP-UX 10.xx (xx=01 or 1) is vfork "a
>higher performance version of fork()"?

The vfork() call traditionally is faster even in COW implementations,
because address maps do not have to be copied.

But I'm not sure what is meant by the statement, "free to treat this call
as being identical to fork."  See, after a vfork(), the child runs
first, but in the address space of the parent.  The parent is
blocked until the child calls an exec function or exits.  This is important
to note.  The vfork() call should only be used when the child plans to call
an exec function or _exit() very soon after the vfork, so that the parent
can get back down to business.
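
A minimal sketch of that usage pattern, assuming the platform provides a traditional vfork() declared in <unistd.h> (the function name spawn_with_vfork is made up for illustration). The child does nothing except call an exec function, and falls back to _exit() if the exec fails:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] via vfork() + exec and return the
 * child's exit status, or -1 on error. */
int spawn_with_vfork(char *const argv[])
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;              /* vfork failed */

    if (pid == 0) {
        /* Child: may be running in the parent's address space, so it
         * must not modify data or return from this function. */
        execvp(argv[0], argv);
        _exit(127);             /* exec failed; use _exit(), not exit() */
    }

    /* Parent: resumes once the child has exec'ed or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() rather than exit() so that it does not flush stdio buffers or run atexit handlers that belong to the parent.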

Perhaps what is meant by that statement is that, since vfork() is not
POSIX, some implementations still provide the function but only as a front
end to fork (this is only a guess, and a bad one at that!).

-ross-

 
 
 

Post by John S. Dys » Mon, 11 Dec 1995 04:00:00




>Perhaps what is meant by that statement is that, since vfork() is not
>POSIX, some implementations still provide the function but only as a front
>end to fork (this is only a guess, and a bad one at that!).

On the *BSDs (Net, Free, Open, etc.), vfork is just a blocking version of
fork.  I have been thinking about making it share the address space on
FreeBSD -- it might save some time on fork/execs, but since vfork is falling
more and more into disuse, it probably won't have a significant payback.

John

 
 
 

Post by Logan Sh » Mon, 11 Dec 1995 04:00:00




>I'm working on a large system running under HP-UX 10.xx (on 9000/700
>series machines).We need to spawn processes from a main application. This
>main application is huge (image size currently approx. 40MB and growing -
>don't know off hand the split between text, BSS etc). To spawn the
>processes we must do the standard "fork"-"exec.." call sequence. The
>fork() system call creates "an exact copy of the calling process"
>(quote from the HP-UX 10.01 fork(2) man entry).

>Can anyone tell me what exactly is copied by the fork() call? I know that
>there must be something like a "process control block" created for the
>child (it will mostly be a copy of the parent's), but is the program text
>and data actually copied? In a virtual memory system like HP-UX 10.xx is
>any significant amount of memory actually allocated and copied (apart
>from, say, the one page containing the program counter at the time of the
>fork)? My concern here is that there may be a large CPU cycle and/or
>memory overhead in forking a process this large.

From what I understand (I am no kernel designer), most modern versions
of Unix have a "copy-on-write" feature to solve exactly this kind of
problem.  When a process does a fork(), the memory allocated to the
process is not literally copied.  Instead, the parent and child
processes share each page of memory until one of them attempts to write
to that page, at which point the write is suspended, the page is
copied, each process is given its own separate copy, and then the
memory write is completed on the writing process's copy.

What this means is that, even when large processes fork(), potentially
only a very small number of pages actually need to get copied.  This
saves on all kinds of things, including swapping to disk.  To quote a
guy I used to work with named Rob, "Copy-on-write is your friend."

Presumably, though, copy-on-write doesn't remove all the overhead of
doing a fork() -- the kernel still must update its memory management
data for the new process, and presumably that takes more time for
processes that have allocated more memory.

Regarding the differences between vfork() and fork():  In a system that
has the copy-on-write feature, a traditional vfork() is probably not a
big efficiency gain over fork(), so the implementers may have chosen
not to implement a separate vfork(), i.e. it might just call fork().

Personally, I think a forkandexec() call would have been a better
idea than vfork(), but I wasn't the one making that decision.
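
Later versions of POSIX did standardize a combined call along these lines, posix_spawn() and posix_spawnp(). A minimal sketch, assuming the system provides <spawn.h> (the wrapper name spawn_and_wait is hypothetical):

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;          /* the calling process's environment */

/* Hypothetical helper: start argv[0] (searched on PATH) in one call,
 * wait for it, and return its exit status, or -1 on error. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;

    /* posix_spawnp returns 0 on success or an error number on failure;
     * the NULL arguments request default file actions and attributes. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a combined call the caller never runs any code of its own in the child, so the implementation is free to avoid duplicating the parent's address space at all.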

Also, sorry, but I can't give you specific information about which
versions of Unix include copy-on-write, or how vfork() and fork()
compare on different versions.

Hope this helps...

  - Logan
--
                                sky-wild
                                 far cry
                             wing-slash-free
                      Christ is born for you and me