why a separate process for each thread on Linux


Post by Alex H » Thu, 10 May 2001 03:28:51



Does anyone know why both LinuxThreads and PThreads on Linux are forking a
process for each thread?  Is there a thread lib for Linux that doesn't do
that?

I just want all secondary threads belonging to an application to run in the
same process as the main thread.

Best Regards,
Alex Ho

 
 
 


Post by Alexander Vi » Thu, 10 May 2001 04:04:45




>Does anyone know why both LinuxThreads and PThreads on Linux are forking a
>process for each thread?

Why not? You can have VM shared between them, ditto for file descriptors,
etc. What else matters?

--
"You're one of those condescending Unix computer users!"
"Here's a nickel, kid.  Get yourself a better computer" - Dilbert.

 
 
 


Post by Juergen Hein » Thu, 10 May 2001 05:40:35





>>Does anyone know why both LinuxThreads and PThreads on Linux are forking a
>>process for each thread?

>Why not? You can have VM shared between them, ditto for file descriptors,
>etc. What else matters?

[-]
It's a complete mess compared to other systems to start with.

See man 2 clone for more, since threads are based on this system
call. Things are still free to change, and beware: there's quite
a difference between fork(2) and clone(2).

It's true though that, up to now, each thread gets its own process
id, but this doesn't make it a separate process.
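
A minimal sketch of that point, assuming Linux with glibc's clone() wrapper. The child below gets its own PID, yet its store to a global is visible to the parent because CLONE_VM shares the address space. The names demo_shared_vm, child_fn and shared_counter are made up for illustration; this is not how LinuxThreads itself is written.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)

/* Global, so it sits in the address space that CLONE_VM shares. */
static volatile int shared_counter = 0;

static int child_fn(void *arg)
{
    (void)arg;
    shared_counter = 42;   /* the parent sees this store */
    return 0;              /* becomes the child's exit status */
}

/* Returns 1 if the child had its own PID and its write reached the
 * parent, 0 if not, -1 on error. */
int demo_shared_vm(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_counter = 0;
    /* Stacks grow downward on most architectures: pass the top. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, CLONE_VM | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return pid != getpid() && shared_counter == 42;
}
```

With fork(2) in place of clone(2) the same store would land in the child's private copy and the parent would still read 0.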

Ta',
Juergen

--
\ Real name     : Juergen Heinzl                \       no flames      /

 
 
 


Post by phil-news-nos.. » Thu, 10 May 2001 16:16:14



| See man 2 clone for more since threads are based on this system
| call. Still things are free to change and beware as there's quite
| a difference between fork(2) and clone(2).
|
| It's true though that, up to now, each thread gets its own process
| id, but this doesn't make it a separate process.

Indeed!

The processes created by clone(2) are quite tightly bound.  I see
the very things there which make me want to avoid threads in the
first place (and use normal processes instead).  For example, if
some code needs to chdir(2) to go deep into a directory tree, it
has to first suspend all other threads because chdir(2) will affect
their view of the world.

I wonder what the OP really needs.  Is it just to be able to use a
familiar API, or is there truly a need for contexts so tightly bound
that they share the same PID (which appears to be something that
clone(2) supports, anyway, via the CLONE_PID flag)?

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 


Post by Greg Copelan » Fri, 11 May 2001 00:26:09



> The processes created by clone(2) are quite tightly bound.  I see
> the very things there which make me want to avoid threads in the
> first place (and use normal processes instead).  For example, if
> some code needs to chdir(2) to go deep into a directory tree, it
> has to first suspend all other threads because chdir(2) will affect
> their view of the world.

I see this example used all of the time when people want to frown on
threaded programming.  I, for one, don't buy it.  I've coded lots of
multi-threaded applications and have never once had an issue
like this.  Let's face it, it's not hard to serialize something like
this.  Once you have a file handle to the file in question, it's not
like you have to keep chdir()ing every time you turn around.

I've never once needed to write an application that created two
threads which did nothing but change directories in each thread.
Besides, normally, you'll be better served having a select few
thread(s) taking care of this type of detail for you.  Even so,
there are lots of other ways to crack this nut.

In a nutshell, threaded programming is just another model which may
or may not apply to the specific problem domain you are trying to
address.  It's nothing more, or less.  Just another tool in the shed.

--
Greg Copeland, Principal Consultant
Copeland Computer Consulting
--------------------------------------------------
PGP/GPG Key at http://www.keyserver.net
DE5E 6F1D 0B51 6758 A5D7  7DFE D785 A386 BD11 4FCD
--------------------------------------------------

 
 
 


Post by Alexander Vi » Fri, 11 May 2001 01:29:21



>The processes created by clone(2) are quite tightly bound.  I see
>the very things there which make me want to avoid threads in the
>first place (and use normal processes instead).  For example, if
>some code needs to chdir(2) to go deep into a directory tree, it
>has to first suspend all other threads because chdir(2) will affect
>their view of the world.

Huh? Don't set CLONE_FS in flags and you have independent chdir for child.
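
A rough sketch of that, assuming Linux with glibc's clone() and that /tmp exists. The child calls chdir("/"); the parent then checks whether its own working directory moved. The helper names chdir_child and parent_cwd_changed are hypothetical, and note that the helper itself moves the caller into /tmp first.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)

static int chdir_child(void *arg)
{
    (void)arg;
    return chdir("/") == 0 ? 0 : 1;   /* becomes the child's exit status */
}

/* Runs the child with CLONE_VM plus extra_flags, then reports whether
 * the parent's working directory changed: 1 = yes, 0 = no, -1 = error. */
int parent_cwd_changed(int extra_flags)
{
    char before[PATH_MAX], after[PATH_MAX];
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    if (chdir("/tmp") == -1 || getcwd(before, sizeof before) == NULL) {
        free(stack);
        return -1;
    }
    pid_t pid = clone(chdir_child, stack + STACK_SIZE,
                      CLONE_VM | extra_flags | SIGCHLD, NULL);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    if (getcwd(after, sizeof after) == NULL)
        return -1;
    return strcmp(before, after) != 0;
}
```

parent_cwd_changed(0) leaves the parent in /tmp, while parent_cwd_changed(CLONE_FS) drags it along to /.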

--
"You're one of those condescending Unix computer users!"
"Here's a nickel, kid.  Get yourself a better computer" - Dilbert.

 
 
 


Post by Eric Taylo » Fri, 11 May 2001 02:06:43




> >The processes created by clone(2) are quite tightly bound.  I see
> >the very things there which make me want to avoid threads in the
> >first place (and use normal processes instead).  For example, if
> >some code needs to chdir(2) to go deep into a directory tree, it
> >has to first suspend all other threads because chdir(2) will affect
> >their view of the world.

> Huh? Don't set CLONE_FS in flags and you have independent chdir for child.

I see there is a way to share pids? Is that still un-implemented or
have things changed?

A couple of other questions/issues between a system that has
specific threads and linux:

1. Scheduling can be different. The threads might share the timeslice
of the single process they are working for. Does Linux do anything
about this, esp. on a uniprocessor?
2. Exiting or killing the process should kill all the threads. Won't
Linux leave the cloned processes running?
3. Do clones share stacks? Do they get fixed sizes or can they
grow like a process?
4. Where do the signals go?

et

> --
> "You're one of those condescending Unix computer users!"
> "Here's a nickel, kid.  Get yourself a better computer" - Dilbert.

 
 
 


Post by Alexander Vi » Fri, 11 May 2001 02:36:19




>I see there is a way to share pids? Is that still un-implemented or
>have things changed?

Hell knows. What for? To emulate Solaris idiocy?

>A couple of other questions/issues between a system that has
>specific threads and linux:
>1. scheduling can be different. The threads might  share the timeslice
>of the single process they are working for, Does linux do anything
>about this esp. on a uniprocessor?

Again, what for? There are ways to set priority. nice(2), for one thing.
Why reinvent the wheel? If you share VM with the previous task, schedule()
is more likely to pick you.

>2. Exiting or killing the process should kill all the threads, won't
>linux leave the cloned processes running?

Again, why? _If_ you want that behaviour - do it yourself. Signals
have been there since time immemorial and it's perfectly doable
in userland. It's a policy decision and thus userland is where it
belongs.
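
One way the do-it-yourself approach might look, as a sketch only. For brevity it uses plain fork(2) rather than clone(2), and the function name start_and_stop_workers is invented here: the parent remembers the PIDs it created, signals each one with SIGTERM, and reaps it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16

/* Starts up to n idle workers, then terminates them from the parent.
 * Returns how many were confirmed killed by SIGTERM. */
int start_and_stop_workers(int n)
{
    pid_t pids[MAX_WORKERS];
    int started = 0, killed = 0;

    if (n > MAX_WORKERS)
        n = MAX_WORKERS;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            signal(SIGTERM, SIG_DFL);   /* make sure SIGTERM terminates us */
            for (;;)
                pause();                /* worker idles until signalled */
        }
        pids[started++] = pid;
    }

    /* The policy decision lives here: the parent chooses to take its
     * workers down, one signal per remembered PID. */
    for (int i = 0; i < started; i++) {
        int status;
        kill(pids[i], SIGTERM);
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM)
            killed++;
    }
    return killed;
}
```

A program that wants the opposite policy simply skips the second loop and leaves the workers running.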

>3. do clones share stacks? do they get fixed sizes or can they
>grow like a process?

Not necessarily, and it depends on the mmap() flags used to create a
stack for child.
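
To make the mmap() remark concrete, here is a sketch of one possible way to map a child stack on Linux. It creates a fixed-size anonymous mapping with an inaccessible guard page at the low end; a mapping meant to grow would use MAP_GROWSDOWN instead, which is not shown. The function name map_child_stack is hypothetical, and usable is assumed to be a multiple of the page size.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps usable bytes of stack plus one guard page below it.
 * Returns the base of the mapping, or NULL on failure. On success *top
 * receives the address to hand to clone() as the child's stack pointer. */
void *map_child_stack(size_t usable, void **top)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    size_t total = usable + (size_t)page;
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Overrunning the stack now faults instead of scribbling on
     * whatever happens to be mapped below it. */
    if (mprotect(base, (size_t)page, PROT_NONE) == -1) {
        munmap(base, total);
        return NULL;
    }
    *top = base + total;
    return base;
}
```

The caller releases the whole region later with munmap(base, usable + page size).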

>4. where do the signals go?

Where you send them. Why?

Process is a collection of resources. VM is one of them. Processes can
have some resources in common. Why bother with LWPs when context switch
between the normal processes that share VM is equally cheap?

Occam's Razor and all such... BTW, it's not like the thing was invented in
Linux - it comes from Plan 9 and is shared with *BSD (man rfork there).

 
 
 


Post by phil-news-nos.. » Fri, 11 May 2001 03:07:02



| I see this example used all of the time when people want to frown on
| threaded programming.  I for one, don't buy it.  I've coded lots of
| multi-threaded applications and have never once had this as an issue
| like this.  Let's face it, it's not hard to serialize something like
| this.  Once you have a file handle to the file in question, it's not
| like you have to keep chdir()ing every time you turn around.
|
| I've never once needed to write an application that created two
| threads which did nothing but change directories in each thread.
| Besides, normally, you'll be better served having a select few
| thread(s) taking care of this type of detail for you.  Even still,
| there are lots of others way to crack this nut.
|
| In a nutshell, threaded programming is just another model which may
| or may not apply to the specific problem domain you are trying to
| address.  It's nothing more, or less.  Just another tool in the shed.

I've written library code that could fail if it depended on being
able to access a file via a full path.  It really needed to do a
chdir() to accomplish what it needed to do.  The concern with it
was that if someone used this in a threaded program, and some other
thread simply did a file open in what it thought was the current
directory, it could be in for a rude surprise.
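
For comparison, systems that implement POSIX.1-2008 offer openat(), which was not an option when this was written. A sketch, with the helper name open_in_dir made up here, of opening a file relative to a directory without touching the process-wide current directory:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Opens name relative to dir_path and leaves the current working
 * directory alone. Returns a file descriptor, or -1 on failure. */
int open_in_dir(const char *dir_path, const char *name, int flags)
{
    int dirfd = open(dir_path, O_RDONLY | O_DIRECTORY);
    if (dirfd == -1)
        return -1;

    /* The lookup of name starts at dirfd, not at the cwd, so other
     * threads opening relative paths are unaffected. */
    int fd = openat(dirfd, name, flags);
    close(dirfd);
    return fd;
}
```

Whether this covers the library case described above depends on why the full path could not be used; it is only one possible alternative to chdir().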

I've been doing multi process programming since getting into UNIX.
The cases I've worked with have been in more need of separate
resources than in shared memory.  So they would fit the process
model better.  And it didn't require another API.  I've also done
multi task (similar to multi threaded) on mainframes (MVS).  So
I guess this colors my view of things.  I'll just have to see when
my first case of needing multi-threading comes about.

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 


Post by phil-news-nos.. » Fri, 11 May 2001 03:16:44



| I see there is a way to share pids? Is that still un-implemented or
| have things changed?

Based on what I read it is implemented, but causes "problems" since
other parts of the kernel think all processes have a unique PID.  For
example, the /proc/NNNN entries are likely to be useless.  There needs
to be some way to identify each thread beyond a thread entry pointer.
Maybe a PID:TID tuple?  But a lot of stuff would have to change to
handle that.

| 2. Exiting or killing the process should kill all the threads, won't
| linux leave the cloned processes running?

I know this principle is used, but I'm not sure it's the right
way.  But then, I'm probably clouded in thought by having had only
projects that need the process model, as opposed to the thread model.

| 3. do clones share stacks? do they get fixed sizes or can they
| grow like a process?

That can't be allowed, to avoid corruption.  If you're sharing the
whole VM, then each thread has to have its own stack context.  And
with stacks that grow linearly, you have to pick starting points
and divvy up the VM space by picking a suitable stack size.  If the
stack is made truly separate, then you have other issues, like
threads can share some pointers (gotten from malloc()) but not
others (gotten from auto variables or alloca() ... but then I'd
bet alloca() is thread-unsafe anyway).

| 4. where do the signals go?

I guess they are a pushed context in the same process/thread.

--
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |

-----------------------------------------------------------------

 
 
 


Post by Alexander Vi » Fri, 11 May 2001 04:02:32




>| 3. do clones share stacks? do they get fixed sizes or can they
>| grow like a process?

>That can't be allowed to avoid corruption.  If you're sharing the

That _can_ be allowed if parent and child cooperate. Not for long,
but it's doable. All you need is to block parent until the child
will tell that it's OK. Or block child and let parent flip the
child's stack via ptrace(). See vfork() for example of the first
approach, BTW.
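
A small sketch of the vfork() case, assuming a system that still provides vfork(): the child runs on the parent's stack while the parent stays blocked, and does nothing there except exit. The function name vfork_exit_status is invented for the example.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit status, or -1 on error. */
int vfork_exit_status(int code)
{
    pid_t pid = vfork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: borrows the parent's stack and address space while the
         * parent is suspended. Only _exit() or an exec call is safe here. */
        _exit(code);
    }

    /* Parent: resumes only after the child has exited (or exec'd). */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```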

--
"You're one of those condescending Unix computer users!"
"Here's a nickel, kid.  Get yourself a better computer" - Dilbert.

 
 
 


Post by Greg Copelan » Fri, 11 May 2001 04:08:15




> | I see there is a way to share pids? Is that still un-implemented or
> | have things changed?

> Based on what I read it is implemented, but causes "problems" since
> other parts of the kernel think all processes have unique PID.  For
> example, the /proc/NNNN entries are likely to be useless.  There needs
> to be some way to identify each thread beyond a thread entry pointer.
> Maybe a PID:TID tuple?  But a lot of stuff would have to change to
> handle that.

Hey, this is the Phil that I know.  I use your linuxhomepage still.
I like it.

For what it's worth, I do know that some thought is going into
additional thread support in the Linux kernel.  I just don't know
what form it will take yet.

--
Greg Copeland, Principal Consultant
Copeland Computer Consulting
--------------------------------------------------
PGP/GPG Key at http://www.keyserver.net
DE5E 6F1D 0B51 6758 A5D7  7DFE D785 A386 BD11 4FCD
--------------------------------------------------

 
 
 


Post by Eric Taylo » Fri, 11 May 2001 04:28:59





> >I see there is a way to share pids? Is that still un-implemented or
> >have things changed?

> Hell knows. What for? To emulate Solaris idiocy?

I was wondering what was intended by that feature and whether
any further progress had been made.

No matter how idiotic Solaris might be, it is worthwhile to be
able to have their programs port to linux w/o having to change
source code.

> >A couple of other questions/issues between a system that has
> >specific threads and linux:

> >1. scheduling can be different. The threads might  share the timeslice
> >of the single process they are working for, Does linux do anything
> >about this esp. on a uniprocessor?

> Again, what for? There are ways to set priority. nice(2), for one thing.
> Why reinvent the wheel? If you share VM with previous task schedule()
> is more likely to pick you.

Let's see, that sounds like: "Why would anyone ever want to do that?"

Ah, yes, reminds me of the old Decus sessions.  The Dec O.S. guys
would ask that 10 times a day.  I once heard a tired developer respond
to a what if question: "Well!! What if I fly off this stage!"

The last thing a programmer should ask is What For? Unless he wears
a big G on his shirt.

Let's see, maybe I don't want to write my own scheduler, but also don't
want my 50 compute threads to act like 50 processes and take over
the system.

> >2. Exiting or killing the process should kill all the threads, won't
> >linux leave the cloned processes running?

> Again, why? _If_ you want that behaviour - do it yourself. Signals
> had been there since times immemorial and it's perfectly doable
> in userland. It's a policy decision and thus userland is where it
> belongs.

So, do you bother to deallocate all memory on exit, or do you
chase down all open i/o channels, or do you simply exit? Why
why why. I wouldn't be asking if there were no reason. And
just because one can solve the problem by writing code doesn't
mean it's good to be forced to write code.

> >3. do clones share stacks? do they get fixed sizes or can they
> >grow like a process?

> Not necessary and it depends on the mmap() flags used to create a
> stack for child.

Suppose I have a program that has a  local array (on the stack) and
I want to send a pointer to it to one or more of my threads. Will
that work with clone? And even if you think that is stupid, suppose
I have to port a program that does it and I don't have time  to modify it.

> Process is a collection of resources. VM is one of them. Processes can
> have some resources in common. Why bother with LWPs when context switch
> between the normal processes that share VM is equally cheap?

So you can easily port code. Clone is a great system call, but I can't use
it since it is not portable. There are still other systems in the world I have
to be aware of.

 
 
 


Post by Alexander Vi » Fri, 11 May 2001 05:19:00







>> >I see there is a way to share pids? Is that still un-implemented or
>> >have things changed?

>> Hell knows. What for? To emulate Solaris idiocy?

>I was wondering what was intended by that feature and whether
>any further progress had been made.

>No matter how idiotic Solaris might be, it is worthwhile to be
>able to have their programs port to linux w/o having to change
>source code.

_If_ it doesn't mean putting extra cruft into the kernel. Otherwise you'd
have to copy every idiotic feature ever invented. Not a good idea.

>Let's see, maybe I don't want to write my own scheduler, but also don't
>want my 50 compute threads to act like 50 processes and take over
>the system.

Lower their priority and be done with that. Same effect, no new API needed.
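
A sketch of what lowering a worker's priority could look like with the existing API, using fork(2) plus setpriority(2). The function name spawn_niced_worker is hypothetical, and the exit-status trick used to report the result only works for nice values 0 to 19.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a worker that lowers its own priority before doing any work.
 * Returns the nice value the worker ended up with, or -1 on error. */
int spawn_niced_worker(int niceness)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (setpriority(PRIO_PROCESS, 0, niceness) == -1)
            _exit(255);
        /* ... the compute work would go here ... */
        errno = 0;   /* getpriority() can legitimately return -1 */
        int now = getpriority(PRIO_PROCESS, 0);
        _exit((now == -1 && errno != 0) ? 255 : now);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```

Raising the nice value needs no privilege, so fifty such workers can each call this without any special setup; the parent's own priority is untouched.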

>So, do you bother to deallocate all memory on exit, or do you
>chase down all open i/o channels, or do you simply exit. why
>why why. I wouldn't be asking if there were no reason. And
>just because one  can solve the problem writing code doesn't
>mean it's good  to be forced to write code.

Deallocating memory and closing files avoids kernel leaks. It _can't_
be reliably done in userland. Delivering signals to processes you want
is doable there. Moreover, _not_ doing that is a valid decision and
such decision belongs to userland.

>> >3. do clones share stacks? do they get fixed sizes or can they
>> >grow like a process?

>> Not necessary and it depends on the mmap() flags used to create a
>> stack for child.

>Suppose I have a program that has a  local array (on the stack) and
>I want to send a pointer to it to one or more of my threads. Will
>that work with clone? And even if you think that is stupid, suppose
>I have to port a program that does it and I don't have time  to modify it.

If these clones share VM - sure, it will (they may have separate VM,
but shared file descriptors - combination that also makes sense and
again, that's a decision belonging to userland).
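
A sketch of the shared-VM case, assuming Linux with glibc's clone(): the parent hands the child a pointer to an array in its own stack frame, and with CLONE_VM the child's writes land in that same memory. The names fill_array and sum_filled_by_child are invented for the example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)
#define N 4

static int fill_array(void *arg)
{
    int *values = arg;   /* points into the parent's stack frame */
    for (int i = 0; i < N; i++)
        values[i] = i * i;
    return 0;
}

/* Returns the sum of the values the child wrote, or -1 on error. */
int sum_filled_by_child(void)
{
    int local[N] = {0};   /* an ordinary auto variable */
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The child runs on its own stack but in the same address space,
     * so the pointer to local stays valid for it. */
    pid_t pid = clone(fill_array, stack + STACK_SIZE,
                      CLONE_VM | SIGCHLD, local);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);

    int sum = 0;
    for (int i = 0; i < N; i++)
        sum += local[i];
    return sum;
}
```

The parent must keep that frame alive until the child is done with it, which here is ensured by waiting for the child before returning. Without CLONE_VM the child would write to its own copy and the sum would be 0.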

>> Process is a collection of resources. VM is one of them. Processes can
>> have some resources in common. Why bother with LWPs when context switch
>> between the normal processes that share VM is equally cheap?

>So you can easily port code. Clone is a great system call, but I can't use
>it since it is not portable. There are still other systems in the world I have
>to be aware of.

Let's see. By that logic we would have to implement
        * streams (every Missed'em'V out there)
        * doors (Solaris)
        * environment-dependent symlinks (dual-universe monsters)
        * mpx (remember that v7 misfeature?)
        * /dev/poll (SGI barfball)
        * several mutually incompatible variants of devfs
        * logicals (Vomit Making System)
        * drive letters (DEC-derived systems, including CP/M branch)
        * AST (VMS)
I can easily continue, but I think you already see what I mean.

BTW, I would like to see Solaris supporting notes, clean userland filesystems,
per-process namespaces and _both_ rfork() a-la BSD (== clone() in Linux) and
rfork() a-la Plan 9 (stack is always private). You see, there's a bunch of
very nice programs that rely on them and I'd love to see them on Solaris.
Yes, some of them are ported, but it would be nice to have all this stuff
natively. Call me when it happens, OK?

--
"You're one of those condescending Unix computer users!"
"Here's a nickel, kid.  Get yourself a better computer" - Dilbert.

 
 
 


Post by Juergen Hein » Fri, 11 May 2001 06:21:46




>>The processes created by clone(2) are quite tightly bound.  I see
>>the very things there which make me want to avoid threads in the
>>first place (and use normal processes instead).  For example, if
>>some code needs to chdir(2) to go deep into a directory tree, it
>>has to first suspend all other threads because chdir(2) will affect
>>their view of the world.

>Huh? Don't set CLONE_FS in flags and you have independent chdir for child.

[-]
Given clone(2) is Linux specific clone(2) does not exist, you've never
heard of clone(2), you know of no-one who'd ever heard of clone(2)
and Montezuma's revenge shall come over you the day you dare to think
of how nice a clone(2) system call would be.

You get the idea ;)

Ta',
Juergen

--
\ Real name     : Juergen Heinzl                \       no flames      /