Quote:> What exactly do you mean by "supervise"?
> ...because it sounds like you mean "wait for it to die".
You are correct. The system is meant to provide high availability,
so one requirement is that applications which have exited must be
restarted immediately. The supervising process itself may fail (and
will be restarted), and processes can also be restarted manually from
outside and put themselves under the control of the supervising
process by calling a library. There is therefore no parent-child
relationship between them, so signalling won't work.
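To illustrate why the missing parent-child relationship matters, here is a minimal sketch, assuming a POSIX system. The function name can_wait_for is only illustrative and not part of any library. It shows that waitpid() refuses to wait for a process that is not a child of the caller, which is the same reason the supervisor never receives SIGCHLD for it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: returns 1 if the caller is allowed to wait for pid
 * (i.e. pid is one of its children), 0 if the kernel refuses with ECHILD,
 * and -1 on any other error.
 * Note: WNOHANG keeps the call from blocking, but if pid is a child that
 * has already exited, this call reaps it. */
int can_wait_for(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, WNOHANG) >= 0)
        return 1;
    return (errno == ECHILD) ? 0 : -1;
}
```

For any process that registered itself from outside, can_wait_for() would return 0, so the usual wait()/SIGCHLD mechanism is not available to the supervisor.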
Quote:> And it just
> so happens that I recently wrote a program to do this. Of course, my
> solution was "stat /proc/<pid> && sleep", so it's not quite the way you
> specify, but, as Linus said, it wins on points of being available now!
I cannot use such a solution because it relies on active polling. If I
poll often enough to guarantee a certain reaction time, it wastes
too much CPU time. On the other hand, I cannot sleep for long, because
restarting would then be too slow. Therefore I need a blocking solution.
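For reference, the polling approach looks roughly like the sketch below. This is my own reconstruction of the "stat /proc/<pid> && sleep" idea, assuming a mounted /proc; the names proc_entry_exists and wait_by_polling are illustrative. The trade-off is visible in the single interval_ms parameter: the worst-case detection delay is one interval, while the number of wakeups per second per supervised process is 1000 / interval_ms.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: returns 1 while /proc/<pid> exists, 0 once it is gone. */
int proc_entry_exists(pid_t pid)
{
    char path[64];
    struct stat st;

    snprintf(path, sizeof path, "/proc/%ld", (long)pid);
    return stat(path, &st) == 0;
}

/* Hypothetical helper: checks every interval_ms milliseconds and returns
 * once the /proc entry has disappeared. A short interval gives a fast
 * reaction but many wakeups; a long interval does the opposite. */
void wait_by_polling(pid_t pid, long interval_ms)
{
    struct timespec ts;

    ts.tv_sec = interval_ms / 1000;
    ts.tv_nsec = (interval_ms % 1000) * 1000000L;

    while (proc_entry_exists(pid))
        nanosleep(&ts, NULL);
}
```

Note that the /proc entry of a zombie stays until the process is reaped, and that a PID can be reused, so even this simple loop needs extra care in practice.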
Quote:> I assume you have looked at the way some SysV systems do it - I just
> did a truss of pwait on a Solaris 7 system, and saw this
> so it works the way you describe. The Linux and Solaris poll man
> pages make them look reasonably similar. You've tried it and it didn't
Yes. Of course, the poll() and select() library API is the same,
but the underlying implementation of poll in the proc filesystem
is different. On Linux, if you listen for POLLPRI, POLLERR or
POLLHUP events (e.g. on /proc/<pid>/status), they are never
reported. When the process dies, nothing happens.
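What I tried is roughly the following. This is a minimal sketch, assuming Linux with /proc mounted; the function name poll_proc_status is illustrative. It opens /proc/<pid>/status and blocks in poll() waiting for POLLPRI (POLLERR and POLLHUP are always reported by poll() and do not need to be requested).

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for POLLPRI, POLLERR or
 * POLLHUP on /proc/<pid>/status.
 * Returns poll()'s result (0 = timed out, >0 = an event arrived and is
 * stored in *revents), or -1 if the file cannot be opened or poll() fails. */
int poll_proc_status(pid_t pid, int timeout_ms, short *revents)
{
    char path[64];
    struct pollfd pfd;
    int fd, rc;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pfd.fd = fd;
    pfd.events = POLLPRI;   /* POLLERR and POLLHUP are implicit */
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc > 0 && revents != NULL)
        *revents = pfd.revents;

    close(fd);
    return rc;
}
```

With a negative timeout this call would be the blocking wait I am looking for, if only the proc filesystem woke it up when the process exits. In my tests it does not.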
Hence my question: has anyone implemented poll for procfs
the SysV way?