Help with 4.3 mod to kill uninteruptable procs.


Post by Lee Gat » Wed, 20 Feb 1991 09:19:41



        As a class project, I am working on a modification to the
BSD 4.3 source code to allow one to kill uninterruptible processes.

        It would seem that we are at a bit of a standstill.  Initially,
I thought that I could have the kernel raise the priority of the suspect
process in the psignal() call and set it running.  Because I would post
the kill signal before letting it run again, the process would release
the resources it was holding while asleep and exit gracefully.  The
others in my group have questioned this, and now I have begun to wonder
if it will work.

        Will the above method cause a race condition resulting from
the fact that the process probably assumes that the next time it runs
it will have the resource it was sleeping on?  And if so, I would
appreciate some other suggestions as to how to solve this problem.

        thanx
-- lee

------------randomly-chosen-drink/quote/simpsons'-quote---------------
"If you choose not to decide you still have made a choice."
                - Geddy Lee

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by John F Haugh » Thu, 21 Feb 1991 21:43:55




>|
>|   As a class project, I am working on a modification to the
>| BSD 4.3 source code to allow one to kill uninterruptible processes.

>A while back (~4-5 yrs) Chris Torek (I think) produced a nice little
>patch to the 4.3 kernel to kill groups of run away (and rapidly
>spawning) processes - this was the 'zonk' system call. You could probably
>gain an insight into your problem by looking at this. The catch is that
>I don't have access to the machine that I installed the patch on.

The problem is a bit more difficult than the normal * of a
process.  A process can be unkillable for any number of reasons
and just forcing it to exit can result in an unstable system.

Just for kicks, imagine some real slow device has been set up to
do a DMA transfer to some physical address that is held by the
process which is unkillable.  Imagine that you kill that process
and it exits.  Imagine the I/O completes and someone else's
memory gets trashed.  All that and more ...
--
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh

"I've never written a device driver, but I have written a device driver manual"
                -- Robert Hartman, IDE Corp.

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Neil To » Thu, 21 Feb 1991 02:14:28



|
|       As a class project, I am working on a modification to the
| BSD 4.3 source code to allow one to kill uninterruptible processes.

(rest deleted)

A while back (~4-5 yrs) Chris Torek (I think) produced a nice little
patch to the 4.3 kernel to kill groups of run away (and rapidly
spawning) processes - this was the 'zonk' system call. You could probably
gain an insight into your problem by looking at this. The catch is that
I don't have access to the machine that I installed the patch on.

Zonk was very useful, especially on student/teaching machines - one could
guarantee that some bright spark would experiment with self-spawning
processes, and zonk would kill all jobs owned by a particular UID stone dead.

Neil

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Lars Henrik Mathies » Fri, 22 Feb 1991 08:21:18


There are two types of ``unkillable'' processes in 4.3BSD. The first
are those that are sleeping at ``fast device priority'' when the
device hangs. The second are those that are in the kernel exit code
and get to sleep on last close of a device.

The first class you probably shouldn't mess with. If you're lucky,
removing the sleeping process will only result in the loss of some
buffer. In worse cases, you get permanently un-openable devices or
crashes. The real cure for these is to rewrite _each_case_ to sleep at
interruptible priority and clean up properly (more than a class
project, I think).

For the second class, the real problem is that the kill signal is
blocked inside the exit call. If you put some signal-catching code in
exit, you can delay this blocking until after the open files are
closed. This may allow some slow devices to be closed in an irregular
way, but that possibility already exists in the case of explicit close
calls.

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Y. Rock L » Fri, 22 Feb 1991 23:57:05



>Just for kicks, imagine some real slow device has been set up to
>do a DMA transfer to some physical address that is held by the
>process which is unkillable.  Imagine that you kill that process
>and it exits.  Imagine the I/O completes and someone else's
>memory gets trashed.  All that and more ...

Please excuse my ignorance of block devices (most of the time
I work on character/streams devices).

Forcibly awaking a process doing read/write (DMA transfer) will either
give the process a buffer with garbage data or throw away a buffer
containing valid data. How can this trash someone else's memory?

Y. Rock Lee, att!cblph!rock

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Y. Rock L » Sat, 23 Feb 1991 00:06:24



>The first class you probably shouldn't mess with. If you're lucky,
>removing the sleeping process will only result in the loss of some
>buffer. In worse cases, you get permanently un-openable devices or
>crashes. The real cure for these is to rewrite _each_case_ to sleep at
>interruptible priority and clean up properly (more than a class
>project, I think).

[this is a guess, not an argument]

The "permanently un-openable devices" can only happen in the case of open:
because the open never "completed", the close call in exit cannot do a
correct cleanup. Please correct me if I am missing something.

Y. Rock Lee, att!cblph!rock

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Y. Rock L » Sat, 23 Feb 1991 00:28:45



>    Will the above method cause a race condition resulting from
>the fact that the process probably assumes that the next time it runs
>it will have the resource it was sleeping on?  And if so, I would
>appreciate some other suggestions as to how to solve this problem.

Yes, the process will think it has the resource it was sleeping on.
But, it will be killed and release the resource during its exit
before it has a chance to "think". This part looks OK to me.
My only concern is that the driver of the particular device which
the process is waiting for may react crazily when it is misinformed
(a good driver should guard against this).

Y. Rock Lee, att!cblph!rock

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Chris Tor » Sat, 23 Feb 1991 00:45:32



>A while back (~4-5 yrs) Chris Torek (I think) produced a nice little
>patch to the 4.3 kernel to kill groups of run away (and rapidly
>spawning) processes - this was the 'zonk' system call.

``Not I,'' said the pig.  (Since I just ate half a dozen chocolate
chip cookies, I think I qualify. :-) )

Seriously: I never produced this particular bletcherous hack.  (I am
responsible for a number of other, different bletcherous hacks, but
not this one.)  If (A) you have SIGSTOP and (B) signals work correctly,
the super-user can stop everything, pick out the bad processes, kill
them, and then resume everything.  (This is a bit tricky to get right,
admittedly.)
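
A minimal sketch of the stop-then-kill sequence described above, applied to one forked child rather than to every process on the machine. The function name `stop_then_kill_child` and the child sitting in `pause()` as a stand-in for a bad process are assumptions made for illustration; none of this comes from the zonk patch.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that only waits for signals, stop it, confirm that it is
 * stopped, then kill it.  Returns the number of the signal that terminated
 * the child, or -1 on any error.
 */
int stop_then_kill_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();            /* stand-in for a misbehaving process */
    }

    /* Freeze the child first so it cannot fork or do anything else. */
    if (kill(pid, SIGSTOP) == -1)
        return -1;
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return -1;

    /* SIGKILL terminates the child even though it is stopped. */
    if (kill(pid, SIGKILL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) != pid || !WIFSIGNALED(status))
        return -1;

    return WTERMSIG(status);
}
```

Stopping before killing is what keeps a rapidly forking process from outrunning the kill. Doing this for every process at once, and resuming the innocent ones afterwards, is the part that is tricky to get right.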
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Dan Bernste » Sat, 23 Feb 1991 04:57:45



> If (A) you have SIGSTOP and (B) signals work correctly,
> the super-user can stop everything, pick out the bad processes, kill
> them, and then resume everything.  (This is a bit tricky to get right,
> admittedly.)

What's tricky about it?

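  #include <sys/time.h>
  #include <sys/resource.h>
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  /* invoke as, e.g., zonk /bin/csh csh -f; untested */
  int main(int argc, char *argv[], char *envp[])
  {
   /* argv[1] is the program to exec; argv[2] becomes its argv[0] */
   if (argc < 3) { fprintf(stderr,"zonk: usage: zonk path arg0 [args]\n"); exit(3); }
   if (getuid()) { fprintf(stderr,"zonk: fatal: uid not 0\n"); exit(1); }
   if (geteuid()) { fprintf(stderr,"zonk: fatal: euid not 0\n"); exit(2); }
   /* run at top priority so the runaway processes cannot starve us */
   if (setpriority(PRIO_PROCESS,0,-20))
     fprintf(stderr,"zonk: weird: can't set my priority to -20\n");
   /* pid -1: signal every process we are allowed to, other than ourselves;
      repeated, presumably to catch processes forked during an earlier pass */
   if (kill(-1,SIGSTOP) == -1) perror("zonk: warning: first kill failed");
   if (kill(-1,SIGSTOP) == -1) perror("zonk: warning: second kill failed");
   if (kill(-1,SIGSTOP) == -1) perror("zonk: warning: good-luck kill failed");
   /* with everything stopped, hand control to the rescue program */
   for (;;)
    {
     (void) execve(argv[1],argv + 2,envp);
     perror("zonk: critical: exec failed, will try again");
     sleep(60);
    }
  }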

---Dan

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Pat Barr » Sat, 23 Feb 1991 14:27:21




>>        Will the above method cause a race condition resulting from
>>the fact that the process probably assumes that the next time it runs
>>it will have the resource it was sleeping on?  And if so, I would
>>appreciate some other suggestions as to how to solve this problem.

>Yes, the process will think it has the resource it was sleeping on.

Uhh, nope.  When a particular event occurs, *all* processes waiting on
that event are awakened.  By the time you run again, someone else may
have snarfed up the resource you were waiting for.

This has been the case forever (well, at least since V7) - when you come
out of a sleep(), you *must* check that the reason you were sleeping is
no longer true....
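
The same rule survives in user space. Here is a small sketch of it with POSIX threads, assuming a hypothetical `struct resource` guarded by a mutex and a condition variable. The names are invented for the example, and it is an analogy to the kernel's sleep()/wakeup(), not kernel code.

```c
#include <pthread.h>

struct resource {
    pthread_mutex_t lock;
    pthread_cond_t  freed;   /* signalled whenever busy drops to 0 */
    int             busy;    /* nonzero while someone owns the resource */
};

void resource_acquire(struct resource *r)
{
    pthread_mutex_lock(&r->lock);
    /*
     * "while", not "if": every waiter is woken by the broadcast below, so
     * by the time this thread runs again another waiter may already have
     * taken the resource.  Re-test the condition after every wakeup.
     */
    while (r->busy)
        pthread_cond_wait(&r->freed, &r->lock);
    r->busy = 1;
    pthread_mutex_unlock(&r->lock);
}

void resource_release(struct resource *r)
{
    pthread_mutex_lock(&r->lock);
    r->busy = 0;
    pthread_cond_broadcast(&r->freed);   /* wakes all waiters, like wakeup() */
    pthread_mutex_unlock(&r->lock);
}
```

A resource can be set up statically with `struct resource r = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };`.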

--Pat.

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by John F Haugh » Sun, 24 Feb 1991 05:53:42



>Yes, the process will think it has the resource it was sleeping on.
>But, it will be killed and release the resource during its exit
>before it has a chance to "think". This part looks OK to me.
>My only concern is that the driver of the particular device which
>the process is waiting for may react crazily when it is misinformed
>(a good driver should guard against this).

This just isn't true.

A typical sleep loop looks something like

        while (some_status & some_busy_flag)
                sleep (&some_status, PRI_O_MINE);

        some_status |= some_busy_flag;

If your only concern is getting this process to ignore the setting
of "some_busy_flag", you might be doing the right thing - but
remember - "some_status" still has the "some_busy_flag" set.  Killing
the process will not get that bit clear and if that bit being set
is what is* the process, the next process to enter that loop
is also going to hang.

What is needed is an exception routine that understands =exactly=
what to do to reset the resource to some well-defined state for any
possible state the resource may be in.
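
To make the stuck-flag problem concrete, here is a toy user-space model, not kernel code. The names `toy_resource`, `toy_try_acquire`, and `toy_reset_on_kill` are made up for the sketch, and a real exception routine would also have to put the hardware back into a sane state and wake any sleepers.

```c
#define RES_BUSY 0x01

struct toy_resource {
    int status;     /* RES_BUSY is set while someone owns the resource */
    int owner;      /* pid of the current owner, 0 if none */
};

/*
 * Non-blocking stand-in for the sleep loop: returns 1 if the caller now
 * owns the resource, 0 if a real kernel would have put it to sleep.
 */
int toy_try_acquire(struct toy_resource *r, int pid)
{
    if (r->status & RES_BUSY)
        return 0;
    r->status |= RES_BUSY;
    r->owner = pid;
    return 1;
}

/*
 * Exception routine for a forcibly removed process: if the dead process
 * owned the resource, return it to a well-defined free state.  Without
 * this step the busy bit stays set forever.
 */
void toy_reset_on_kill(struct toy_resource *r, int pid)
{
    if ((r->status & RES_BUSY) && r->owner == pid) {
        r->status &= ~RES_BUSY;
        r->owner = 0;
    }
}
```

If process 100 acquires the resource and is then removed without `toy_reset_on_kill()` being called, every later `toy_try_acquire()` fails, which is the "next process to enter that loop is also going to hang" case.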
--
John F. Haugh II      |      I've Been Moved     |    MaBellNet: (512) 838-4340
SneakerNet: 809/1D064 |          AGAIN !         |      VNET: LCCB386 at AUSVMQ
BangNet: ..!cs.utexas.edu!ibmchs!auschs!snowball.austin.ibm.com!jfh (e-i-e-i-o)

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by John F Haugh » Sat, 23 Feb 1991 13:50:38



>Forcibly awaking a process doing read/write (DMA transfer) will either
>give the process a buffer with garbage data or throw away a buffer
>containing valid data. How can this trash someone else's memory?

DMA addresses typically refer to physical memory.  The process requesting
the DMA transfer normally is locked in memory before the transfer is
requested so that the physical address the controller was told to send
the data to will remain valid.  If the process dies and the physical
memory is reallocated (or page out or swap out or ... occurs), that
physical address will be allocated to some other process which isn't
expecting to have your DMA transfer sent its way.
--
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh

"I've never written a device driver, but I have written a device driver manual"
                -- Robert Hartman, IDE Corp.
 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Y. Rock L » Sun, 24 Feb 1991 11:59:58




>>Yes, the process will think it has the resource it was sleeping on.
>>But, it will be killed and release the resource during its exit
>>before it has a chance to "think". This part looks OK to me.

>A typical sleep loop looks something like

>    while (some_status & some_busy_flag)
>            sleep (&some_status, PRI_O_MINE);

>    some_status |= some_busy_flag;

This was what I had in mind (it turned out to be wrong when I double-checked):

        A signal puts this sleeping process back into the run queue (its
        priority has been set to higher than PZERO). sleep doesn't return;
        it does a longjmp back to syscall. Before the system call returns,
        it checks if there is a signal. There is. So, it handles the signal
        and exits (no signal handling routine set).

The catch is that the process went to sleep before we changed its priority.
In this case sleep takes a different route and does a simple return.
Therefore, we will continue executing the driver code, which may be dangerous!
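
A toy model of the two routes, using setjmp/longjmp in user space. Everything here (`toy_proc`, `toy_sleep_wakeup`, `toy_syscall`) is hypothetical and only mimics the shape of the logic described above; the PZERO value is an assumption as well.

```c
#include <setjmp.h>

#define PZERO 25    /* assumed value; larger numbers mean lower priority */

struct toy_proc {
    int     sleep_pri;      /* priority that was passed to sleep() */
    int     sig_pending;    /* nonzero once a signal has been posted */
    jmp_buf qsave;          /* where an interrupted system call resumes */
};

enum outcome { SYSCALL_INTERRUPTED, DRIVER_CONTINUED };

/* What happens when the sleeping process is made runnable again. */
static void toy_sleep_wakeup(struct toy_proc *p)
{
    /*
     * The decision uses the priority given when the process went to
     * sleep.  Raising the process's priority afterwards does not change
     * which route is taken.
     */
    if (p->sleep_pri > PZERO && p->sig_pending)
        longjmp(p->qsave, 1);   /* interruptible: abandon the system call */
    /* uninterruptible: plain return into the caller, i.e. the driver */
}

enum outcome toy_syscall(struct toy_proc *p)
{
    if (setjmp(p->qsave))
        return SYSCALL_INTERRUPTED;  /* the signal gets handled from here */
    toy_sleep_wakeup(p);
    return DRIVER_CONTINUED;         /* driver code carries on regardless */
}
```

With `sleep_pri` above PZERO and a signal pending, `toy_syscall()` reports `SYSCALL_INTERRUPTED`; with `sleep_pri` at or below PZERO it reports `DRIVER_CONTINUED` even though the signal is pending.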

>What is needed is an exception routine that understands =exactly=
>what to do to reset the resource to some well-defined state for any
>possible state the resource may be in.

That's the reason why system priority is chosen to begin with.
So, don't mess with it IF you can convince your professor not to do
this project, :-)  But, I guess, it is OK to do "experiment" in school.

On the other hand, this utility can be very useful. If a process is*
but cannot be killed (sleeping at uninterruptible priority), you have two
ways to get rid of it: use this utility or reboot the system. That is,
this utility can be useful, but is DANGEROUS in general!

Y. Rock Lee, att!cblph!rock

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Stanley Fries » Sun, 24 Feb 1991 04:57:36



>Forcibly awaking a process doing read/write (DMA transfer) will either
>give the process a buffer with garbage data or throw away a buffer
>containing valid data. How can this trash someone else's memory?

Scenario - forcibly wake-up a process waiting on a slow device,
        process mucks about with unfilled buffer, and *releases* it.
        process probably dies due to the signal that woke it up.
        another process acquires the buffer from the free pool
                (Perhaps intending to use it for paging)
        slow device finally finishes, putting results into buffer.
        new process returns the buffer or (worse) uses it as a new page.

        VOILA - corrupted stuff in the new process.
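
One possible guard against this scenario, sketched as a toy buffer pool in user space. It is an illustration only, not a description of what 4.3BSD does: the names (`toy_buf`, `buf_alloc`, `buf_start_io`, `buf_release`, `buf_io_done`) are invented, and the "device" is just a function call.

```c
#include <stddef.h>

#define NBUF 4

struct toy_buf {
    int  in_use;      /* currently allocated to some owner */
    int  io_pending;  /* a device has been told to write into this buffer */
    int  owner;       /* pid of the owner, 0 if none */
    char data;        /* one byte standing in for the buffer contents */
};

static struct toy_buf pool[NBUF];

/* Hand out a buffer that is neither owned nor the target of pending I/O. */
struct toy_buf *buf_alloc(int owner)
{
    for (int i = 0; i < NBUF; i++) {
        if (!pool[i].in_use && !pool[i].io_pending) {
            pool[i].in_use = 1;
            pool[i].owner = owner;
            pool[i].data = 0;
            return &pool[i];
        }
    }
    return NULL;
}

void buf_start_io(struct toy_buf *b) { b->io_pending = 1; }

/* The owner lets go; the buffer stays out of circulation while I/O is pending. */
void buf_release(struct toy_buf *b)
{
    b->in_use = 0;
    b->owner = 0;
}

/* Device completion: deliver the data only if an owner is still there. */
void buf_io_done(struct toy_buf *b, char value)
{
    if (b->in_use)
        b->data = value;
    b->io_pending = 0;      /* only now can the buffer be reallocated */
}
```

The point of the `io_pending` check in `buf_alloc()` is that a buffer released by a dying process cannot reach a new owner until the slow device has finished with it.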
--
---------------
uunet!tdatirv!sarima                            (Stanley Friesen)

 
 
 

Help with 4.3 mod to kill uninteruptable procs.

Post by Lee Gat » Sun, 24 Feb 1991 16:55:44




>This was what I had in mind (it turned out to be wrong when I double-checked):

>    A signal puts this sleeping process back into the run queue (its
>    priority has been set to higher than PZERO). sleep doesn't return;
>    it does a longjmp back to syscall. Before the system call returns,
>    it checks if there is a signal. There is. So, it handles the signal
>    and exits (no signal handling routine set).

>The catch is that the process went to sleep before we changed its priority.
>In this case sleep takes a different route and does a simple return.
>Therefore, we will continue executing the driver code, which may be dangerous!

>>What is needed is an exception routine that understands =exactly=
>>what to do to reset the resource to some well-defined state for any
>>possible state the resource may be in.

>That's the reason why system priority is chosen to begin with.
>So, don't mess with it IF you can convince your professor not to do
>this project, :-)  But, I guess, it is OK to do "experiment" in school.

>On the other hand, this utility can be very useful. If a process is*
>but cannot be killed (sleeping at uninterruptible priority), you have two
>ways to get rid of it: use this utility or reboot the system. That is,
>this utility can be useful, but is DANGEROUS in general!

        Since I posted, I have discussed it with my professor, and we have narrowed
the field considerably.  The only check the modification will do is to see
if a serial device driver is locked.  If so, it will release the serial
device, and kill the process.  Otherwise, it will have a double check
to see if it really should kill the proc, then actually kill it.

        I feel I understand everything much better now; all I have to do
is figure out where in the code to look...  Thanx to all for your help!

        lee

 
 
 

1. more help with 4.3 mod (more questions from lee)

-------
For those who missed the earlier discussion:

For my class, our group project is to make a mod to 4.3BSD to be
able to kill a process (any process) and release its resources.

From my last posting, a few have pointed out that I was unclear,
probably because I am floundering in kernel terminology.

Anyways, the project has narrowed once again, and I wanted to bounce
it off the net.more.knowing.than.i.persons....  

Project:  Create a program that will take two arguments

        1.      Process ID to kill.  (discussed earlier)
        2.      Device to free up.   (e.g. /dev/printer)

The only resources we are really concerned about freeing up are the
serial devices, as the printers sometimes hang uninterruptably, and
the tape backup drive.

After reading a few of the responses I got from earlier postings, it
seemed that I could use the cyreset() syscall to reset the device
driver and free up the (serial) device.  Is this correct?  Other
responses I got said no, so there is some confusion here.

I was thinking that I could kill the process, then make the cyreset()
call to free up the resource, but then realized the kernel may query
the driver to see if it will accept the request.  Help me here, I
don't really have enough info to speculate.

I guess what it boils down to is that I can kill it, but I just
need to know how to reset the resource (given the filesystem name)
so that other processes can use it.

thanx to all who have helped and will help.

-- lee

------------randomly-chosen-drink/quote/simpsons'-quote---------------
"Give a hoot.  Read a book."
                -- Krusty in _Krusty_Gets_Busted_
