Process in "uninterruptible sleep"?

Post by Sami Lai » Wed, 13 Nov 1996 04:00:00



I've been watching one instance of sendmail for over a week now. ps shows
its status as 'D' and the manual page tells me that the process is in
"uninterruptible sleep".

I figured out (very quickly and easily) that this process can't be killed
with kill or any other normal signaling utility. The problem is that one of
the disks is locked by this process, so there doesn't seem to be any way to
unmount all volumes cleanly.

Does anyone know a more sophisticated method to kill this process, or is
the reset switch the only option left?

The following is from the process's status file under /proc:
Name:   sendmail
State:  D (disk sleep)
Pid:    26089
PPid:   1
Uid:    0       0       0       0
Gid:    0       0       0       0
VmSize:     1312 kB
VmLck:         0 kB
VmRSS:       800 kB
VmData:      344 kB
VmStk:        28 kB
VmExe:         0 kB
VmLib:         0 kB
SigPnd: 0000eff9
SigBlk: 00000000
SigIgn: 80001006
SigCgt: 00006201
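
The signal masks in this dump can be decoded by hand. A minimal sketch, assuming the usual Linux convention that bit N-1 of a mask stands for signal N; decode_sigmask and signal_name are names invented for this illustration, and signal names vary by platform:

```python
import signal

def decode_sigmask(hex_mask):
    """Return the signal numbers whose bits are set in a /proc status mask.

    Assumption: the least significant bit is signal 1, the next is
    signal 2, and so on.
    """
    mask = int(hex_mask, 16)
    return [bit + 1 for bit in range(mask.bit_length()) if mask & (1 << bit)]

def signal_name(number):
    """Best-effort name lookup; falls back to the bare number as a string."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return str(number)

# SigPnd value copied from the dump above.
pending = decode_sigmask("0000eff9")
print(pending)
print([signal_name(n) for n in pending])
```

For the SigPnd value above, the decoded list includes signals 9 and 15, which on Linux are SIGKILL and SIGTERM. That suggests the kill attempts were queued but could not be delivered while the process sleeps.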

Thanks.

--

PGP Public key available from http://www.iki.fi/lane/pubkey.asc.

 
 
 

Post by Raul SILVE » Wed, 13 Nov 1996 04:00:00


: I've been watching one instance of sendmail for over a week now. ps shows
: its status as 'D' and the manual page tells me that the process is in
: "uninterruptible sleep".

'D' status usually means that the process is waiting for a filesystem
operation to finish. What you have to do is let the operation finish
by removing the conflicting situation, and the process should then
leave the 'D' state.

For example, if you mount a floppy, and you remove it without
unmounting it while one process is reading the floppy, that process
will probably go to 'D' state until you put the floppy back in the
drive. I guess that if you use hard NFS mounts and your NFS server
dies, your processes will go to 'D' state too.

Maybe strace can give you some hints about what the problem is in
your particular case.
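
To see which processes are currently in 'D' state, the status files under /proc can be scanned. A rough sketch, assuming a Linux-style /proc whose status files contain "Name:" and "State:" lines like those in the original post; parse_status and find_uninterruptible are hypothetical helper names:

```python
import os

def parse_status(text):
    """Turn the 'Key: value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def find_uninterruptible(proc_root="/proc"):
    """Yield (pid, name) for processes whose State field starts with 'D'."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                fields = parse_status(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        if fields.get("State", "").startswith("D"):
            yield int(entry), fields.get("Name", "?")

if __name__ == "__main__":
    # Only meaningful where /proc exists (assumed: Linux).
    if os.path.isdir("/proc"):
        for pid, name in find_uninterruptible():
            print(pid, name)
```

This only reports the state; it does not say what the process is waiting on.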

My .02$

_____________________________________________________________________________
Raul Silvera M.   | M.Sc. Student at  |See my web page:          

-----------------------------------------------------------------------------

 
 
 

Post by James Youngm » Thu, 14 Nov 1996 04:00:00



>I figured out (very quickly and easily) that this process can't be killed
>with kill or any other normal signaling utility. The problem is that one of
>the disks is locked by this process, so there doesn't seem to be any way to
>unmount all volumes cleanly.

>Does anyone know a more sophisticated method to kill this process, or is
>the reset switch the only option left?

To preserve data, you can remount that disk as read-only without actually
unmounting it.
        mount /dev/foo -o remount,ro
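
To confirm that a remount actually took effect, the mount table can be checked afterwards. A small sketch, assuming the system exposes a /proc/mounts-style table with whitespace-separated fields (device, mount point, type, options); mount_options and is_read_only are names made up here, and /dev/foo is the placeholder device from the command above:

```python
def mount_options(mounts_text, device):
    """Return the option list for `device` from /proc/mounts-style text, or None."""
    found = None
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == device:
            found = parts[3].split(",")  # a later entry overrides an earlier one
    return found

def is_read_only(mounts_text, device):
    """True only if the device is listed and carries the 'ro' option."""
    options = mount_options(mounts_text, device)
    return options is not None and "ro" in options

# Illustrative table contents, not taken from a real system.
sample = "/dev/foo /mnt ext2 ro 0 0\n/dev/bar /home ext2 rw 0 0\n"
print(is_read_only(sample, "/dev/foo"))
print(is_read_only(sample, "/dev/bar"))
```

On a live system the text would come from reading /proc/mounts rather than from a literal string.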

--
James Youngman       VG Gas Analysis Systems |The trouble with the rat-race
 Before sending advertising material, read   |is, even if you win, you're
http://www.law.cornell.edu/uscode/47/227.html|still a rat.

 
 
 

Post by Albert D. Cahal » Thu, 14 Nov 1996 04:00:00




>> I've been watching one instance of sendmail for over a week now.
>> ps shows its status as 'D' and the manual page tells me that the
>> process is in "uninterruptible sleep".

> 'D' status usually means that the process is waiting for
> a filesystem operation to finish. What you have to do, is
> to let the operation finish, by removing the conflicting
> situation, and the process should go out of 'D' state.

> For example, if you mount a floppy, and you remove it without
> unmounting it while one process is reading the floppy, that
> process will probably go to 'D' state until you put the floppy
> back in the  drive. I guess that if you use hard NFS mounts and
> your NFS server dies, your processes will go to 'D' state too.

Ok, what do I do in this case:

There is a vital server or long-running calculation on the machine,
so I can't just reboot. Some luser creates a program that mallocs
and writes to a huge amount of memory, then writes to the floppy.
Then the luser yanks out the floppy and shreds it.

Now 90% of the virtual memory is locked in a process I can't kill.
I can't reboot either. All I can do is add more swap.

As appropriate, substitute "tripped over the Zip drive power cord
and broke the drive", "pried open CD-ROM drive to steal the CD-ROM",
or "yanked QIC tape out to run for lunch".
--
--
Albert Cahalan
acahalan at cs.uml.edu (no junk mail please - I will hunt you down)

 
 
 

Post by G Sumner Haye » Fri, 15 Nov 1996 04:00:00



> Ok, what do I do in this case:

> There is a vital server or long-running calculation on the machine,
> so I can't just reboot. Some luser creates a program that mallocs
> and writes to a huge amount of memory, then writes to the floppy.
> Then the luser yanks out the floppy and shreds it.

> Now 90% of the virtual memory is locked in a process I can't kill.
> I can't reboot either. All I can do is add more swap.

What happens if you stick a random floppy disk in the drive?
Preferably one that contains no important data, of course.

I don't know what will happen; it's simply a suggestion.

I've thought about it for a while and I'm not sure whether this is a
kernel problem or not; after all, it's impossible to guarantee
anything if you don't have physical security.  It would be nice to
have a workaround, but if it's going to negatively impact other parts
of the system it's probably not worth it, IMO.  The user could just
pull the machine's power cord or hit it hard with a hammer, after all.
But I could be convinced otherwise...

Cordially,

  Sumner

Please don't CC: postings to me, my mailbox is already full enough.

 
 
 

Post by Martin Hedback EHS/LI ????? 66 » Fri, 15 Nov 1996 04:00:00





> >> I've been watching one instance of sendmail for over a week now.
> >> ps shows its status as 'D' and the manual page tells me that the
> >> process is in "uninterruptible sleep".

The process is obviously in kernel mode, waiting for a system call to
return. In kernel mode each process is given a priority. When a process
makes a critical system call, i.e. certain I/O calls, the kernel assigns
the process a priority above 10. A process sleeping at a priority above
10 cannot be interrupted by any signal. You can use adb on /dev/kmem, or
ps, to check the priority of the process. To avoid deadlocks the device
driver should time out and return; maybe this is not true in your
case.

> > 'D' status usually means that the process is waiting for
> > a filesystem operation to finish. What you have to do, is
> > to let the operation finish, by removing the conflicting
> > situation, and the process should go out of 'D' state.

> > For example, if you mount a floppy, and you remove it without
> > unmounting it while one process is reading the floppy, that
> > process will probably go to 'D' state until you put the floppy
> > back in the  drive. I guess that if you use hard NFS mounts and
> > your NFS server dies, your processes will go to 'D' state too.

> Ok, what do I do in this case:

> There is a vital server or long-running calculation on the machine,
> so I can't just reboot. Some luser creates a program that mallocs
> and writes to a huge amount of memory, then writes to the floppy.
> Then the luser yanks out the floppy and shreds it.

> Now 90% of the virtual memory is locked in a process I can't kill.
> I can't reboot either. All I can do is add more swap.

> As appropriate, substitute "tripped over the Zip drive power cord
> and broke the drive", "pried open CD-ROM drive to steal the CD-ROM",
> or "yanked QIC tape out to run for lunch".
> --
> --
> Albert Cahalan
> acahalan at cs.uml.edu (no junk mail please - I will hunt you down)

 
 
 

Post by Ingo Molna » Fri, 15 Nov 1996 04:00:00



: > For example, if you mount a floppy, and you remove it without
: > unmounting it while one process is reading the floppy, that
: > process will probably go to 'D' state until you put the floppy
: > back in the  drive. I guess that if you use hard NFS mounts and
: > your NFS server dies, your processes will go to 'D' state too.

: Ok, what do I do in this case:

: There is a vital server or long-running calculation on the machine,
: so I can't just reboot. Some luser creates a program that mallocs
: and writes to a huge amount of memory, then writes to the floppy.
: Then the luser yanks out the floppy and shreds it.

: Now 90% of the virtual memory is locked in a process I can't kill.
: I can't reboot either. All I can do is add more swap.

: As appropriate, substitute "tripped over the Zip drive power cord
: and broke the drive", "pried open CD-ROM drive to steal the CD-ROM",
: or "yanked QIC tape out to run for lunch".

In the case where the kernel cannot know what's going on in the Real
World, it simply cannot kill that process. If you can be sure that
a particular device is dead after a well-defined timeout, then you
can go tell the device driver writer to put in some watchdog timer
to clean stuff up. This is a device-driver issue, not a
kernel-signal-handling issue.
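
The watchdog idea can be illustrated outside the kernel. This is only a user-space analogy in Python, not driver code: a wait that gives up after a fixed timeout instead of sleeping forever. DeviceTimeout and wait_for_device are hypothetical names, and the timeouts are arbitrary:

```python
import threading

class DeviceTimeout(Exception):
    """Raised when the simulated device does not answer in time."""

def wait_for_device(done, timeout_s):
    """Block until `done` is set, but give up after `timeout_s` seconds.

    Event.wait returns False on timeout, which plays the role of the
    watchdog firing.
    """
    if not done.wait(timeout_s):
        raise DeviceTimeout("no response within %.2f s" % timeout_s)
    return True

# A device that answers: a timer sets the event after 0.05 s.
responsive = threading.Event()
threading.Timer(0.05, responsive.set).start()
print(wait_for_device(responsive, timeout_s=1.0))

# A device that never answers (say, the floppy was pulled).
dead = threading.Event()
try:
    wait_for_device(dead, timeout_s=0.1)
except DeviceTimeout as error:
    print("cleaning up:", error)
```

The hard part in a real driver is the "well-defined timeout": the caller has to know how long is long enough before declaring the device dead.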

-- mingo

 
 
 

Post by Albert D. Cahal » Fri, 15 Nov 1996 04:00:00



>> Ok, what do I do in this case:

>> There is a vital server or long-running calculation on the machine,
>> so I can't just reboot. Some luser creates a program that mallocs
>> and writes to a huge amount of memory, then writes to the floppy.
>> Then the luser yanks out the floppy and shreds it.

>> Now 90% of the virtual memory is locked in a process I can't kill.
>> I can't reboot either. All I can do is add more swap.
> I've thought about it for a while and I'm not sure whether this is a
> kernel problem or not; after all, it's impossible to guarantee
> anything if you don't have physical security.  It would be nice to
> have a workaround, but if it's going to negatively impact other parts
> of the system it's probably not worth it, IMO.  The user could just
> pull the machine's power cord or hit it hard with a hammer, after all.
> But I could be convinced otherwise...

It does not matter if I have physical security. Maybe the computer
is in a sealed room with a raised floor, halon fire extinguishers,
air conditioning, alarm system... and a disk crashes. That could
leave a process that can never be killed.

kill -DIE_DIE_DIE 24951      :-)

At the very least, the process should be completely removed from the
scheduler, have all files closed, have all the memory deallocated...
Maybe a few pages would still be stuck, but at least most memory would
be recovered and a clean shutdown would be possible.
--
--
Albert Cahalan
acahalan at cs.uml.edu (no junk mail please - I will hunt you down)

 
 
 

Post by William Burr » Sat, 16 Nov 1996 04:00:00


: For example, if you mount a floppy, and you remove it without
: unmounting it while one process is reading the floppy, that process
: will probably go to 'D' state until you put the floppy back in the
: drive.

No, the process will stay in the D state FOREVER (or until the server
goes down, whichever comes first, YMMV, etc.).  Do I have to shout this
from the rooftops or what?

(It appears that the floppy is handled by interrupts only, so don't
expect the driver to do magic by itself.)

--
William Burrow  --  Fredericton Area Network, New Brunswick, Canada
Copyright 1996 William Burrow  
Linux floppy disk handling sucks.  

 
 
 

Post by Ingo Molna » Sat, 16 Nov 1996 04:00:00



: It does not matter if I have physical security. Maybe the computer
: is in a sealed room with a raised floor, halon fire extinguishers,
: air conditioning, alarm system... and a disk crashes. That could
: leave a process that can never be killed.

: kill -DIE_DIE_DIE 24951      :-)

: At the very least, the process should be completely removed from the
: scheduler, have all files closed, have all the memory deallocated...
: Maybe a few pages would still be stuck, but at least most memory would
: be recovered and a clean shutdown would be possible.

I don't think this could be done cleanly. You could disturb driver-internal
counters/locks/status variables. It is really the driver's responsibility
to allocate stuff and to free it up. There is a >reason< why the driver
used TASK_UNINTERRUPTIBLE. The reason is atomicity with scheduling.

If you want to have these problems fixed, then do it at the level of the
device driver: find out whether the ZIP driver should use a watchdog timer
to monitor status port access. Many of the networking cards do so, for
example, probably because pulling the network cable out is much more
common than prying a CD-ROM drive open :).

IMHO, it's a device driver problem, not a kernel problem.

On one point you might be right: TASK_INTERRUPTIBLE could be device-specific,
so that a process having open files on one device couldn't block a clean shutdown.
This is the price we pay for the slightly incorrect device driver, I think.

-- mingo

 
 
 

Post by Jay R. Ashwor » Mon, 18 Nov 1996 04:00:00



: Ok, what do I do in this case:
: There is a vital server or long-running calculation on the machine,
: so I can't just reboot. Some luser creates a program that mallocs
: and writes to a huge amount of memory, then writes to the floppy.
: Then the luser yanks out the floppy and shreds it.
: Now 90% of the virtual memory is locked in a process I can't kill.
: I can't reboot either. All I can do is add more swap.
: As appropriate, substitute "tripped over the Zip drive power cord
: and broke the drive", "pried open CD-ROM drive to steal the CD-ROM",
: or "yanked QIC tape out to run for lunch".

Yup.  There ought to be some interactive, administrator-driven command
to kill a process sleeping on a fast device (i.e., at a priority less
than (or greater than, I don't remember anymore) PZERO).

Gurus: why not?

Cheers,
-- jr 'kill -666' a
--

Member of the Technical Staff                    Junk Mail Will Be Billed For.
The Suncoast Freenet      *FLASH: Craig Shergold aw'better; call 800-215-1333*
Tampa Bay, Florida    http://members.aol.com/kyop/rhps.html    +1 813 790 7592

 
 
 

Post by Ray Auchterloun » Tue, 19 Nov 1996 04:00:00




[...]
>>Does anyone know a more sophisticated method to kill this process, or is
>>the reset switch the only option left?
>To preserve data, you can remount that disk as read-only without actually
>unmounting it.
>    mount /dev/foo -o remount,ro

I've tried this when in this situation and it fails, complaining that the
device is busy. I ended up hitting the reset switch.

ray

--

         "Forty Two! Is that all you've got to show for
          seven and a half million years' work?"

 
 
 

1. Killing a process in "uninterruptible sleep"

A segmentation fault occurred while trying to mount a Mac (HFS) formatted Zip
disk on a PC running Linux 2.0.33. Even though it did not successfully mount,
I tried to umount it before removing the disk. The umount process hung and ps
shows the status of the process is "uninterruptible sleep".

Now I can't do a soft reboot because reboot seems to want all processes killed
and even kill -9 as root will not kill the umount process. How can I recover
from this situation so that I can reboot gracefully? Or better yet, how do I
return to a normal situation where I can eject the disk? As it stands there is
no way to manually remove the Zip disk, and the software eject method does not
seem to work either. (This is why I am trying to reboot.) If I am forced to
power down abruptly I will probably have to run fsck manually, something
I'd prefer not to have to do.

Any suggestions?

Thanks,
Alan
