2.5.70-mm3 - Oops and hang

2.5.70-mm3 - Oops and hang

Post by John Stoffe » Wed, 11 Jun 2003 15:20:20



Here's an oops and hang from /var/log/messages that happened the other
evening.  It kicked in at 4am or so.  This was running 2.5.70-mm3 SMP,
PREEMPT, no ACPI, RAID1 on one pair of disks, not root or /boot.

Here's the messages I got in the messages file.  The system was
completely hung and needed to be reset to recover:

Jun  9 04:02:12 jfsnew kernel: Unable to handle kernel paging request at virtual
 address 6b6b6b6b
Jun  9 04:02:12 jfsnew kernel:  printing eip:
Jun  9 04:02:12 jfsnew kernel: c0133477
Jun  9 04:02:12 jfsnew kernel: *pde = 00000000
Jun  9 04:02:12 jfsnew kernel: Oops: 0000 [#1]
Jun  9 04:02:12 jfsnew kernel: PREEMPTSMP DEBUG_PAGEALLOC
Jun  9 04:02:12 jfsnew kernel: CPU:    1
Jun  9 04:02:12 jfsnew kernel: EIP:    0060:[detach_pid+23/304]    Not tainted V
LI
Jun  9 04:02:12 jfsnew kernel: EIP:    0060:[<c0133477>]    Not tainted VLI
Jun  9 04:02:12 jfsnew kernel: EFLAGS: 00010046
Jun  9 04:02:12 jfsnew kernel: EIP is at detach_pid+0x17/0x130
Jun  9 04:02:12 jfsnew kernel: eax: dfd30050   ebx: 6b6b6b6b   ecx: dfd30100   e
dx: 6b6b6b6b
Jun  9 04:02:12 jfsnew kernel: esi: e3cc6000   edi: 00000000   ebp: 00000000   e
sp: e3cc7f08
Jun  9 04:02:12 jfsnew kernel: ds: 007b   es: 007b   ss: 0068
Jun  9 04:02:12 jfsnew kernel: Process makewhatis (pid: 2446, threadinfo=e3cc600
0 task=e4117000)
Jun  9 04:02:12 jfsnew kernel: Stack: dfd30000 e3cc6000 00000000 c0123a79 dfd300
00 c0123bb3 dfd30000 dfd30000
Jun  9 04:02:12 jfsnew kernel:        dfd305c4 dfd30000 00000a07 bffff5c8 c01258
cd dfd30000 ea854a74 bffff350
Jun  9 04:02:12 jfsnew kernel:        dfd300a4 dfd30000 e4117000 00000000 c0125c
75 dfd30000 bffff5c8 00000000
Jun  9 04:02:12 jfsnew kernel: Call Trace:
Jun  9 04:02:12 jfsnew kernel:  [__unhash_process+57/176] __unhash_process+0x39/
0xb0
Jun  9 04:02:12 jfsnew kernel:  [<c0123a79>] __unhash_process+0x39/0xb0
Jun  9 04:02:12 jfsnew kernel:  [release_task+195/560] release_task+0xc3/0x230
Jun  9 04:02:12 jfsnew kernel:  [<c0123bb3>] release_task+0xc3/0x230
Jun  9 04:02:12 jfsnew kernel:  [wait_task_zombie+397/432] wait_task_zombie+0x18
d/0x1b0
Jun  9 04:02:12 jfsnew kernel:  [<c01258cd>] wait_task_zombie+0x18d/0x1b0
Jun  9 04:02:12 jfsnew kernel:  [sys_wait4+357/640] sys_wait4+0x165/0x280
Jun  9 04:02:12 jfsnew kernel:  [<c0125c75>] sys_wait4+0x165/0x280
Jun  9 04:02:12 jfsnew kernel:  [default_wake_function+0/32] default_wake_functi
on+0x0/0x20
Jun  9 04:02:12 jfsnew kernel:  [<c011da40>] default_wake_function+0x0/0x20
Jun  9 04:02:13 jfsnew kernel:  [default_wake_function+0/32] default_wake_functi
on+0x0/0x20
Jun  9 04:02:13 jfsnew kernel:  [<c011da40>] default_wake_function+0x0/0x20
Jun  9 04:02:13 jfsnew kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
Jun  9 04:02:13 jfsnew kernel:  [<c010af1f>] syscall_call+0x7/0xb
Jun  9 04:02:13 jfsnew kernel:
Jun  9 04:02:13 jfsnew kernel: Code: 51 08 52 e8 8c cd fe ff 58 5b c3 89 f6 8d b
c 27 00 00 00 00 57 56 53 89 d3 8d 14 9b 8d 04 d0 8d 88 b0 00 00 00 8b 59 08 8b
51 04 <39> 0a 74 08 0f 0b 8c 00 c8 8f 3a c0 8b 80 b0 00 00 00 39 48 04
Jun  9 04:02:13 jfsnew kernel:  <6>note: makewhatis[2446] exited with preempt_co
unt 2
Jun  9 04:02:13 jfsnew kernel: Debug: sleeping function called from illegal cont
ext at include/linux/rwsem.h:43
Jun  9 04:02:13 jfsnew kernel: Call Trace:
Jun  9 04:02:13 jfsnew kernel:  [__might_sleep+99/112] __might_sleep+0x63/0x70
Jun  9 04:02:13 jfsnew kernel:  [do_acct_process+381/656] do_acct_process+0x17d/
0x290
Jun  9 04:02:13 jfsnew kernel:  [<c013c0dd>] do_acct_process+0x17d/0x290
Jun  9 04:02:13 jfsnew kernel:  [__wake_up_common+64/96] __wake_up_common+0x40/0
x60
Jun  9 04:02:13 jfsnew kernel:  [<c011daa0>] __wake_up_common+0x40/0x60
Jun  9 04:02:13 jfsnew kernel:  [acct_process+133/223] acct_process+0x85/0xdf
Jun  9 04:02:13 jfsnew kernel:  [<c013c275>] acct_process+0x85/0xdf
Jun  9 04:02:13 jfsnew kernel:  [do_exit+187/1408] do_exit+0xbb/0x580
Jun  9 04:02:13 jfsnew kernel:  [<c0124fbb>] do_exit+0xbb/0x580
Jun  9 04:02:13 jfsnew kernel:  [smp_apic_timer_interrupt+320/352] smp_apic_time
r_interrupt+0x140/0x160
Jun  9 04:02:13 jfsnew kernel:  [<c0118660>] smp_apic_timer_interrupt+0x140/0x16

0
Jun  9 04:02:13 jfsnew kernel:  [__wake_up+109/192] __wake_up+0x6d/0xc0
Jun  9 04:02:13 jfsnew kernel:  [<c011db2d>] __wake_up+0x6d/0xc0
Jun  9 04:02:13 jfsnew kernel:  [apic_timer_interrupt+26/32] apic_timer_interrup
t+0x1a/0x20
Jun  9 04:02:13 jfsnew kernel:  [<c010b90e>] apic_timer_interrupt+0x1a/0x20
Jun  9 04:02:13 jfsnew kernel:  [sys_ptrace+523/1920] sys_ptrace+0x20b/0x780
Jun  9 04:02:13 jfsnew kernel:  [<c011007b>] sys_ptrace+0x20b/0x780
Jun  9 04:02:13 jfsnew kernel:  [die+286/400] die+0x11e/0x190
Jun  9 04:02:13 jfsnew kernel:  [<c010bfae>] die+0x11e/0x190
Jun  9 04:02:13 jfsnew kernel:  [do_page_fault+787/1118] do_page_fault+0x313/0x4
5e
Jun  9 04:02:13 jfsnew kernel:  [<c011b243>] do_page_fault+0x313/0x45e
Jun  9 04:02:13 jfsnew kernel:  [do_softirq+107/208] do_softirq+0x6b/0xd0
Jun  9 04:02:13 jfsnew kernel:  [<c0126c3b>] do_softirq+0x6b/0xd0
Jun  9 04:02:13 jfsnew kernel:  [free_pages_bulk+411/656] free_pages_bulk+0x19b/
0x290
Jun  9 04:02:13 jfsnew kernel:  [<c014033b>] free_pages_bulk+0x19b/0x290
Jun  9 04:02:13 jfsnew kernel:  [kmem_cache_free+496/608] kmem_cache_free+0x1f0/
0x260
Jun  9 04:02:13 jfsnew kernel:  [<c0145990>] kmem_cache_free+0x1f0/0x260
Jun  9 04:02:13 jfsnew kernel:  [free_task_struct+44/128] free_task_struct+0x2c/
0x80
Jun  9 04:02:13 jfsnew kernel:  [<c012018c>] free_task_struct+0x2c/0x80
Jun  9 04:02:13 jfsnew kernel:  [do_page_fault+0/1118] do_page_fault+0x0/0x45e
Jun  9 04:02:13 jfsnew kernel:  [<c011af30>] do_page_fault+0x0/0x45e
Jun  9 04:02:13 jfsnew kernel:  [error_code+45/56] error_code+0x2d/0x38
Jun  9 04:02:13 jfsnew kernel:  [<c010b989>] error_code+0x2d/0x38
Jun  9 04:02:13 jfsnew kernel:  [detach_pid+23/304] detach_pid+0x17/0x130
Jun  9 04:02:13 jfsnew kernel:  [<c0133477>] detach_pid+0x17/0x130
Jun  9 04:02:13 jfsnew kernel:  [__unhash_process+57/176] __unhash_process+0x39/
0xb0
Jun  9 04:02:13 jfsnew kernel:  [<c0123a79>] __unhash_process+0x39/0xb0
Jun  9 04:02:13 jfsnew kernel:  [release_task+195/560] release_task+0xc3/0x230
Jun  9 04:02:13 jfsnew kernel:  [<c0123bb3>] release_task+0xc3/0x230
Jun  9 04:02:13 jfsnew kernel:  [wait_task_zombie+397/432] wait_task_zombie+0x18
d/0x1b0
Jun  9 04:02:13 jfsnew kernel:  [<c01258cd>] wait_task_zombie+0x18d/0x1b0
Jun  9 04:02:13 jfsnew kernel:  [sys_wait4+357/640] sys_wait4+0x165/0x280
Jun  9 04:02:13 jfsnew kernel:  [<c0125c75>] sys_wait4+0x165/0x280
Jun  9 04:02:13 jfsnew kernel:  [default_wake_function+0/32] default_wake_functi
on+0x0/0x20
Jun  9 04:02:13 jfsnew kernel:  [<c011da40>] default_wake_function+0x0/0x20
Jun  9 04:02:13 jfsnew kernel:  [default_wake_function+0/32] default_wake_functi
on+0x0/0x20
Jun  9 04:02:13 jfsnew kernel:  [<c011da40>] default_wake_function+0x0/0x20
Jun  9 04:02:13 jfsnew kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
Jun  9 04:02:13 jfsnew kernel:  [<c010af1f>] syscall_call+0x7/0xb
Jun  9 04:02:13 jfsnew kernel:
Jun  9 04:02:13 jfsnew kernel: bad: scheduling while atomic!
Jun  9 04:02:13 jfsnew kernel: Call Trace:
Jun  9 04:02:13 jfsnew kernel:  [schedule+61/1424] schedule+0x3d/0x590
Jun  9 04:02:13 jfsnew kernel:  [<c011d49d>] schedule+0x3d/0x590
Jun  9 04:02:13 jfsnew kernel:  [balance_dirty_pages_ratelimited+162/176] balanc
e_dirty_pages_ratelimited+0xa2/0xb0
Jun  9 04:02:13 jfsnew kernel:  [<c0141d82>] balance_dirty_pages_ratelimited+0xa
2/0xb0
Jun  9 04:02:13 jfsnew kernel:  [generic_file_aio_write_nolock+2438/2736] generi
c_file_aio_write_nolock+0x986/0xab0
Jun  9 04:02:14 jfsnew kernel:  [<c013ed56>] generic_file_aio_write_nolock+0x986
/0xab0
Jun  9 04:02:14 jfsnew kernel:  [scsi_delete_timer+15/80] scsi_delete_timer+0xf/
0x50
Jun  9 04:02:14 jfsnew kernel:  [<c02c87cf>] scsi_delete_timer+0xf/0x50
Jun  9 04:02:14 jfsnew kernel:  [rcu_process_callbacks+438/480] rcu_process_call
backs+0x1b6/0x1e0
Jun  9 04:02:14 jfsnew kernel:  [<c0133b26>] rcu_process_callbacks+0x1b6/0x1e0
Jun  9 04:02:14 jfsnew kernel:  [vt_console_print+105/688] vt_console_print+0x69
/0x2b0
Jun  9 04:02:14 jfsnew kernel:  [<c0270339>] vt_console_print+0x69/0x2b0
Jun  9 04:02:14 jfsnew kernel:  [vt_console_print+105/688] vt_console_print+0x69
/0x2b0
Jun  9 04:02:14 jfsnew kernel:  [<c0270339>] vt_console_print+0x69/0x2b0
Jun  9 04:02:14 jfsnew kernel:  [__call_console_drivers+70/96] __call_console_dr
ivers+0x46/0x60
Jun  9 04:02:14 jfsnew kernel:  [<c0123016>] __call_console_drivers+0x46/0x60
Jun  9 04:02:14 jfsnew kernel:  [call_console_drivers+235/256] call_console_driv
ers+0xeb/0x100
Jun  9 04:02:14 jfsnew kernel:  [<c012318b>] call_console_drivers+0xeb/0x100
Jun  9 04:02:14 jfsnew kernel:  [generic_file_aio_write+149/256] generic_file_ai
o_write+0x95/0x100
Jun  9 04:02:14 jfsnew kernel:  [<c013efd5>] generic_file_aio_write+0x95/0x100
Jun  9 04:02:14 jfsnew kernel:  [ext3_file_write+42/176] ext3_file_write+0x2a/0x
b0
Jun  9 04:02:14 jfsnew kernel:  [<c0198e2a>] ext3_file_write+0x2a/0xb0
Jun  9 04:02:14 jfsnew kernel:  [do_sync_write+170/224] do_sync_write+0xaa/0xe0
Jun  9 04:02:14 jfsnew kernel:  [<c015c2fa>] do_sync_write+0xaa/0xe0
Jun  9 04:02:14 jfsnew kernel:  [autoremove_wake_function+0/64] autoremove_wake_
function+0x0/0x40
Jun  9 04:02:14 jfsnew kernel:  [<c01206e0>] autoremove_wake_function+0x0/0x40
Jun  9 04:02:14 jfsnew kernel:  [call_console_drivers+235/256] call_console_driv
ers+0xeb/0x100
Jun  9 04:02:14 jfsnew kernel:  [<c012318b>] call_console_drivers+0xeb/0x100
Jun  9 04:02:14 jfsnew kernel:  [release_console_sem+323/400] release_console_se
m+0x143/0x190
Jun  9 04:02:14 jfsnew kernel:  [<c0123633>] release_console_sem+0x143/0x190
Jun  9 04:02:14 jfsnew kernel:  [printk+559/656] printk+0x22f/0x290
Jun  9 04:02:14 jfsnew kernel:  [<c012343f>] printk+0x22f/0x290
Jun  9 04:02:14 jfsnew kernel:  [show_trace+131/144] show_trace+0x83/0x90
Jun  9 04:02:14 jfsnew kernel:  [<c010bbc3>] show_trace+0x83/0x90
Jun  9
...

read more »

 
 
 

2.5.70-mm3 - Oops and hang

Post by Andrew Morto » Wed, 11 Jun 2003 21:50:14



> Here's an oops and hang from /var/log/messages that happened the other
> evening.  It kicked in at 4am or so.  This was running 2.5.70-mm3 SMP,
> PREEMPT, no ACPI, RAID1 on one pair of disks, not root or /boot.

> Here's the messages I got in the messages file.  The system was
> completely hung and needed to be reset to recover:

> Unable to handle kernel paging request at virtual
>  address 6b6b6b6b
>  printing eip:
> c0133477
> *pde = 00000000
> Oops: 0000 [#1]
> PREEMPTSMP DEBUG_PAGEALLOC
> CPU:    1
> EIP:    0060:[detach_pid+23/304]    Not tainted VLI
> EIP:    0060:[<c0133477>]    Not tainted VLI
> EFLAGS: 00010046
> EIP is at detach_pid+0x17/0x130
> eax: dfd30050   ebx: 6b6b6b6b   ecx: dfd30100   edx: 6b6b6b6b
> esi: e3cc6000   edi: 00000000   ebp: 00000000   esp: e3cc7f08
> ds: 007b   es: 007b   ss: 0068
> Process makewhatis (pid: 2446, threadinfo=e3cc6000 task=e4117000)
> Stack: dfd30000 e3cc6000 00000000 c0123a79 dfd30000 c0123bb3 dfd30000 dfd30000
>        dfd305c4 dfd30000 00000a07 bffff5c8 c01258cd dfd30000 ea854a74 bffff350
>        dfd300a4 dfd30000 e4117000 00000000 c0125075 dfd30000 bffff5c8 00000000
> Call Trace:
>  [__unhash_process+57/176] __unhash_process+0x39/0xb0
>  [<c0123a79>] __unhash_process+0x39/0xb0
>  [release_task+195/560] release_task+0xc3/0x230
>  [<c0123bb3>] release_task+0xc3/0x230
>  [wait_task_zombie+397/432] wait_task_zombie+0x18d/0x1b0
>  [<c01258cd>] wait_task_zombie+0x18d/0x1b0
>  [sys_wait4+357/640] sys_wait4+0x165/0x280
>  [<c0125c75>] sys_wait4+0x165/0x280
>  [default_wake_function+0/32] default_wake_function+0x0/0x20
>  [<c011da40>] default_wake_function+0x0/0x20
>  [default_wake_function+0/32] default_wake_function+0x0/0x20
>  [<c011da40>] default_wake_function+0x0/0x20
>  [syscall_call+7/11] syscall_call+0x7/0xb
>  [<c010af1f>] syscall_call+0x7/0xb

> Code: 51 08 52 e8 8c cd fe ff 58 5b c3 89 f6 8d bc 27 00 00 00 00 57 56 53 89 d3 8d 14 9b 8d 04 d0 8d 88 b0 00 00 00 8b 59 08 8b 51 04 <39> 0a 74 08 0f 0b 8c 00 c8 8f 3a c0 8b 80 b0 00 00 00 39 48 04

This appears to be a visitation from the Great Unsolved Bug of the 2.5
series.  Someone playing with a freed task_struct.

Correct me if I'm wrong, but this has only ever been seen with
CONFIG_PREEMPT=y?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

2.5.70-mm3 - Oops and hang

Post by John Stoffe » Wed, 11 Jun 2003 22:30:08


Andrew> This appears to be a visitation from the Great Unsolved Bug of
Andrew> the 2.5 series.  Someone playing with a freed task_struct.

Ouch, not a good thing.

Andrew> Correct me if I'm wrong, but this has only ever been seen with
Andrew> CONFIG_PREEMPT=y?

I can't say, but I'm willing to apply patches to see if I can help
track it down.  I'm about to put -mm7 up on my machine and let that
run for a while.  

Is there anything special you want me to do to stress this out?

John
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

1. 2.5.70 and 2.5.70-mm3 hang on bootup

Both 2.5.70 and 2.5.70-mm3 hang right after "Ok, booting the kernel..." on
one of my test boxes (at the point, nothing works, not even turning on/off
the NumLock LED).

Hardware: ASUS A7S333, Athlon 2600+, 1 GB RAM
Compiler: gcc 3.3

Haven't had the time to debug this yet (and probably won't have the time
for a couple more days, busy with work ATM); any ideas?

LLaP
bero
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

2. ICQ for Linux

3. [2.5.70][ANNOUNCE] kexec for 2.5.70 available

4. Ftp and telnet problems.

5. 2.5.70-mm3 with contest

6. Linux dying

7. software suspend in 2.5.70-mm3.

8. linux virtual private networking?

9. About 2.5.70-mm3

10. 2.5.70-mm3

11. capacity in 2.5.70-mm3

12. 2.5.70-mm3: ioctl32-cleanup-ppc64

13. Some oddities from 2.5.67 to 2.5.70-mm3