related oops in 2.4.17 and 2.4.19

related oops in 2.4.17 and 2.4.19

Post by Wilton Won » Mon, 23 Sep 2002 01:50:06



Are you running a symbois/LSI based SCSI controller ? just wondering, becaise
we had a whole bunch of "running short of DMA buffer" errors untill we switched
to the newer sym83cxx driver.. we woulget a whole bunch of these errors after a
random amount of disk activity and then boom we would oops..

- Wilton


> I am running both a 2.4.17 and 2.4.19 vanilla patched with bproc. I have 2
> oops traces from klogd, that appear to be related. I am not sure if this
> is a bproc or kernel issue. The kernels have been running on Dell 1550
> PIII Dual machines with 2GB ram and 2GB swap and IBM x330s with the same
> setups. I have seen about 20 machines toss this same oops in the last few
> days, all when under heavy load and memory pressure. After oopsing, the
> machine will remain resonsive and can run processes, baring ps, top,
> shutdown and reboot -- I am sure there are more that won't run :) I would
> greatly appreciate any help -- I have almost 200 machines running these
> kernels. The oopsing has not just started recently -- I have seen it all
> along, but have never been able to get the decoded info from klogd, and it
> is just now becoming a problem with so many machines oopsing.

> Attached is one file with 2 oops reports from klogd.
> Nic

> --
> Nicholas Henke
> Linux cluster system programmer
> University of Pennsylvania


Content-Description: oops traces

- Show quoted text -

Quote:> -------- 2.4.17 --------------

> Sep 19 18:49:10 node25 kernel: kernel BUG at page_alloc.c:84!
> Sep 19 18:49:10 node25 kernel: invalid operand: 0000
> Sep 19 18:49:10 node25 kernel: CPU:    1
> Sep 19 18:49:10 node25 kernel: EIP:    0010:[__free_pages_ok+169/832]    Not tainted
> Sep 19 18:49:10 node25 kernel: EIP:    0010:[<c013c0d9>]    Not tainted
> Sep 19 18:49:10 node25 kernel: EFLAGS: 00010286
> Sep 19 18:49:10 node25 kernel: eax: 0000001f   ebx: c1b3ffa0   ecx: c0288424   edx: 000036aa
> Sep 19 18:49:10 node25 kernel: esi: c1b3ffa0   edi: 00000000   ebp: 00000000   esp: c5d8dee0
> Sep 19 18:49:10 node25 kernel: ds: 0018   es: 0018   ss: 0018
> Sep 19 18:49:10 node25 kernel: Process ps (pid: 1326, stackpage=c5d8d000)
> Sep 19 18:49:10 node25 kernel: Stack: c025c74d 00000054 ea5a4a41 e947f35c e947f000 ea5a4000 bfff0018 0000090f
> Sep 19 18:49:10 node25 kernel:        c1b3ffa0 e947f90f e947f90f c0125607 c5d8c000 e947f000 f21a270c c1b3ffa0
> Sep 19 18:49:10 node25 kernel:        e0e9b980 f21a270c e9092000 e947f000 e947f000 c01691fa e9092000 bffff6e5
> Sep 19 18:49:10 node25 kernel: Call Trace: [access_process_vm+439/560] [proc_pid_environ+186/208] [proc_info_read+99/288] [filp_open+77/96] [sys_read+150/208]
> Sep 19 18:49:10 node25 kernel: Call Trace: [<c0125607>] [<c01691fa>] [<c0169653>] [<c01442ed>] [<c0144e76>]
> Sep 19 18:49:10 node25 kernel:    [sys_open+203/304] [system_call+51/56]
> Sep 19 18:49:10 node25 kernel:    [<c01446db>] [<c01078ab>]
> Sep 19 18:49:10 node25 kernel:
> Sep 19 18:49:10 node25 kernel: Code: 0f 0b 5e 5f 8b 43 18 a9 80 00 00 00 74 10 6a 56 68 4d c7 25
> Sep 20 04:02:14 node25 syslogd 1.4.1: restart.
> Sep 20 14:00:00 node25 sshd(pam_unix)[1655]: session opened for user bindu by (uid=0)
> Sep 20 14:27:10 node25 sshd(pam_unix)[1655]: session closed for user bindu
> Sep 20 15:22:51 node25 sshd(pam_unix)[1683]: session opened for user root by (uid=0)
> Sep 20 15:23:37 node25 sshd(pam_unix)[1731]: session opened for user henken by (uid=0)
> Sep 20 15:23:37 node25 sshd(pam_unix)[1731]: session closed for user henken
> Sep 20 15:23:44 node25 sshd(pam_unix)[1739]: session opened for user henken by (uid=0)
> Sep 20 15:23:44 node25 sshd(pam_unix)[1739]: session closed for user henken

> ----------- 2.4.19 ------

> Sep 21 10:15:34 node25.io.liniac.upenn.edu kernel: Warning - running *really* short on DMA buffers
> Sep 21 10:39:10 node25.io.liniac.upenn.edu last message repeated 156 times
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000010
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:  printing eip:
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: f89cee23
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: *pde = 00000000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Oops: 0000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: CPU:    1
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: EIP:    0010:[<f89cee23>]    Not tainted
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: EFLAGS: 00010202
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: eax: 00000000   ebx: 00000000   ecx: 00000002   edx: f70c0000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: esi: f70c0000   edi: 00000000   ebp: ffffffff   esp: daabbe8c
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: ds: 0018   es: 0018   ss: 0018
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Process ps (pid: 2266, stackpage=daabb000)
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Stack: c015f525 f70c0000 000002cc 000002cc 00000000 ffffffff 00000004 0000002c
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:        00000000 00000088 00000000 00000002 00000000 00000000 00000000 00000013
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:        00000000 00000000 00000000 004a190c 00000000 00000000 ffffffff 00000000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Call Trace:    [proc_pid_stat+629/752] [do_exit+719/736] [proc_info_read+99/288] [sys_read+150/272] [sys_open+87/160]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Call Trace:    [<c015f525>] [<c011f86f>] [<c015d0a3>] [<c013de76>] [<c013d827>]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:   [system_call+51/56]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:   [<c0108cfb>]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Code: 8b 40 10 c3 90 8b 82 94 00 00 00 8b 40 7c c3 89 f6 f6 05 60

 ----[   Wilton William Wong    ]---------------------------------------------
   11060-166 Avenue         Ph : 01-780-456-9771       High Performance UNIX
   Edmonton, Alberta        FAX: 01-780-456-9772        and Linux Solutions
   T5X 1Y3, Canada                   URL: http://www.harddata.com
 -------------------------------------------------------[ Hard Data Ltd. ]----

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

related oops in 2.4.17 and 2.4.19

Post by Nicholas Henk » Mon, 23 Sep 2002 02:00:05


nope -- all adaptec:
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs


> Are you running a symbois/LSI based SCSI controller ? just wondering, becaise
> we had a whole bunch of "running short of DMA buffer" errors untill we switched
> to the newer sym83cxx driver.. we woulget a whole bunch of these errors after a
> random amount of disk activity and then boom we would oops..

> - Wilton

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

1. Geode crashed 2.4.17, runs under 2.4.19

I have bothered you few months ago that the following Geode modules:

Centarus platform, GX1 300 MHz, 5530A REv B1, BIOS: XpressROM V3.0.1 build 03/08/2001 by NatSemi

were crashing about once a week running 2.4.17 with low latency, HZ=2000,
ext3fs.

Since then I have upgraded to 2.4.19-rc3 with similar configuration and the
boards are running OK since a few months.

Best regards,
--
Tomasz Motylewski
BFAD GmbH & Co. KG

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

2. USR I-Modem Setup

3. NULL dereference for nfs-root using 2.4.19-pre8, 2.4.17 is fine

4. chatscripts under Redhat

5. Oops on 2.4.17 - unable to handle kernel paging request - possibly NFS related?

6. New hard drive

7. Oops on 2.4.17 possibly UMS-DOS related

8. Notebook with Xircom Pcmcia Ethernet

9. Reproducible oops in 2.4.17, possibly related to ipchains

10. 2.4.19: Oops, probably related to SCSI?

11. some 2.4.17 vs. 2.4.17-rmap8 vs. lowmem analysis

12. 2.4.17, looks ext3 related

13. 2.4.17 Kernel Oops