BUG() in asm/pci.h:142 with 2.4.13

Post by Christian Hammer » Fri, 26 Oct 2001 19:10:11



Hello

My system has now crashed several times with 2.4.11-pre6 and 2.4.13
(pre6 because it was the first one I got that fixed a memory allocation
bug with 2GB of RAM).

2.4.13 was the easiest one to reproduce: when starting the tape backup
to an HP DDS3/DAT streamer (C1537A) via an Adaptec SCSI controller
(Adaptec 7892A in /proc/pci) on a Gigabyte GA-6VTXD dual motherboard with
two PIIIs and 2GB of RAM, it crashed immediately with the error attached
below. The machine was under "stresstest-simulation" load at the time.

The tape_backup.pl uses the "mt" and "cpio" commands to access /dev/nst0.

It may be worth noting that the system crashed another time yesterday
after I replaced the external SCSI RAID chassis/controller (not the
disks in it), and again just now with a different message (see below).

Any help or hints appreciated!
[please keep me Cc'ed as I'm not subscribed to this list]

bye,

 -christian-

kernel: kernel BUG at /usr/local/src/kernel/linux-2.4.13/include/asm/pci.h:142!
kernel: invalid operand: 0000
kernel: CPU:    1
kernel: EIP:    0010:[ahc_linux_run_device_queue+899/2144]    Not tainted
kernel: EFLAGS: 00010082
kernel: eax: 00000048   ebx: f7bb5650   ecx: c0275a88   edx: 00010071
kernel: esi: c5915a30   edi: 00000000   ebp: c5915a30   esp: e9ae3e14
kernel: ds: 0018   es: 0018   ss: 0018
kernel: Process tape_backup.pl (pid: 4366, stackpage=e9ae3000)
kernel: Stack: c024e100 0000008e f7bbec00 e9ae3e6c 00000000 00000000 f5358de0 0000000e
kernel:        f7bbec10 00000007 00000007 401af000 41ffffff 00000004 c5915600 c01b0e09  
kernel:        f7bbec00 c301fee0 00000202 d35ce1d4 c5915600 f7bbfa20 00000096 c01a5f76    
kernel: Call Trace: [ahc_linux_queue+361/424] [scsi_dispatch_cmd+354/632] [scsi_done+0/200] [scsi_request_fn+752/820] [__scsi_insert_special+110/128]  
kernel:    [scsi_insert_special_req+26/32] [scsi_do_req+284/324] [<f8a8940b>] [<f8a89240>] [<f8a8aad1>] [sys_write+143/196]
kernel:    [system_call+51/56]
kernel:
kernel: Code: 0f 0b eb 18 90 83 7e 04 00 75 14 68 90 00 00 00 68 00 e1 24
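
A quick cross-check of the oops above, as a minimal sketch. It assumes that the second word on the stack is the line number that BUG() passed to printk and that the "Code:" bytes start at the faulting instruction; both are assumptions about how this kernel was built, not something the log states. All values are copied from the log.

```python
# Values copied from the "Stack:" and "Code:" lines of the oops above.
stack_words = "c024e100 0000008e f7bbec00 e9ae3e6c".split()
code_bytes = "0f 0b eb 18 90 83 7e 04 00 75 14 68 90 00 00 00 68 00 e1 24".split()

# Assumption: the second stack word is the line number argument of the
# "kernel BUG at file:line" message.
bug_line = int(stack_words[1], 16)

# 0f 0b is the x86 "ud2" opcode, which raises the invalid-operand trap.
starts_with_ud2 = code_bytes[:2] == ["0f", "0b"]

print(f"second stack word as decimal: {bug_line}")
print(f"code starts with ud2: {starts_with_ud2}")
```

If those assumptions hold, the stack word matches the line number 142 in the BUG message, and the ud2 opcode matches the "invalid operand" trap.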

#
# The output from the other SCSI crash. This came from remote syslogging
# and console.
#

kernel: scsi0:0:0:0: Attempting to queue an ABORT message
kernel: (scsi0:A:0:0): Queuing a recovery SCB
kernel: scsi0:0:0:0: Device is disconnected, re-queuing SCB  
kernel: Recovery code sleeping
kernel: (scsi0:A:0:0): Abort Tag Message Sent
kernel: (scsi0:A:0:0): SCB 153 - Abort Tag Completed.
kernel: Recovery SCB completes
kernel: Recovery code awake  
kernel: aic7xxx_abort returns 8194
kernel: scsi0:0:0:0: Attempting to queue an ABORT message
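
The abort return value above is printed in decimal. The sketch below converts it to hexadecimal and compares it with 0x2002, which is assumed here to be the SUCCESS code of the 2.4 SCSI mid-layer (drivers/scsi/scsi.h); that constant is an assumption and should be checked against the actual tree.

```python
# Return value as printed by the driver in the log above (decimal).
abort_ret = 8194

# Assumption: value of SUCCESS in the 2.4 SCSI mid-layer headers.
ASSUMED_SCSI_SUCCESS = 0x2002

print(hex(abort_ret))                      # the same value in hexadecimal
print(abort_ret == ASSUMED_SCSI_SUCCESS)   # does it match the assumed code?
```

If the assumption is right, the driver reported the abort as successful even though another abort was queued immediately afterwards.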

Some more debugging help:

mtv-server:/usr/local/src/kernel/linux-2.4.13/include/asm# lspci    
00:00.0 Host bridge: VIA Technologies, Inc. VT82C691 [Apollo PRO] (rev c4)
00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598 [Apollo MVP3 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 40)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 1a)
00:07.3 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 1a)
00:07.4 SMBus: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:0a.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:0c.0 SCSI storage controller: Adaptec 7892A (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc Rage XL AGP (rev 27)

mtv-server:~$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: easyRAID Model:  U3              Rev: 0001
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 02 Lun: 00
  Vendor: HP       Model: C1537A           Rev: L708
  Type:   Sequential-Access                ANSI SCSI revision: 02

--
Christian Hammers    WESTEND GmbH - Aachen und Dueren     Tel 0241/701333-0

           WESTEND ist CISCO Systems Partner - Premium Certified

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

BUG() in asm/pci.h:142 with 2.4.13

Post by Christian Hammer » Sat, 27 Oct 2001 05:20:06


Hello

It crashed again while writing to the tape, this time after 1GB had been
written. This time I could capture the extra new-queue debugging output
that I enabled in the kernel configuration (this time on 2.4.12-ac6).

For the linux-scsi guys: if you need more information, please take a look
at my previous posts to linux-kernel (same subject) or contact me directly.

bye,

 -christian-


> 2.4.13 was the easiest one to reproduce: when starting the tape backup
> to an HP DDS3/DAT streamer (C1537A) via an Adaptec SCSI controller
> (Adaptec 7892A in /proc/pci) on a Gigabyte GA-6VTXD dual motherboard with
> two PIIIs and 2GB of RAM, it crashed immediately with the error attached
> below. The machine was under "stresstest-simulation" load at the time.

#
# console dump via minicom and serial line
#
 /USR/SBIN/CRON[10591]: (root) CMD (/usr/local/maint/watchdog)
 kernel: scsi0:0:0:0: Attempting to queue an ABORT message
 kernel: scsi0: Dumping Card State while idle, at SEQADDR 0x8
 kernel: ACCUM = 0x0, SINDEX = 0x20, DINDEX = 0xe4, ARG_2 = 0x0
 kernel: HCNT = 0x0
 kernel: SCSISEQ = 0x12, SBLKCTL = 0xa
 kernel:  DFCNTRL = 0x0, DFSTATUS = 0x89
 kernel: LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
 kernel: SSTAT0 = 0x0, SSTAT1 = 0x8
 kernel: SCSIPHASE = 0x0
 kernel: STACK == 0x3, 0x108, 0x160, 0x0
 kernel: SCB count = 248
 kernel: Kernel NEXTQSCB = 62
 kernel: Card NEXTQSCB = 62
 kernel: QINFIFO entries:
 kernel: Waiting Queue entries:
 kernel: Disconnected Queue entries: 6:150 12:210 2:178 29:28 23:181 13:61 24:7
 kernel: QOUTFIFO entries:
 kernel: Sequencer Free SCB List: 22 26 9 4 11 19 10 16 28 1 8 5 27 7 20 31 0 30 25 17 21 3 14 15 18
 kernel: Pending list: 150, 210, 178, 28, 181, 61, 7
 kernel: Kernel Free SCB list: 32 77 79 149 198 171 223 152 140 105 189 151 78 66
         199 88 6 224 138 177 67 84 194 191 23 246 215 24 160 185 225 230 93
         174 49 241 110 2 20 147 170 240 33 59 243 99 54 175 19 176 18 192 76
         100 190 238 108 8 159 208 207 60 242 217 56 221 1 17 213 92 127 70
         162 74 197 142 239 196 82 124 29 235 134 232 123 179 218 139 211 117
         3 119 57 219 125 122 209 101 44 155 45 39 212 128 233 202 158 91 187
         46 0 180 182 201 109 118 228 131 12 4 112 229 200 236 173 132 247 97
         186 148 55 216 133 144 113 231 30 63 37 137 206 156 83 146 135 141
         161 64 165 98 35 234 166 81 9 10 214 43 58 111 71 115 106 85 183 72
         11 204 172 157 130 47 154 188 226 90 220 96 107 27 227 145 40 87 22
         94 129 48 205 65 120 73 163 69 26 41 86 103 68 169 53 5 237 167 42
         51 195 15 38 80 13 168 21 89 52 16 114 50 193 36 136 75 25 34 95 14
         153 203 126 222 116 31 143 104 164 121 102 184 245 244
 kernel: DevQ(0:0:0): 0 waiting
 kernel: DevQ(0:2:0): 0 waiting
 kernel: (scsi0:A:0:0): Queuing a recovery SCB
 kernel: scsi0:0:0:0: Device is disconnected, re-queuing SCB
 kernel: Recovery code sleeping
 kernel: (scsi0:A:0:0): Abort Tag Message Sent
 kernel: (scsi0:A:0:0): SCB 7 - Abort Tag Completed.
 kernel: Recovery SCB completes
 kernel: Recovery code awake
 kernel: aic7xxx_abort returns 0x2002
 sendmail[349]: rejecting connections on daemon MTA: load average: 83
<80>xx<80>x<80>x<e<80>xx<xt<80><80><80>?x<?x<80>x<80>x<e?x?x?x<80><80><80><80><80>x??<80><80><80>
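
The card-state dump above can be sanity-checked for lost SCBs. The sketch below assumes that every SCB is either on the kernel free list, on the pending list, or held as NEXTQSCB; that accounting model is an assumption, not something the driver output states. The free list is copied from the dump with numbers that the serial-console line wrap had split rejoined (for example "19" followed by "7" read as 197).

```python
# Values copied from the card-state dump above.
scb_count = 248                              # "SCB count"
next_q_scb = 62                              # "Kernel NEXTQSCB"
pending = [150, 210, 178, 28, 181, 61, 7]    # "Pending list"

# "Kernel Free SCB list", with wrap-split numbers rejoined.
kernel_free = [int(tok) for tok in """
32 77 79 149 198 171 223 152 140 105 189 151 78 66
199 88 6 224 138 177 67 84 194 191 23 246 215 24 160 185 225 230 93
174 49 241 110 2 20 147 170 240 33 59 243 99 54 175 19 176 18 192 76
100 190 238 108 8 159 208 207 60 242 217 56 221 1 17 213 92 127 70
162 74 197 142 239 196 82 124 29 235 134 232 123 179 218 139 211 117
3 119 57 219 125 122 209 101 44 155 45 39 212 128 233 202 158 91 187
46 0 180 182 201 109 118 228 131 12 4 112 229 200 236 173 132 247 97
186 148 55 216 133 144 113 231 30 63 37 137 206 156 83 146 135 141
161 64 165 98 35 234 166 81 9 10 214 43 58 111 71 115 106 85 183 72
11 204 172 157 130 47 154 188 226 90 220 96 107 27 227 145 40 87 22
94 129 48 205 65 120 73 163 69 26 41 86 103 68 169 53 5 237 167 42
51 195 15 38 80 13 168 21 89 52 16 114 50 193 36 136 75 25 34 95 14
153 203 126 222 116 31 143 104 164 121 102 184 245 244
""".split()]

# Assumed model: free + pending + NEXTQSCB should cover tags 0..scb_count-1.
accounted = set(kernel_free) | set(pending) | {next_q_scb}

print(len(kernel_free))
print(len(kernel_free) + len(pending) + 1 == scb_count)
print(accounted == set(range(scb_count)))
```

Under that model every SCB tag from 0 to 247 appears exactly once, so the dump gives no sign of a leaked or doubly queued SCB at the time of the abort.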

--
Christian Hammers    WESTEND GmbH - Aachen und Dueren     Tel 0241/701333-0

           WESTEND ist CISCO Systems Partner - Premium Certified
