assertion failure : ext3 & lvm , 2.4.17 smp & 2.4.18-ac1 smp


Post by Dimitris Zilaskos » Thu, 28 Feb 2002 08:10:17



[1.] One line summary of the problem:
Assertion failure messages on an LVM partition with an ext3 filesystem

[2.] Full description of the problem/report:

Each time there is sustained write activity on an LVM device made up of
loopback-mounted files, a large assertion failure is logged and the device
becomes unusable. A hard reboot with the power or reset button is required,
since the system is unable to reboot on its own, even in single-user mode.
kjournald and the processes using the LVM partition are stuck in
uninterruptible sleep and cannot be killed.
The issue started the moment I converted the ext2 filesystem on the LVM
device to ext3. I first tried 2.4.17-smp, then continued with 2.4.18-ac1.
Both were patched with smbfs-2.4.16-lfs.patch, modified slightly in the
2.4.18 case to apply cleanly.
The LVM device consists of 10 loopback-mounted files of 23 GB, which
reside on Samba-mounted shares.

Everything works fine if I mount the filesystem as ext2.
[3.] Keywords (i.e., modules, networking, kernel):

lvm 1.0.3, ext3, loopback, 2.4.17 smp, 2.4.18-ac1 smp
[4.] Kernel version (from /proc/version):
Linux version 2.4.18-ac1 (root@tassadar) (gcc version 2.95.3 20010315 (release)) #2 SMP Tue Feb 26 23:13:44 EET 2002

[5.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/oops-tracing.txt)
The message does not contain the word "oops", but it looks like one, so I
decoded one anyway.

In the kernel logs they look like this:

Assertion failure in do_get_write_access() at transaction.c:730: "(((jh2bh(jh))->b_state & (1UL << BH_Uptodate)) != 0)"
invalid operand: 0000
CPU:    1
EIP:    0010:[do_get_write_access+1206/1344] Not tainted
EFLAGS: 00010282
eax: 0000007b   ebx: d7f7da94   ecx: 00000056 edx: 00000001
esi: c59df9a0   edi: d7f7da00   ebp: c9a73820 esp: ce4edcd4
ds: 0018   es: 0018   ss: 0018
Process wget (pid: 466, stackpage=ce4ed000)
Stack: c02887c0 c0288bce c02887a0 000002da c0288d80 d7f7da00 c59df9a0 c9a73820
d7f7da94 00000001 00000001 00000000 00000000 c72f04c0 c016277d c59df9a0
c9a73820 00000000 c59df9a0 0000154b c1c22c00 ce4edd8c c0157ae8 c59df9a0
Call Trace: [journal_get_write_access+57/92] [ext3_new_block+1056/1908] [ext3_alloc_block+30/36]
[ext3_alloc_branch+63/648] [ext3_get_block_handle+458/696]
[ext3_get_block+90/96] [__block_prepare_write+219/732] [block_prepare_write+34/60] [ext3_get_block+0/96]
[ext3_prepare_write+213/452] [ext3_get_block+0/96]
[generic_file_write+1160/1880] [ext3_file_write+70/76] [sys_write+143/256] [system_call+51/56]
Code: 0f 0b 83 c4 14 90 8b 4d 00 8b 41 38 0f b6 50 25 8b 7d 0c 8b
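
The trace above prints each frame as [symbol+offset/size] with what appear to be decimal numbers, while the ksymoops output later in this thread prints the same fields in hexadecimal (do_get_write_access+4b6/540). A minimal Python sketch for converting one form to the other so the two kinds of trace can be compared; the function name klogd_to_ksymoops is purely illustrative and not part of klogd or ksymoops:

```python
import re

# Matches one frame such as "[do_get_write_access+1206/1344]".
FRAME_RE = re.compile(r"\[(\w+)\+(\d+)/(\d+)\]")

def klogd_to_ksymoops(line):
    """Rewrite decimal symbol+offset/size frames in ksymoops-style hex."""
    frames = []
    for symbol, offset, size in FRAME_RE.findall(line):
        frames.append("<%s+%x/%x>" % (symbol, int(offset), int(size)))
    return frames

eip_line = "EIP:    0010:[do_get_write_access+1206/1344] Not tainted"
print(klogd_to_ksymoops(eip_line))  # ['<do_get_write_access+4b6/540>']
```

The converted EIP frame matches the one ksymoops reports for the later oops, which suggests both failures hit the same assertion site.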

Here is another one, decoded:

ksymoops 2.4.3 on i686 2.4.18-ac1.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.18-ac1/ (default)
     -m /boot/System.map (specified)

No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Warning (compare_maps): ksyms_base symbol ___strtok not found in System.map.  Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base says c020c120, System.map says c0156740.  Ignoring ksyms_base entry
invalid operand: 0000
CPU:    1
EIP:    0010:[<c015d745>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 0000007b   ebx: d8298e80   ecx: 00000056   edx: 00000001
esi: da44ae94   edi: d8298e80   ebp: da44ae00   esp: c726dce8
ds: 0018   es: 0018   ss: 0018
Process wget (pid: 13623, stackpage=c726d000)
Stack: c0278200 c027860e c02781e0 000002d8 c02787c0 da44ae00 c8c0d2e0 d8298e80
       da44ae94 00000001 00000001 00000000 00000000 c9f32be0 c015d7e9 c8c0d2e0
Call Trace: [<c015d7e9>] [<c0152e00>] [<c0154aa6>] [<c0154d9b>] [<c015541e>]
   [<c0134ef0>] [<c0155566>] [<c01353e2>] [<c01359ee>] [<c015550c>] [<c0155a0d>]
   [<c015550c>] [<c01289bb>] [<c01536ea>] [<c0132e46>] [<c0106d9b>]
Code: 0f 0b 83 c4 14 8d b6 00 00 00 00 8b 03 8b 7b 0c 8b 50 38 8b

>>EIP; c015d744 <ext3_add_entry+1a4/400>   <=====

Trace; c015d7e8 <ext3_add_entry+248/400>
Trace; c0152e00 <proc_pid_lookup+13c/1ec>
Trace; c0154aa6 <kmsg_poll+2a/44>
Trace; c0154d9a <proc_tty_unregister_driver+26/2c>
Trace; c015541e <devices_read_proc+6/34>
Trace; c0134ef0 <get_unused_fd+a0/188>
Trace; c0155566 <cmdline_read_proc+e/54>
Trace; c01353e2 <sys_lseek+12/cc>
Trace; c01359ee <do_readv_writev+236/29c>
Trace; c015550c <dma_read_proc+24/34>
Trace; c0155a0c <elf_kcore_store_hdr+34/2f0>
Trace; c015550c <dma_read_proc+24/34>
Trace; c01289ba <sys_msync+76/104>
Trace; c01536ea <proc_register+7a/88>
Trace; c0132e46 <shmem_writepage+3e/114>
Trace; c0106d9a <lcall7+a/4c>
Code;  c015d744 <ext3_add_entry+1a4/400>
0000000000000000 <_EIP>:
Code;  c015d744 <ext3_add_entry+1a4/400>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c015d746 <ext3_add_entry+1a6/400>
   2:   83 c4 14                  add    $0x14,%esp
Code;  c015d748 <ext3_add_entry+1a8/400>
   5:   8d b6 00 00 00 00         lea    0x0(%esi),%esi
Code;  c015d74e <ext3_add_entry+1ae/400>
   b:   8b 03                     mov    (%ebx),%eax
Code;  c015d750 <ext3_add_entry+1b0/400>
   d:   8b 7b 0c                  mov    0xc(%ebx),%edi
Code;  c015d754 <ext3_add_entry+1b4/400>
  10:   8b 50 38                  mov    0x38(%eax),%edx
Code;  c015d756 <ext3_add_entry+1b6/400>
  13:   8b 00                     mov    (%eax),%eax

3 warnings issued.  Results may not be reliable.
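
ksymoops cross-checks the running kernel's symbols against System.map and warns where they disagree, as in the compare_maps warnings above. The Python sketch below imitates that kind of check. It is not ksymoops's implementation, and the line formats it expects ("address type name" for System.map, "address name [module]" for /proc/ksyms) are assumptions. The partition_name addresses are taken from the warning; the printk line is invented sample data.

```python
def parse_system_map(text):
    """Parse 'address type name' lines into {name: address}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            table[fields[2]] = int(fields[0], 16)
    return table

def parse_ksyms(text):
    """Parse 'address name [module]' lines into {name: address}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            table[fields[1]] = int(fields[0], 16)
    return table

def mismatches(ksyms, system_map):
    """Return symbols present in both tables whose addresses differ."""
    return sorted(
        (name, ksyms[name], system_map[name])
        for name in ksyms
        if name in system_map and ksyms[name] != system_map[name]
    )

# partition_name: addresses from the warning above; printk: invented.
KSYMS = "c020c120 partition_name\nc0115e40 printk\n"
SYSTEM_MAP = "c0156740 T partition_name\nc0115e40 T printk\n"

for name, running, on_disk in mismatches(parse_ksyms(KSYMS),
                                         parse_system_map(SYSTEM_MAP)):
    print("%s: ksyms says %08x, System.map says %08x" % (name, running, on_disk))
```

The same partition_name warning also shows up in the later decode in this thread, which resolves to plausible ext3 functions, so one mismatch does not by itself prove that the wrong System.map was used.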

[6.] A small shell script or example program which triggers the
     problem (if possible)
/usr/bin/wget -r  -l inf -nr -nc -b -np -nH ftp://ftp.kernel.org/pub/
Running this on the LVM device triggers the failure for me within about 30 seconds.
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
Linux test 2.4.18-ac1 #2 SMP Tue Feb 26 23:13:44 EET 2002 i686 unknown

Gnu C                  2.95.3
Gnu make               3.79.1
binutils               2.11.90.0.19
util-linux             2.11f
mount                  2.11b
modutils               2.4.6
e2fsprogs              1.25
PPP                    2.4.1
Linux C Library        2.2.3
Dynamic linker (ldd)   2.2.3
Procps                 2.0.7
Net-tools              1.60
Kbd                    1.06
Sh-utils               2.0
Modules Loaded
[7.2.] Processor information (from /proc/cpuinfo):
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 3
model name      : Pentium II (Klamath)
stepping        : 4
cpu MHz         : 267.277
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov mmx
bogomips        : 532.48

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 3
model name      : Pentium II (Klamath)
stepping        : 4
cpu MHz         : 267.277
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov mmx
bogomips        : 534.11

[7.3.] Module information (from /proc/modules):

[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(auto)
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0cf8-0cff : PCI conf1
b800-b83f : Intel Corp. 82557 [Ethernet Pro 100]
  b800-b83f : eepro100
d000-d0ff : Adaptec AIC-7880U
  d000-d0ff : aic7xxx
d400-d41f : Intel Corp. 82371AB PIIX4 USB
d800-d80f : Intel Corp. 82371AB PIIX4 IDE
  d800-d807 : ide0
  d808-d80f : ide1
e400-e43f : Intel Corp. 82371AB PIIX4 ACPI
e800-e81f : Intel Corp. 82371AB PIIX4 ACPI

00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000c8000-000c8fff : Extension ROM
000cc000-000d0fff : Extension ROM
000f0000-000fffff : System ROM
00100000-1bffcfff : System RAM
  00100000-00279def : Kernel code
  00279df0-002f08ff : Kernel data
1bffd000-1bffefff : ACPI Tables
1bfff000-1bffffff : ACPI Non-volatile Storage
da800000-da8fffff : Intel Corp. 82557 [Ethernet Pro 100]
db000000-db000fff : Intel Corp. 82557 [Ethernet Pro 100]
  db000000-db000fff : eepro100
db800000-db800fff : Adaptec AIC-7880U
  db800000-db800fff : aic7xxx
dc000000-e3cfffff : PCI Bus #01
  dc000000-dfffffff : S3 Inc. 86c368 [Trio 3D/2X]
e3f00000-e3ffffff : PCI Bus #01
e4000000-e7ffffff : Intel Corp. 440LX/EX - 82443LX/EX Host bridge
fec00000-fec00fff : reserved
fee00000-fee00fff : reserved
ffff0000-ffffffff : reserved

[7.5.] PCI information ('lspci -vvv' as root)
00:00.0 Host bridge: Intel Corporation 440LX/EX - 82443LX/EX Host bridge (rev 03)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR+ <PERR-
        Latency: 64
        Region 0: Memory at e4000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 1.0
                Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
                Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>

00:01.0 PCI bridge: Intel Corporation 440LX/EX - 82443LX/EX AGP bridge (rev 03) (prog-if 00 [Normal decode])
        Control:
...


Post by Andrew Morton » Thu, 28 Feb 2002 08:40:09



> Assertion failure in do_get_write_access() at transaction.c:730: "(((jh2bh(jh))->b_state & (1UL << BH_Uptodate)) != 0)"

This was fixed in the ext3 patch which went into 2.4.18-pre5

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Post by Dimitris Zilaskos » Thu, 28 Feb 2002 08:50:12




> > Assertion failure in do_get_write_access() at transaction.c:730: "(((jh2bh(jh))->b_state & (1UL << BH_Uptodate)) != 0)"

> This was fixed in the ext3 patch which went into 2.4.18-pre5

Well, I just got another one:

Assertion failure in do_get_write_access() at transaction.c:730: "(((jh2bh(jh))->b_state & (1UL << BH_Uptodate)) != 0)"
invalid operand: 0000
CPU:    1
EIP:    0010:[<c016297a>]    Not tainted
EFLAGS: 00010286
eax: 0000007b   ebx: d44ffa94   ecx: 00000097   edx: 00000001
esi: cf514de0   edi: d44ffa00   ebp: c49f8f10   esp: d1bebcd4
ds: 0018   es: 0018   ss: 0018
Process wget (pid: 11038, stackpage=d1beb000)
Stack: c0288aa0 c0288eae c0288a80 000002da c0289060 d44ffa00 cf514de0 c49f8f10
       d44ffa94 00000001 00000001 00000000 00000000 dbd3b660 c0162a3d cf514de0
       c49f8f10 00000000 00000000 00001076 d7272400 d1bebd8c c0157e00 cf514de0
Call Trace: [<c0162a3d>] [<c0157e00>] [<c023de59>] [<c0159b46>] [<c0159e1f>]
   [<c015a47e>] [<c021e98e>] [<c015a5c6>] [<c01380eb>] [<c013891e>] [<c015a56c>]
   [<c015aa6d>] [<c015a56c>] [<c01299b0>] [<c015874a>] [<c0135747>] [<c0106e7b>]

Code: 0f 0b 83 c4 14 90 8b 4d 00 8b 41 38 0f b6 50 25 8b 7d 0c 8b
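
The Code: line begins with the bytes 0f 0b, which ksymoops disassembles elsewhere in this thread as ud2a. That fits the "invalid operand" report coming from a deliberate trap after the failed assertion rather than from a stray pointer. A small Python sketch of that check, assuming (as in the traces here) that the Code: line starts at the faulting EIP; the helper names are illustrative only:

```python
UD2 = bytes([0x0F, 0x0B])  # i386 encoding of the ud2 instruction

def code_bytes(line):
    """Return the raw bytes from an oops 'Code:' line."""
    hex_part = line.split(":", 1)[1]
    # Some kernels mark the faulting byte with <..>; strip that if present.
    cleaned = hex_part.replace("<", "").replace(">", "")
    return bytes.fromhex("".join(cleaned.split()))

def is_deliberate_trap(line):
    """True if the first instruction byte pair is ud2."""
    return code_bytes(line).startswith(UD2)

code = "Code: 0f 0b 83 c4 14 90 8b 4d 00 8b 41 38 0f b6 50 25 8b 7d 0c 8b"
print(is_deliberate_trap(code))  # True
```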

uname -an :
Linux test 2.4.18-ac1 #2 SMP Tue Feb 26 23:13:44 EET 2002 i686 unknown

Kind regards,

--
=============================================================================

Dimitris Zilaskos


=============================================================================


Post by Andrew Morton » Thu, 28 Feb 2002 09:20:07





> > > Assertion failure in do_get_write_access() at transaction.c:730: "(((jh2bh(jh))->b_state & (1UL << BH_Uptodate)) != 0)"

> > This was fixed in the ext3 patch which went into 2.4.18-pre5

> well i just got another one

> Assertion failure in do_get_write_access() at transaction.c:730:
> "(((jh2bh(jh))->b_state & (1UL << BH_Uptodate)) != 0)"
> ...

> uname -an :
> Linux test 2.4.18-ac1 #2 SMP Tue Feb 26 23:13:44 EET 2002 i686 unknown

blargh.  Possibly LVM tossed back an I/O error and ext3 fed
the result into journal_get_write_access(), which would be
an ext3 bug.

Please prepare a ksymoops trace.
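
The failed assertion tests a single bit of the buffer head's state word. The Python model below shows what that expression evaluates. It assumes BH_Uptodate is bit 0, which should be checked against include/linux/fs.h for the kernel in question. The names mirror the kernel's, but this is only a model, not kernel code.

```python
BH_UPTODATE = 0  # assumed bit number: buffer contains valid data

def buffer_uptodate(b_state):
    """Python equivalent of (b_state & (1UL << BH_Uptodate)) != 0."""
    return (b_state & (1 << BH_UPTODATE)) != 0

# A buffer whose read completed normally has the bit set.
after_good_read = 1 << BH_UPTODATE
# If the read failed, the bit stays clear; this models the state that
# may have been passed on to journal_get_write_access().
after_failed_read = 0

print(buffer_uptodate(after_good_read))    # True
print(buffer_uptodate(after_failed_read))  # False
```

Under that reading, a read error from the lower layers would leave the bit clear, and the journalling code would trip the assertion when asked for write access to the buffer.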

Post by Dimitris Zilaskos » Thu, 28 Feb 2002 09:20:09


There you go, a fresh one :)

ksymoops 2.4.3 on i686 2.4.18-ac1.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.18-ac1/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base says c02202c0, System.map says c01573b0.  Ignoring ksyms_base entry
invalid operand: 0000
CPU:    0
EIP:    0010:[<c01635ea>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 0000007b   ebx: d5a8a294   ecx: 00000002   edx: 00000001
esi: d3a743c0   edi: d5a8a200   ebp: d7701d30   esp: d6713cd4
ds: 0018   es: 0018   ss: 0018
Process wget (pid: 456, stackpage=d6713000)
Stack: c02a7080 c02a748e c02a7060 000002da c02a7640 d5a8a200 d3a743c0 d7701d30
       d5a8a294 00000001 00000001 00000000 00000000 db800740 c01636ad d3a743c0
       d7701d30 00000000 00000000 000004e8 d7990c00 d6713d8c c0158a70 d3a743c0
Call Trace: [<c01636ad>] [<c0158a70>] [<c0252050>] [<c015a7b6>] [<c015aa8f>]
   [<c011a94f>] [<c0239349>] [<c015b0ee>] [<c015b236>] [<c013902b>] [<c013985e>]
   [<c015b1dc>] [<c015b6dd>] [<c015b1dc>] [<c01299b0>] [<c01593ba>] [<c0136717>]
   [<c0106e7b>]
Code: 0f 0b 83 c4 14 90 8b 4d 00 8b 41 38 0f b6 50 25 8b 7d 0c 8b

>>EIP; c01635ea <do_get_write_access+4b6/540>   <=====

Trace; c01636ac <journal_get_write_access+38/5c>
Trace; c0158a70 <ext3_new_block+478/774>
Trace; c0252050 <ip_rcv_finish+0/21e>
Trace; c015a7b6 <ext3_alloc_block+1e/24>
Trace; c015aa8e <ext3_alloc_branch+3e/288>
Trace; c011a94e <do_softirq+6e/cc>
Trace; c0239348 <_text_lock_netfilter+c0/e8>
Trace; c015b0ee <ext3_get_block_handle+1ca/2b8>
Trace; c015b236 <ext3_get_block+5a/60>
Trace; c013902a <__block_prepare_write+da/2dc>
Trace; c013985e <block_prepare_write+22/3c>
Trace; c015b1dc <ext3_get_block+0/60>
Trace; c015b6dc <ext3_prepare_write+d4/1c4>
Trace; c015b1dc <ext3_get_block+0/60>
Trace; c01299b0 <generic_file_write+488/758>
Trace; c01593ba <ext3_file_write+46/4c>
Trace; c0136716 <sys_write+8e/100>
Trace; c0106e7a <system_call+32/38>
Code;  c01635ea <do_get_write_access+4b6/540>
0000000000000000 <_EIP>:
Code;  c01635ea <do_get_write_access+4b6/540>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01635ec <do_get_write_access+4b8/540>
   2:   83 c4 14                  add    $0x14,%esp
Code;  c01635ee <do_get_write_access+4ba/540>
   5:   90                        nop
Code;  c01635f0 <do_get_write_access+4bc/540>
   6:   8b 4d 00                  mov    0x0(%ebp),%ecx
Code;  c01635f2 <do_get_write_access+4be/540>
   9:   8b 41 38                  mov    0x38(%ecx),%eax
Code;  c01635f6 <do_get_write_access+4c2/540>
   c:   0f b6 50 25               movzbl 0x25(%eax),%edx
Code;  c01635fa <do_get_write_access+4c6/540>
  10:   8b 7d 0c                  mov    0xc(%ebp),%edi
Code;  c01635fc <do_get_write_access+4c8/540>
  13:   8b 00                     mov    (%eax),%eax

2 warnings issued.  Results may not be reliable.

Kind regards,
--
=============================================================================

Dimitris Zilaskos


=============================================================================


1. ext3 assertion failure and oops, 2.4.18

I can reliably reproduce an assertion failure and oops in ext3 by simply
restarting cyrus21, if the directories used by cyrus have the +j flag set with
chattr. The filesystem was mounted with the default journalling mode,
data=ordered; the kernels tested were 2.4.18 and 2.4.19-pre3-ac4. Recent -pre
or -ac kernels wouldn't compile with my .config.

Assertion failure in journal_revoke() at revoke.c:330:
"!(__builtin_constant_p(BH_Revoked) ? constant_test_bit((BH_Revoked),(
&bh->b_state)) : variable_test_bit((BH_Revoked),( &bh->b_state)))"

I'll capture whole oops if requested.

I found two similar cases in the lkml archives, but they went unanswered
(at least lkml wasn't cc'ed). I couldn't tell from the changelogs whether
this problem has already been fixed.

http://groups.google.com/groups?selm=2446DD7E.7F1AEC90.00A5E169%40net...
http://groups.google.com/groups?as_umsgid=%3C2446DD7E.7F1AEC90.00A5E1...

--
Antti Salmela