weird md+scsi prob (delayed or missing partition check, md autorun finds no RAID)

Post by Raimund Sachere » Sat, 09 Feb 2002 05:14:56



Hi folks,

We have run into a really strange problem here. We have several
software-RAID setups around, all of them working fine. Some of them are
IDE-based, some consist of SCSI drives.

We recently set up a RAID5 configuration with four SCSI devices. The
kernel boots fine from a /boot partition (RAID1, though that does not
really matter) using standard lilo, without initrd. It then scans
the SCSI bus and lists all hard drives. Next the md part begins, and we
expect the autodetect function to find the persistent superblocks.
Unfortunately, the partition check on the SCSI drives takes place later
than the md autodetect code. With IDE drives, the partition check takes
place immediately after the drives are detected, and everything works
fine. Not with SCSI: first the SCSI driver gets loaded, then md runs its
autodetect, and only much later, the first time a partition is
accessed, do the partitions get detected and listed. That is much too late.

We find this behaviour quite strange. We can reproduce it even when
loading the SCSI drivers as modules.
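
To make the ordering problem easy to spot in a saved boot log, here is a
minimal sketch in Python. It only compares the position of the md
autodetect line with the first SCSI partition listing. The function name
and the two regular expressions are our own assumptions, modelled on the
log excerpts in this post; other kernels may print different messages.

```python
import re

# Matches both "autodetecting RAID arrays" (2.4.0) and
# "md: Autodetecting RAID arrays." (2.4.17) as seen in the logs in this post.
AUTODETECT = re.compile(r"autodetecting RAID arrays", re.IGNORECASE)

# Matches a partition listing such as " sda: sda1 sda2 ..." or
# " /dev/scsi/host0/bus0/target0/lun0: p1 p2 ..." (assumed formats).
SCSI_PARTITIONS = re.compile(r"^\s*(sd[a-z]+|/dev/scsi/\S+):\s+\S+")


def check_boot_order(lines):
    """Report whether md autodetect ran before the SCSI partition check."""
    autodetect_at = None
    partitions_at = None
    for i, line in enumerate(lines):
        if autodetect_at is None and AUTODETECT.search(line):
            autodetect_at = i
        if partitions_at is None and SCSI_PARTITIONS.match(line):
            partitions_at = i
    if autodetect_at is None:
        return "no md autodetect found"
    if partitions_at is None or partitions_at > autodetect_at:
        return "autodetect ran before SCSI partition check"
    return "ok"
```

Usage: read the saved log with open(path).read().splitlines() and pass
the resulting list to check_boot_order().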

This shows how it should work: (kernel 2.4.0)
[...]
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 16
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt82c686a IDE UDMA66 controller on pci0:7.1
    ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:pio
hda: 50X CD-ROM, ATAPI CD/DVD-ROM drive
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 40X CD-ROM drive, 128kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12
FDC 0 is a post-1991 82077
LVM version 0.9  by Heinz Mauelshagen  (13/11/2000)
lvm -- Driver successfully initialized
Serial driver version 5.02 (2000-08-09) with HUB-6 MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10d
SCSI subsystem driver Revision: 1.00
PCI: Found IRQ 11 for device 00:0a.0
i91u: PCI Base=0xDC00, IRQ=11, BIOS=0xCC000, SCSI ID=7
i91u: Reset SCSI Bus ...
scsi0 : Initio INI-9X00U/UW SCSI device driver; Revision: 1.03g
  Vendor: QUANTUM   Model: ATLAS_V_18_WLS    Rev: 0230
  Type:   Direct-Access                      ANSI SCSI revision: 03
  Vendor: QUANTUM   Model: ATLAS_V_18_WLS    Rev: 0230
  Type:   Direct-Access                      ANSI SCSI revision: 03
  Vendor: HP        Model: HP35480A          Rev: T603
  Type:   Sequential-Access                  ANSI SCSI revision: 02
  Vendor: YAMAHA    Model: CDR100            Rev: 1.11
  Type:   WORM                               ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
SCSI device sda: 35861388 512-byte hdwr sectors (18361 MB)
Partition check:
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 >
SCSI device sdb: 35861388 512-byte hdwr sectors (18361 MB)
 sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 >
Detected scsi CD-ROM sr0 at scsi0, channel 0, id 5, lun 0
sr-1070120960: scsi-1 drive
linear personality registered
raid0 personality registered
raid1 personality registered
raid5 personality registered
raid5: measuring checksumming speed
   8regs     :  1378.400 MB/sec
   32regs    :  1190.400 MB/sec
   pII_mmx   :  2108.800 MB/sec
   p5_mmx    :  2697.600 MB/sec
raid5: using function: p5_mmx (2697.600 MB/sec)
md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID arrays
(read) sda1's sb offset: 24000 [events: 00000041]
(read) sda3's sb offset: 530048 [events: 0000004b]
(read) sda5's sb offset: 2048192 [events: 0000004b]
(read) sda6's sb offset: 10241280 [events: 0000004b]
(read) sda7's sb offset: 4819392 [events: 0000004b]
(read) sdb1's sb offset: 24000 [events: 00000041]
(read) sdb3's sb offset: 530048 [events: 0000004b]
(read) sdb5's sb offset: 2048192 [events: 0000004b]
(read) sdb6's sb offset: 10241280 [events: 0000004b]
(read) sdb7's sb offset: 4819392 [events: 0000004b]
autorun ...
considering sdb7 ...
  adding sdb7 ...
  adding sda7 ...
created md3
bind<sda7,1>
bind<sdb7,2>
running: <sdb7><sda7>
now!
sdb7's event counter: 0000004b
sda7's event counter: 0000004b
md3: max total readahead window set to 124k
md3: 1 data-disks, max readahead per data-disk: 124k
raid1: device sdb7 operational as mirror 0
raid1: device sda7 operational as mirror 1
(checking disk 0)
[...]
raid1: raid set md4 active with 2 out of 2 mirrors
md: updating md4 RAID superblock on device
sdb1 [events: 00000042](write) sdb1's sb offset: 24000
sda1 [events: 00000042](write) sda1's sb offset: 24000
.
... autorun DONE.

This is how it does not work (kernel 2.4.17):
[...]
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX3: IDE controller on PCI bus 00 dev 09
PIIX3: chipset revision 0
PIIX3: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xe800-0xe807, BIOS settings: hda:pio, hdb:pio
hda: Maxtor 90871U2, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 16992864 sectors (8700 MB) w/512KiB Cache, CHS=1057/255/63, (U)DMA
Partition check:
 /dev/ide/host0/bus0/target0/lun0: p1 p2 p3 p4 < p5 p6 >
loop: loaded (max 8 devices)
SCSI subsystem driver Revision: 1.00
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.4
        <Adaptec 2930 Ultra2 SCSI adapter>
        aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

  Vendor: UNISYS    Model: 003665MAB3045SP   Rev: 0608
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: IBM       Model: DDRS-34560W       Rev: S97B
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: SEAGATE   Model: ST34520N          Rev: 1206
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: IBM       Model: DCAS-34330        Rev: S65A
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: ARCHIVE   Model: Python 04106-XXX  Rev: 727A
  Type:   Sequential-Access                  ANSI SCSI revision: 02
scsi0:A:0:0: Tagged Queuing enabled.  Depth 253
scsi0:A:1:0: Tagged Queuing enabled.  Depth 253
scsi0:A:2:0: Tagged Queuing enabled.  Depth 253
scsi0:A:3:0: Tagged Queuing enabled.  Depth 253
md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
   8regs     :   330.000 MB/sec
   32regs    :   236.000 MB/sec
raid5: using function: 8regs (330.000 MB/sec)
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
LVM version 1.0.1-rc4(ish)(03/10/2001)
[...]

After running a manual fdisk -lu or similar:

Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
Attached scsi disk sdc at scsi0, channel 0, id 2, lun 0
Attached scsi disk sdd at scsi0, channel 0, id 3, lun 0
(scsi0:A:0): 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
SCSI device sda: 8895370 512-byte hdwr sectors (4554 MB)
 /dev/scsi/host0/bus0/target0/lun0: p1 p2 p3 p4 < p5 p6 >
(scsi0:A:1): 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
SCSI device sdb: 8925000 512-byte hdwr sectors (4570 MB)
 /dev/scsi/host0/bus0/target1/lun0: p1 p2 p3 p4 < p5 p6 >
(scsi0:A:2): 20.000MB/s transfers (20.000MHz, offset 15)
SCSI device sdc: 8888924 512-byte hdwr sectors (4551 MB)
 /dev/scsi/host0/bus0/target2/lun0: p1 p2 p3 p4 < p5 p6 >
(scsi0:A:3): 20.000MB/s transfers (20.000MHz, offset 15)
SCSI device sdd: 8467200 512-byte hdwr sectors (4335 MB)
 /dev/scsi/host0/bus0/target3/lun0: p1 p2 p3 p4 < p5 >
[...]

IMHO at least the md autodetect run should trigger a partition check on
all scsi drivers.
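
For illustration only, this is roughly what asking for a partition check
looks like from user space: a sketch in Python that issues the BLKRRPART
ioctl (re-read partition table) on a disk device. The helper name is
hypothetical, it needs root, and we have not verified that it cures the
ordering problem described above. It is not what the md code does.

```python
import fcntl
import os

# BLKRRPART is _IO(0x12, 95) in <linux/fs.h>: "re-read partition table".
BLKRRPART = 0x125F


def reread_partition_table(device_path):
    """Ask the kernel to re-read the partition table of a whole-disk device.

    Returns True if the ioctl succeeded and False otherwise (for example
    if the path is not a block device, is busy, or we lack permission).
    """
    try:
        fd = os.open(device_path, os.O_RDONLY)
    except OSError:
        return False
    try:
        fcntl.ioctl(fd, BLKRRPART)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

Usage would be something like reread_partition_table("/dev/sda") for
each disk, run before trying to start the arrays by hand.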

Can you give us a hint whether this is broken in the recent kernel or
whether we made a mistake? There are tons of people out there using md
devices, many of them with SCSI drives. The RAID partitions all get
recognized perfectly by the md driver _after_ a partition check has
taken place.

Thanks for your help

Ray & Mark