We've got a few Dell PowerEdge 2650 machines that we thought would
make nice file servers, so we installed Red Hat Linux 7.3 on them.
The installation itself went fine, but it has been pretty much
downhill from there. With Red Hat's 2.4.18-3 and 2.4.18-5 kernels we
detect all disks connected through our QLogic FC 2200 HBAs; with
vanilla 2.4.18 and 2.4.19-rc2 we detect nothing. We've tried
QLogic's 6.0beta13 and 6.1beta2 drivers, as well as the driver
that comes with Red Hat's release. We're already running an almost
identical configuration; the only differences there are one HBA per
server, and the servers are 2550s rather than 2650s.
OK, to sum up the problems:
With Red Hat kernels:
* Disks are found, _but_ after about 2-3 minutes of heavy I/O on the
  FC HBAs the machine dies, and the only thing that works is a cold boot.
With vanilla kernels:
* Disks are not found, so we don't know about I/O problems.
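To make the "found" versus "not found" comparison concrete, here is a
minimal sketch of how one could list what a given kernel has actually
registered, assuming the usual /proc/scsi/scsi layout of the 2.4
kernels (lines of the form "Host: scsiN Channel: .. Id: .. Lun: ..").
The function names are made up for illustration and are not part of
any driver or tool:

```python
import re

# Matches the "Host: scsiN Channel: CC Id: II Lun: LL" lines that
# /proc/scsi/scsi prints once per attached device.
HOST_LINE = re.compile(
    r"^Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)"
)


def parse_scsi_devices(text):
    """Return a list of (host, channel, id, lun) tuples found in text."""
    devices = []
    for line in text.splitlines():
        match = HOST_LINE.match(line.strip())
        if match:
            host, channel, target, lun = match.groups()
            devices.append((host, int(channel), int(target), int(lun)))
    return devices


def read_scsi_devices(path="/proc/scsi/scsi"):
    """Parse the device list from the given file (hypothetical helper)."""
    with open(path) as handle:
        return parse_scsi_devices(handle.read())
```

Running read_scsi_devices() under each kernel and comparing the two
lists would show whether the vanilla kernels miss only the FC disks
or every device behind the HBAs.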
Anyone have any ideas?
Here's dmesg and lspci -vvvxx. If anything else is needed, please
tell me and I'll provide the info:
test4:~# dmesg
idx=8 mapped at ffff6000
ACPI table found: APIC v1 [DELL PE2650 0.1]
__va_range(0xfdd18, 0x88): idx=8 mapped at ffff6000
LAPIC (acpi_id[0x0001] id[0x0] enabled[1])
CPU 0 (0x0000) enabledProcessor #0 Unknown CPU [15:2] APIC version 16
LAPIC (acpi_id[0x0002] id[0x2] enabled[1])
CPU 1 (0x0200) enabledProcessor #2 Unknown CPU [15:2] APIC version 16
LAPIC (acpi_id[0x0003] id[0x1] enabled[1])
CPU 2 (0x0100) enabledProcessor #1 Unknown CPU [15:2] APIC version 16
LAPIC (acpi_id[0x0004] id[0x3] enabled[1])
CPU 3 (0x0300) enabledProcessor #3 Unknown CPU [15:2] APIC version 16
IOAPIC (id[0x4] address[0xfec00000] global_irq_base[0x0])
IOAPIC (id[0x5] address[0xfec01000] global_irq_base[0x10])
IOAPIC (id[0x6] address[0xfec02000] global_irq_base[0x20])
LAPIC_NMI (acpi_id[0x0001] polarity[0x1] trigger[0x1] lint[0x1])
LAPIC_NMI (acpi_id[0x0002] polarity[0x1] trigger[0x1] lint[0x1])
LAPIC_NMI (acpi_id[0x0003] polarity[0x1] trigger[0x1] lint[0x1])
LAPIC_NMI (acpi_id[0x0004] polarity[0x1] trigger[0x1] lint[0x1])
4 CPUs total
Local APIC address fee00000
__va_range(0xfdda0, 0x24): idx=8 mapped at ffff6000
__va_range(0xfdda0, 0x50): idx=8 mapped at ffff6000
ACPI table found: SPCR v1 [DELL PE2650 0.1]
Enabling the CPU's according to the ACPI table
Intel MultiProcessor Specification v1.4
Virtual Wire compatibility mode.
OEM ID: DELL Product ID: PE 0121 APIC at: 0xFEE00000
I/O APIC #4 Version 17 at 0xFEC00000.
I/O APIC #5 Version 17 at 0xFEC01000.
I/O APIC #6 Version 17 at 0xFEC02000.
Processors: 4
Kernel command line: auto BOOT_IMAGE=linux ro root=802 BOOT_FILE=/boot/bzImage-2.4.18-3 max_scsi_luns=128
Initializing CPU#0
Detected 1794.244 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 3578.26 BogoMIPS
Memory: 2065004k/2097088k available (1578k kernel code, 31700k reserved, 469k data, 220k init, 1179584k highmem)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode cache hash table entries: 131072 (order: 8, 1048576 bytes)
Mount-cache hash table entries: 32768 (order: 6, 262144 bytes)
Buffer cache hash table entries: 131072 (order: 7, 524288 bytes)
Page-cache hash table entries: 524288 (order: 9, 2097152 bytes)
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After vendor init, caps: 3febfbff 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After generic, caps: 3febfbff 00000000 00000000 00000000
CPU: Common caps: 3febfbff 00000000 00000000 00000000
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch (rgo...@atnf.csiro.au)
mtrr: detected mtrr type: Intel
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After vendor init, caps: 3febfbff 00000000 00000000 00000000
Intel machine check reporting enabled on CPU#0.
CPU: After generic, caps: 3febfbff 00000000 00000000 00000000
CPU: Common caps: 3febfbff 00000000 00000000 00000000
CPU0: Intel(R) XEON(TM) CPU 1.80GHz stepping 04
per-CPU timeslice cutoff: 1462.89 usecs.
task migration cache decay timeout: 10 msecs.
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000040
ESR value after enabling vector: 00000000
Booting processor 1/1 eip 2000
Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 3578.26 BogoMIPS
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After vendor init, caps: 3febfbff 00000000 00000000 00000000
Intel machine check reporting enabled on CPU#1.
CPU: After generic, caps: 3febfbff 00000000 00000000 00000000
CPU: Common caps: 3febfbff 00000000 00000000 00000000
CPU1: Intel(R) XEON(TM) CPU 1.80GHz stepping 04
Booting processor 2/2 eip 2000
Initializing CPU#2
masked ExtINT on CPU#2
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 3578.26 BogoMIPS
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 3
CPU: After vendor init, caps: 3febfbff 00000000 00000000 00000000
Intel machine check reporting enabled on CPU#2.
CPU: After generic, caps: 3febfbff 00000000 00000000 00000000
CPU: Common caps: 3febfbff 00000000 00000000 00000000
CPU2: Intel(R) XEON(TM) CPU 1.80GHz stepping 04
Booting processor 3/3 eip 2000
Initializing CPU#3
masked ExtINT on CPU#3
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 3578.26 BogoMIPS
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 3
CPU: After vendor init, caps: 3febfbff 00000000 00000000 00000000
Intel machine check reporting enabled on CPU#3.
CPU: After generic, caps: 3febfbff 00000000 00000000 00000000
CPU: Common caps: 3febfbff 00000000 00000000 00000000
CPU3: Intel(R) XEON(TM) CPU 1.80GHz stepping 04
Total of 4 processors activated (14313.06 BogoMIPS).
cpu_sibling_map[0] = 1
cpu_sibling_map[1] = 0
cpu_sibling_map[2] = 3
cpu_sibling_map[3] = 2
ENABLING IO-APIC IRQs
Setting 4 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 4 ... ok.
Setting 5 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 5 ... ok.
Setting 6 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 6 ... ok.
init IO_APIC IRQs
IO-APIC (apicid-pin) 4-0, 4-7, 4-10, 4-11, 4-13, 6-0, 6-1, 6-2, 6-3, 6-4, 6-5, 6-6, 6-7, 6-8, 6-9, 6-10, 6-11, 6-12, 6-13, 6-14, 6-15 not connected.
..TIMER: vector=0x31 pin1=2 pin2=0
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found pin 0) ...works.
number of MP IRQ sources: 35.
number of IO-APIC #4 registers: 16.
number of IO-APIC #5 registers: 16.
number of IO-APIC #6 registers: 16.
testing the IO APIC.......................
IO APIC #4......
.... register #00: 04000000
....... : physical APIC id: 04
.... register #01: 000F0011
....... : max redirection entries: 000F
....... : PRQ implemented: 0
....... : IO APIC version: 0011
.... register #02: 04000000
....... : arbitration: 04
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 00F 0F 0 0 0 0 0 1 1 31
01 00F 0F 0 0 0 0 0 1 1 39
02 000 00 1 0 0 0 0 0 0 00
03 00F 0F 0 0 0 0 0 1 1 41
04 00F 0F 0 0 0 0 0 1 1 49
05 00F 0F 1 1 0 1 0 1 1 51
06 00F 0F 0 0 0 0 0 1 1 59
07 000 00 1 0 0 0 0 0 0 00
08 00F 0F 0 0 0 0 0 1 1 61
09 00F 0F 0 0 0 0 0 1 1 69
0a 000 00 1 0 0 0 0 0 0 00
0b 000 00 1 0 0 0 0 0 0 00
0c 00F 0F 0 0 0 0 0 1 1 71
0d 000 00 1 0 0 0 0 0 0 00
0e 00F 0F 0 0 0 0 0 1 1 79
0f 00F 0F 0 0 0 0 0 1 1 81
IO APIC #5......
.... register #00: 05000000
....... : physical APIC id: 05
.... register #01: 000F0011
....... : max redirection entries: 000F
....... : PRQ implemented: 0
....... : IO APIC version: 0011
.... register #02: 05000000
....... : arbitration: 05
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 00F 0F 1 1 0 1 0 1 1 89
01 00F 0F 1 1 0 1 0 1 1 91
02 00F 0F 1 1 0 1 0 1 1 99
03 00F 0F 1 1 0 1 0 1 1 A1
04 00F 0F 1 1 0 1 0 1 1 A9
05 00F 0F 1 1 0 1 0 1 1 B1
06 00F 0F 1 1 0 1 0 1 1 B9
07 00F 0F 1 1 0 1 0 1 1 C1
08 00F 0F 1 1 0 1 0 1 1 C9
09 00F 0F 1 1 0 1 0 1 1 D1
0a 00F 0F 1 1 0 1 0 1 1 D9
0b 00F 0F 1 1 0 1 0 1 1 E1
0c 00F 0F 1 1 0 1 0 1 1 E9
0d 00F 0F 1 1 0 1 0 1 1 32
0e 00F 0F 1 1 0 1 0 1 1 3A
0f 00F 0F 1 1 0 1 0 1 1 42
IO APIC #6......
.... register #00: 06000000
....... : physical APIC id: 06
.... register #01: 000F0011
....... : max redirection entries: 000F
....... : PRQ implemented: 0
....... : IO APIC version: 0011
.... register #02: 06000000
.......
...