2.5.41-mm1 panics on boot, 2.5.41-vanilla OK

Post by Miquel van Smoorenburg » Thu, 10 Oct 2002 19:40:05



As per subject: 2.5.41-mm1 panics on boot, 2.5.41-vanilla OK.

-mm1 panics at the point where -vanilla says "Initializing RT netlink socket".

This is an SMP machine with highmem turned on and a serial console.
.config attached below.

Linux version 2.5.41 (root@wormhole) (gcc version 2.95.4 20011006 (Debian prerelease)) #5 SMP Wed Oct 9 17:48:04 CEST 2002
[..]
128MB HIGHMEM available.
896MB LOWMEM available.
[..]
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c0132ca4
*pde = 00000000
Oops: 0000

CPU:    1
EIP:    0060:[<c0132ca4>]    Not tainted
EFLAGS: 00010002
EIP is at kmem_cache_alloc+0x18/0x48
eax: 00000004   ebx: 00000246   ecx: c02cfdc0   edx: 00000000
esi: 00000138   edi: 00000000   ebp: 00000000   esp: f7fa5f80
ds: 0068   es: 0068   ss: 0068
Process swapper (pid: 1, threadinfo=f7fa4000 task=f7fc7020)
Stack: f7fa4000 c0131bf8 c02cfdc0 000001d0 f7fa4000 00000000 00000000 00000000
       f7fa4000 ffffe1e5 c0118a4c c040e81b 00000246 c02ce540 0000003b c0324d0a
       c02ad131 00000138 00000000 00002000 00000000 00000000 c0324cb2 c0310862
Call Trace:
 [<c0131bf8>] kmem_cache_create+0x6c/0x5c4
 [<c0118a4c>] release_console_sem+0xa4/0xdc
 [<c01050ab>] init+0x47/0x1ac
 [<c0105064>] init+0x0/0x1ac
 [<c01054ed>] kernel_thread_helper+0x5/0xc

Code: 8b 02 85 c0 74 16 c7 42 0c 01 00 00 00 48 89 02 8b 44 82 10
 <0>Kernel panic: Attempted to kill init!
 <0>Rebooting in 30 seconds..

Decoded:

Code: 8b 02 85 c0 74 16 c7 42 0c 01 00 00 00 48 89 02 8b 44 82 10

>>EIP; c0132ca4 <kmalloc+10c/1c4>   <=====
>>ecx; c02cfdc0 <root_user+0/18>
>>esp; f7fa5f80 <END_OF_CODE+37b2b138/????>

Code;  c0132ca4 <kmalloc+10c/1c4>
00000000 <_EIP>:
Code;  c0132ca4 <kmalloc+10c/1c4>   <=====
   0:   8b 02                     mov    (%edx),%eax   <=====
Code;  c0132ca6 <kmalloc+10e/1c4>
   2:   85 c0                     test   %eax,%eax
Code;  c0132ca8 <kmalloc+110/1c4>
   4:   74 16                     je     1c <_EIP+0x1c> c0132cc0 <kmalloc+128/1c4>
Code;  c0132caa <kmalloc+112/1c4>
   6:   c7 42 0c 01 00 00 00      movl   $0x1,0xc(%edx)
Code;  c0132cb1 <kmalloc+119/1c4>
   d:   48                        dec    %eax
Code;  c0132cb2 <kmalloc+11a/1c4>
   e:   89 02                     mov    %eax,(%edx)
Code;  c0132cb4 <kmalloc+11c/1c4>
  10:   8b 44 82 10               mov    0x10(%edx,%eax,4),%eax

 <0>Kernel panic: Attempted to kill init!
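
Read as C, the decoded instructions look like a "take the last cached entry" fast path. The register dump shows edx: 00000000, so the very first load from (%edx) is the access that faults at virtual address 00000000. The sketch below is only a userspace model inferred from the offsets in the disassembly above; the struct name, field names and function name are hypothetical and are not taken from mm/slab.c, and the "nothing cached" branch is a placeholder because the decode does not show where the je leads.

```c
#include <stdint.h>

/* Hypothetical layout inferred from the offsets in the disassembly:
 * a counter at 0x0, a flag at 0xc, 32-bit entries starting at 0x10. */
struct array_cache_model {
    uint32_t avail;      /* offset 0x0: tested, then decremented */
    uint32_t unused[2];  /* offsets 0x4 and 0x8: not touched by this snippet */
    uint32_t touched;    /* offset 0xc: set to 1 */
    uint32_t entry[8];   /* offset 0x10: indexed by the decremented counter */
};

/* Mirrors the decoded instructions one by one.
 * Returns 0 when nothing is cached (placeholder for the unseen branch). */
static uint32_t pop_cached(struct array_cache_model *ac)
{
    uint32_t n = ac->avail;   /* mov (%edx),%eax   <- faults if ac is NULL */
    if (n == 0)               /* test %eax,%eax ; je ... */
        return 0;
    ac->touched = 1;          /* movl $0x1,0xc(%edx) */
    n--;                      /* dec %eax */
    ac->avail = n;            /* mov %eax,(%edx) */
    return ac->entry[n];      /* mov 0x10(%edx,%eax,4),%eax */
}
```

In this model the function never checks its argument, which matches the trace: a NULL pointer in edx oopses on the first instruction rather than further in.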

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# General setup
#
CONFIG_NET=y
CONFIG_SYSVIPC=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
CONFIG_M686=y
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MELAN is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_PPRO_FENCE=y
# CONFIG_HUGETLB_PAGE is not set
CONFIG_SMP=y
CONFIG_PREEMPT=y
# CONFIG_X86_NUMA is not set
# CONFIG_X86_MCE is not set
# CONFIG_X86_MCE_NONFATAL is not set
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_CPU_FREQ is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y

#
# Power management options (ACPI, APM)
#

#
# ACPI Support
#
CONFIG_ACPI=y
# CONFIG_ACPI_HT_ONLY is not set
CONFIG_ACPI_BOOT=y
# CONFIG_ACPI_SLEEP is not set
# CONFIG_ACPI_AC is not set
# CONFIG_ACPI_BATTERY is not set
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_TOSHIBA is not set
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_BUS=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
# CONFIG_ACPI_SLEEP is not set
CONFIG_ACPI_SYSTEM=y
# CONFIG_PM is not set
# CONFIG_APM is not set

#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_SCx200 is not set
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
CONFIG_HOTPLUG=y

#
# PCMCIA/CardBus support
#
# CONFIG_PCMCIA is not set

#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set
# CONFIG_HOTPLUG_PCI_COMPAQ is not set
# CONFIG_HOTPLUG_PCI_COMPAQ_NVRAM is not set
# CONFIG_HOTPLUG_PCI_IBM is not set

#
# Executable file formats
#
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
# CONFIG_PARPORT is not set

#
# Plug and Play configuration
#
# CONFIG_PNP is not set
# CONFIG_ISAPNP is not set
# CONFIG_PNPBIOS is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_CISS_SCSI_TAPE is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
# CONFIG_BLK_DEV_NBD is not set
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_SIZE=4096
# CONFIG_BLK_DEV_INITRD is not set

#
# ATA/ATAPI/MFM/RLL device support
#
CONFIG_IDE=y

#
# IDE, ATA and ATAPI Block devices
#
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_IDEDISK_STROKE is not set
# CONFIG_BLK_DEV_IDECS is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set

#
# IDE chipset support/bugfixes
#
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_ISAPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_OFFBOARD is not set
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_PCI_WIP is not set
# CONFIG_IDEDMA_NEW_DRIVE_LISTINGS is not set
CONFIG_BLK_DEV_ADMA=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_WDC_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_AMD74XX_OVERRIDE is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_HPT34X_AUTODMA is not set
# CONFIG_BLK_DEV_HPT366 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NFORCE is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_PDC202XX_BURST is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_PDC202XX_FORCE is not set
# CONFIG_BLK_DEV_RZ1000 is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_IDEDMA_IVB is not set
# CONFIG_DMA_NONPCI is not set
CONFIG_BLK_DEV_IDE_MODES=y

#
# SCSI device support
#
CONFIG_SCSI=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
CONFIG_SD_EXTRA_DEVS=40
CONFIG_CHR_DEV_ST=m
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_SR_EXTRA_DEVS=2
CONFIG_CHR_DEV_SG=m

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_REPORT_LUNS=y
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set

#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_MEGARAID is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_DMA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_NCR53C7xx is not set
CONFIG_SCSI_SYM53C8XX_2=y
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=4
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=16
# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PCI2000 is not set
# CONFIG_SCSI_PCI2220I is not set
# ...


 
 
 

Post by Andrew Morton » Thu, 10 Oct 2002 20:40:06



> As per subject: 2.5.41-mm1 panics on boot, 2.5.41-vanilla OK.

> -mm1 panics at the point where -vanilla says "Initializing RT netlink socket".

Does this fix it?

--- 2.5.41/mm/slab.c~slab-split-10-list_for_each_fix    Tue Oct  8 15:40:52 2002
+++ 2.5.41-akpm/mm/slab.c  Tue Oct  8 15:40:52 2002

 static struct semaphore        cache_chain_sem;
 static rwlock_t cache_chain_lock = RW_LOCK_UNLOCKED;

-#define cache_chain (cache_cache.next)
+struct list_head cache_chain;

 /*

        init_MUTEX(&cache_chain_sem);
        INIT_LIST_HEAD(&cache_chain);
+       list_add(&cache_cache.next, &cache_chain);

        cache_estimate(0, cache_cache.objsize, 0,

        down(&cache_chain_sem);
        if (!n)
                return (void *)1;
-       p = &cache_cache.next;
+       p = cache_chain.next;
        while (--n) {
                p = p->next;
-               if (p == &cache_cache.next)
+               if (p == &cache_chain)
                        return NULL;
        }

        kmem_cache_t *cachep = p;
        ++*pos;
        if (p == (void *)1)
-               return &cache_cache;
-       cachep = list_entry(cachep->next.next, kmem_cache_t, next);
-       return cachep == &cache_cache ? NULL : cachep;
+               return list_entry(cache_chain.next, kmem_cache_t, next);
+       return cachep->next.next == &cache_chain ? NULL
+               : list_entry(cachep->next.next, kmem_cache_t, next);
 }

 static void s_stop(struct seq_file *m, void *p)
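
The patch stops treating the link embedded in cache_cache as the list head and gives the chain a standalone head, with cache_cache added as an ordinary element. A minimal userspace sketch of why that matters for a walk that starts at head->next and stops when it is back at the head: the helpers below are simplified stand-ins written for this example, not the kernel's list.h, and struct cache and the function names are hypothetical.

```c
/* Simplified circular doubly-linked list (assumption: stand-in helpers). */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert 'entry' right after 'head'. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical cache descriptor; only the embedded link matters here. */
struct cache { const char *name; struct list_head next; };

/* Count the nodes visited by a walk that skips the head itself. */
static int count_visited(struct list_head *head)
{
    int n = 0;
    for (struct list_head *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Old layout: the first cache's own link doubles as the list head. */
static int visited_with_embedded_head(void)
{
    struct cache first = { .name = "cache_cache" };
    struct cache a = { .name = "a" }, b = { .name = "b" };

    init_list_head(&first.next);
    list_add(&a.next, &first.next);
    list_add(&b.next, &first.next);
    return count_visited(&first.next);   /* 'first' is never visited */
}

/* Patched layout: standalone head, first cache is a normal element. */
static int visited_with_separate_head(void)
{
    struct list_head chain;
    struct cache first = { .name = "cache_cache" };
    struct cache a = { .name = "a" }, b = { .name = "b" };

    init_list_head(&chain);
    list_add(&first.next, &chain);
    list_add(&a.next, &chain);
    list_add(&b.next, &chain);
    return count_visited(&chain);        /* all three are visited */
}
```

With three caches, the embedded-head walk sees two of them and the standalone-head walk sees all three. Whether that skipped element is what triggered the oops reported here is not established by the sketch.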

.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

Post by Miquel van Smoorenburg » Thu, 10 Oct 2002 22:50:15


According to Andrew Morton:


> > As per subject: 2.5.41-mm1 panics on boot, 2.5.41-vanilla OK.

> Does this fix it?

> --- 2.5.41/mm/slab.c~slab-split-10-list_for_each_fix       Tue Oct  8 15:40:52 2002
> +++ 2.5.41-akpm/mm/slab.c  Tue Oct  8 15:40:52 2002

Yes, it does fix it. I still get quite a lot of
"Debug: sleeping function called from illegal context" and
"bad: scheduling while atomic!" while booting, but after booting
it looks stable (well, only 8 minutes of uptime). You probably
know about those already so I'll not bore you with the
bootup log messages.

I'm now running 2.5.41-mm1 + the above patch + the raid0 patch +
the mremap fix (CONFIG_HIGHPTE -> CONFIG_HIGHMEM) on our news
peering server.

It's looking better swap-wise than 2.5.40 - it's only 116K
into swap, looks like stuff that should remain in memory
is staying there.

I'll let you know if tonight's expire finishes in 15 minutes
instead of 15 hours ...

Mike.

 
 
 

Post by Miquel van Smoorenburg » Fri, 11 Oct 2002 13:00:07


According to Miquel van Smoorenburg:

> According to Andrew Morton:
> > Does this fix it?

> > --- 2.5.41/mm/slab.c~slab-split-10-list_for_each_fix  Tue Oct  8 15:40:52 2002
> > +++ 2.5.41-akpm/mm/slab.c     Tue Oct  8 15:40:52 2002

> Yes, it does fix it.

> I'll let you know if tonight's expire finishes in 15 minutes
> instead of 15 hours ...

Right, last night the server crashed when running 'expire' (that's
the news server's database update/purge) without anything in the
logs.

The run before that had significantly lower throughput
than 2.4.19 does.

I'd love to tinker with this some more, reproduce the crash,
fine-tune the throughput, but it is a production server and
I can't keep on letting it crash during the night.

Maybe I'll try once more next week, with a telnet to the
console server in a 'screen' session so I can capture the
panic. Right now I have to lie low for a while, hiding
from my colleagues ;)

Mike.

 
 
 

Post by Danny ter Ha » Fri, 11 Oct 2002 13:10:07



> Right now I have to lie low for a while, hiding from my colleagues ;)

Unless you were in another dimension this morning, we (I) encouraged
you to do some more testing... ;-)

Danny
