NFS regularly panics the kernel

Post by Chris Coulson » Sat, 17 Mar 2001 12:46:40



I'm using NFS to share four directories (home directories, common
applications, etc.).

Every so often we get a kernel panic that seems related to one of two
events:
  1) Copying a file larger than 100 MB to a shared directory
  2) Heavy use of a shared directory, e.g. installing a large
rpm/application, compiling a lot of code in a shared directory, or
using a shared directory for an application cache (Mozilla, Netscape).

My installation is very simple: Red Hat 7, straight out of the box, on
both the NFS server and the clients.

Following is everything I thought would be immediately useful: basic
info, an example panic, some example errors we regularly get in the
server message log, and some extracts from the message log about the
server (HD, CPU, etc.).

I've been trying to work this out since Christmas without success, so
I'm pretty frustrated.

As an aside, should I just give up on Linux as an NFS file server and use
Solaris?

Big thanks in advance,

Chris.

### Basic info & process listings ###
Redhat V7
kernel 2.2.16-22

root       969     1  0 12:18 ?        00:00:15 [nfsd]
root       970     1  0 12:18 ?        00:00:15 [nfsd]
root       971     1  0 12:18 ?        00:00:16 [nfsd]
root       972     1  0 12:18 ?        00:00:15 [nfsd]
root       973     1  0 12:18 ?        00:00:15 [nfsd]
root       974     1  0 12:18 ?        00:00:15 [nfsd]
root       975     1  0 12:18 ?        00:00:15 [nfsd]
root       976     1  0 12:18 ?        00:00:15 [nfsd]
root      1092     1  0 13:01 ?        00:00:00 rpc.mountd --no-nfs-version 3
root       400     1  0 12:18 ?        00:00:00 [lockd]

### Panic ###
Mar 16 11:45:08 plato kernel: Unable to handle kernel paging request at virtual address 33373540
Mar 16 11:45:08 plato kernel: current->tss.cr3 = 03a91000, %cr3 = 03a91000
Mar 16 11:45:08 plato kernel: *pde = 00000000
Mar 16 11:45:08 plato kernel: Oops: 0000
Mar 16 11:45:08 plato kernel: CPU:    0
Mar 16 11:45:08 plato kernel: EIP:    0010:[free_wait+46/108]
Mar 16 11:45:08 plato kernel: EFLAGS: 00010087
Mar 16 11:45:08 plato kernel: eax: 00000040   ebx: caf01410   ecx: 3337353c   edx: 3337353c
Mar 16 11:45:08 plato kernel: esi: caf0140c   edi: caf01000   ebp: 00000287   esp: cbc0ff28
Mar 16 11:45:08 plato kernel: ds: 0018   es: 0018   ss: 0018
Mar 16 11:45:08 plato kernel: Process jre (pid: 2736, process nr: 73, stackpage=cbc0f000)
Mar 16 11:45:08 plato kernel: Stack: caf01000 00000000 c012f289 caf01000 d8f64494 00000011 d8f64498 00000030
Mar 16 11:45:08 plato kernel:        00000104 00000011 cbc0e000 00000000 00000000 caf01000 c012f696 00000011
Mar 16 11:45:08 plato kernel:        cbc0ffa8 cbc0ffa4 cbc0e000 00000000 bf1ff774 bf1ff97c d8f6448c db2e54cc
Mar 16 11:45:08 plato kernel: Call Trace: [do_select+493/516] [sys_select+1014/1360] [system_call+52/56]
Mar 16 11:45:08 plato kernel: Code: 8b 41 04 39 d8 74 0c 8d 76 00 89 c2 8b 42 04 39 d8 75 f7 89
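
One thing that can be checked directly from the dump above: the faulting address is ecx plus 4, which fits the first Code: bytes (8b 41 04) if they are read as a load from ecx+4, and the bytes of ecx happen to be printable ASCII. The following is a minimal Python sketch of that arithmetic only. It assumes the Code: line starts at the faulting instruction, and the idea that a pointer was overwritten by text data is a guess, not a diagnosis.

```python
import struct

# Values copied from the oops above.
ecx = 0x3337353C            # ecx (and edx) from the register dump
fault_address = 0x33373540  # "virtual address" in the first oops line


def as_ascii(value):
    """Return the four little-endian bytes of a 32-bit value as text."""
    return struct.pack("<I", value).decode("ascii", errors="replace")


# Distance between the faulting address and ecx.
offset = fault_address - ecx

print("fault address - ecx =", offset)
print("ecx bytes as ASCII  =", repr(as_ascii(ecx)))
```

If the pointer really does hold text, that would point towards memory corruption rather than a logic error in select(), but the dump alone cannot confirm it.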

### Regular errors ###
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 13
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 19
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 13
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 19
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 13
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 19
Mar 16 11:30:00 plato CROND[2870]: (root) CMD (   /sbin/rmmod -as)
Mar 16 11:35:53 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:35:53 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
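
For anyone wanting to tally these warnings over a longer stretch of the message log, here is a minimal Python sketch. The regular expression is an assumption derived only from the lines quoted above, and `count_warnings` is a made-up helper name. It tolerates a line break after the device name, since mail clients tend to wrap these entries.

```python
import re
from collections import Counter

# Assumed pattern, based only on the warnings quoted above. \s+ matches a
# space or a newline, so wrapped and unwrapped entries are both counted.
WARNING = re.compile(
    r"EXT2-fs warning \(device (?P<dev>\S+?)\):\s+"
    r"ext2_free_inode: bit already cleared for inode (?P<inode>\d+)"
)


def count_warnings(log_text):
    """Count 'bit already cleared' warnings per (device, inode) pair."""
    counts = Counter()
    for match in WARNING.finditer(log_text):
        counts[(match.group("dev"), int(match.group("inode")))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_warnings.py < saved-message-log
    for (dev, inode), n in sorted(count_warnings(sys.stdin.read()).items()):
        print(f"{dev} inode {inode}: {n}")
```

Seeing the same few inodes come back every couple of minutes, as in the extract above, is easier to spot in a per-inode count than in the raw log.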

### Other info ###
Mar 16 11:54:24 plato kernel: Detected 1001775 kHz processor.
Mar 16 11:54:24 plato kernel: Memory: 517092k/524224k available (1048k kernel code, 408k reserved, 5612k data, 64k init, 0k bigmem)
Mar 16 11:54:24 plato kernel: CPU: L1 I Cache: 64K  L1 D Cache: 64K
Mar 16 11:54:24 plato kernel: CPU: L2 Cache: 256K
Mar 16 11:54:24 plato kernel: CPU: AMD AMD Athlon(tm) Processor stepping 02
Mar 16 11:54:24 plato kernel: Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Mar 16 11:54:24 plato kernel: Checking 'hlt' instruction... OK.
Mar 16 11:54:24 plato kernel: hda: IBM-DTLA-307030, ATA DISK drive
Mar 16 11:54:24 plato kernel: hda: IBM-DTLA-307030, 29314MB w/1916kB Cache, CHS=3737/255/63
Mar 16 11:54:36 plato kernel: Installing knfsd (copyright (C) 1996

Mar 16 11:54:36 plato nfs: Starting NFS services:  succeeded
Mar 16 11:54:37 plato nfs: rpc.rquotad startup succeeded
Mar 16 11:54:37 plato nfs: rpc.mountd startup succeeded
Mar 16 11:54:37 plato nfs: rpc.nfsd startup succeeded
Mar 16 12:01:00 plato rpc.mountd: Caught signal 15, un-registering and exiting.
Mar 16 12:01:00 plato nfs: rpc.mountd shutdown succeeded
Mar 16 12:01:00 plato nfs: rpc.mountd startup succeeded

--
Chris Coulson

 
 
 

1. FreeBSD 5.0-p6 recently started crashing regularly (panic: kmem_malloc(4096): kmem_map too small:)

Hi,

I've been running a 5.0-RELEASE box since January and have kept it patched
up to -p6.  I just recently started having problems with the system
crashing.

It started this past Saturday...

The server has crashed at the following times:





Nothing was being captured in the logs, so on June 11th I stayed up and
watched the server carefully. Sure enough, it crashed at 3:06 am, and I
managed to catch this on the console before the system went down:

panic: kmem_malloc(4096): kmem_map too small: 275378176 total allocated
cpuid = 3; lapic.id = 07000000
boot() called on cpu#3
syncing disks, buffers remaining... panic: bwrite: buffer is not busy???
cpuid = 3; lapic.id = 07000000
boot() called on cpu#3
Uptime: 23h41m49s
Terminate ACPI
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#3
cpu_reset: Stopping other CPUs
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 3

I haven't seen the above before and don't have the slightest idea why it
just started. At the time, the system was running "periodic daily" and
processing some mail in the queue (sendmail).  It was also serving a couple
of HTTP requests that accessed the local MySQL database, and a MySQL slave
server was replicating that data. This setup has been running since January,
and the tasks running at the time of the latest crash are normal for this
time of night.

Here are the hardware specs:
http://www.supermicro.com/PRODUCT/SUPERServer/SuperServer6012P-6.htm
Dual 2.2 GHz Xeon with 4 GB of RAM

Let me know if I can provide any further data to help resolve this
problem... I suspect the system will "crash" again in 24 hours.

Thanks,
Stephane.
