I'm using NFS to share four directories (home directories, common
applications, etc.).
Every so often we get a kernel panic that seems related to one of two
events:
1) Copying a file larger than 100 MB to a shared directory
2) Heavy use of a shared directory, e.g. installing a large
rpm/application, compiling a lot of code on a shared directory, or
using a shared directory for application cache (Mozilla, Netscape).
My installation is very simple: Red Hat Linux 7, straight out of the
box on both the NFS server and the clients.
Below is everything I thought would be immediately useful: basic
info, an example panic, some errors we regularly get in the server
message log, and some extracts from the message log about the server
hardware (HD, CPU, etc.).
I've been trying to work this out since Christmas without success, so
I'm pretty frustrated.
As an aside, should I just give up on Linux as an NFS file server and
use Solaris?
Big thanks in advance,
Chris.
### Basic info & process listings ###
Red Hat Linux 7
kernel 2.2.16-22
root 969 1 0 12:18 ? 00:00:15 [nfsd]
root 970 1 0 12:18 ? 00:00:15 [nfsd]
root 971 1 0 12:18 ? 00:00:16 [nfsd]
root 972 1 0 12:18 ? 00:00:15 [nfsd]
root 973 1 0 12:18 ? 00:00:15 [nfsd]
root 974 1 0 12:18 ? 00:00:15 [nfsd]
root 975 1 0 12:18 ? 00:00:15 [nfsd]
root 976 1 0 12:18 ? 00:00:15 [nfsd]
root 1092 1 0 13:01 ? 00:00:00 rpc.mountd --no-nfs-version 3
root 400 1 0 12:18 ? 00:00:00 [lockd]
### Panic ###
Mar 16 11:45:08 plato kernel: Unable to handle kernel paging request at virtual address 33373540
Mar 16 11:45:08 plato kernel: current->tss.cr3 = 03a91000, %cr3 = 03a91000
Mar 16 11:45:08 plato kernel: *pde = 00000000
Mar 16 11:45:08 plato kernel: Oops: 0000
Mar 16 11:45:08 plato kernel: CPU: 0
Mar 16 11:45:08 plato kernel: EIP: 0010:[free_wait+46/108]
Mar 16 11:45:08 plato kernel: EFLAGS: 00010087
Mar 16 11:45:08 plato kernel: eax: 00000040 ebx: caf01410 ecx: 3337353c edx: 3337353c
Mar 16 11:45:08 plato kernel: esi: caf0140c edi: caf01000 ebp: 00000287 esp: cbc0ff28
Mar 16 11:45:08 plato kernel: ds: 0018 es: 0018 ss: 0018
Mar 16 11:45:08 plato kernel: Process jre (pid: 2736, process nr: 73, stackpage=cbc0f000)
Mar 16 11:45:08 plato kernel: Stack: caf01000 00000000 c012f289 caf01000 d8f64494 00000011 d8f64498 00000030
Mar 16 11:45:08 plato kernel: 00000104 00000011 cbc0e000 00000000 00000000 caf01000 c012f696 00000011
Mar 16 11:45:08 plato kernel: cbc0ffa8 cbc0ffa4 cbc0e000 00000000 bf1ff774 bf1ff97c d8f6448c db2e54cc
Mar 16 11:45:08 plato kernel: Call Trace: [do_select+493/516] [sys_select+1014/1360] [system_call+52/56]
Mar 16 11:45:08 plato kernel: Code: 8b 41 04 39 d8 74 0c 8d 76 00 89 c2 8b 42 04 39 d8 75 f7 89
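For anyone reading the oops, here is a small sanity check written as a
Python sketch. It assumes the Code: dump starts at the faulting
instruction (my understanding of how 2.2 kernels print it) and that
the first three bytes, 8b 41 04, are a MOV with an 8-bit displacement
off ecx. It only confirms that the faulting address is ecx + 4; it
does not say why ecx holds that value. The variable names are mine.

```python
# Values copied from the oops above.
ecx = 0x3337353C
fault_address = 0x33373540
code = bytes.fromhex("8b4104")  # first three bytes of the Code: line

# 0x8b is MOV r32, r/m32. Split the ModRM byte into its three fields.
modrm = code[1]
mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
disp8 = code[2]  # mod == 1 means an 8-bit displacement follows

# reg 0 is eax and rm 1 is ecx, so this reads 4 bytes at [ecx + disp8].
computed = (ecx + disp8) & 0xFFFFFFFF
print(hex(computed), computed == fault_address)

# The four bytes of ecx, in memory order, interpreted as ASCII.
ecx_as_text = ecx.to_bytes(4, "little").decode("ascii")
print(repr(ecx_as_text))
```

All four bytes of ecx happen to be printable ASCII, which might mean a
pointer was overwritten with text, but that is only a guess on my
part.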
### Regular errors ###
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 13
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
Mar 16 11:25:11 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 19
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 13
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
Mar 16 11:27:25 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 19
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 13
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
Mar 16 11:29:07 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 19
Mar 16 11:30:00 plato CROND[2870]: (root) CMD ( /sbin/rmmod -as)
Mar 16 11:35:53 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 15
Mar 16 11:35:53 plato kernel: EXT2-fs warning (device ide0(3,5)): ext2_free_inode: bit already cleared for inode 17
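To see which inodes keep coming up, the warnings can be tallied with a
few lines of Python. This is only a sketch: the function name and the
sample string are mine, and the regular expression assumes the message
wording shown in the excerpt above. It matches on the message text
alone, so it works whether or not the log lines have been wrapped.

```python
import re
from collections import Counter

# Matches the part of the warning that carries the inode number.
INODE_RE = re.compile(r"ext2_free_inode: bit already cleared for inode (\d+)")

def count_inode_warnings(log_text):
    """Return a Counter mapping inode number -> number of warnings seen."""
    return Counter(int(m.group(1)) for m in INODE_RE.finditer(log_text))

# Two warnings copied from the excerpt above, in their wrapped form.
sample = """\
Mar 16 11:35:53 plato kernel: EXT2-fs warning (device ide0(3,5)):
ext2_free_inode: bit already cleared for inode 15
Mar 16 11:35:53 plato kernel: EXT2-fs warning (device ide0(3,5)):
ext2_free_inode: bit already cleared for inode 17
"""

counts = count_inode_warnings(sample)
print(sorted(counts.items()))
```

Counting the whole excerpt by hand gives inodes 13 and 19 three times
each and inodes 15 and 17 four times each, so it is always the same
four inodes.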
### Other info ###
Mar 16 11:54:24 plato kernel: Detected 1001775 kHz processor.
Mar 16 11:54:24 plato kernel: Memory: 517092k/524224k available (1048k kernel code, 408k reserved, 5612k data, 64k init, 0k bigmem)
Mar 16 11:54:24 plato kernel: CPU: L1 I Cache: 64K L1 D Cache: 64K
Mar 16 11:54:24 plato kernel: CPU: L2 Cache: 256K
Mar 16 11:54:24 plato kernel: CPU: AMD AMD Athlon(tm) Processor stepping 02
Mar 16 11:54:24 plato kernel: Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Mar 16 11:54:24 plato kernel: Checking 'hlt' instruction... OK.
Mar 16 11:54:24 plato kernel: hda: IBM-DTLA-307030, ATA DISK drive
Mar 16 11:54:24 plato kernel: hda: IBM-DTLA-307030, 29314MB w/1916kB Cache, CHS=3737/255/63
Mar 16 11:54:36 plato kernel: Installing knfsd (copyright (C) 1996
Mar 16 11:54:36 plato nfs: Starting NFS services: succeeded
Mar 16 11:54:37 plato nfs: rpc.rquotad startup succeeded
Mar 16 11:54:37 plato nfs: rpc.mountd startup succeeded
Mar 16 11:54:37 plato nfs: rpc.nfsd startup succeeded
Mar 16 12:01:00 plato rpc.mountd: Caught signal 15, un-registering and exiting.
Mar 16 12:01:00 plato nfs: rpc.mountd shutdown succeeded
Mar 16 12:01:00 plato nfs: rpc.mountd startup succeeded
--
Chris Coulson