Greetings,
I have several RH7.1 machines with identical configs:
Tyan Thunder, 2x Athlon 1.2, 1G RAM, 2G swap, with single
9G SCSI disks. All were using 2.4.2-2smp.
One of these machines, and only one, has exhibited strange
behaviour for about 6 weeks. Periodically, about once a week,
it freezes up and I have to reset it, having no access to
the console. I can ping it, but that's it. The logs showed
kernel BUG messages related to swapping and paging. I upgraded to
2.4.9-12smp, which gave a slightly different pattern of crashes
and kernel BUG messages, one of which is appended at the end.
In addition, this machine never seems to use any swap, even when
I run a bunch of apps, including pigs like netscape. I tried
swapoff, mkswap -c, and swapon -p 32000, and still see no swap
activity. All the other machines show some small swap usage.
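For comparing swap usage between the good machines and the bad one,
something like the following could be run on each box. It is only a
sketch: it assumes /proc/swaps has the usual header row followed by
"Filename Type Size Used Priority" columns (sizes in KB), and the
function names parse_swaps and read_swaps are made up for illustration.

```python
def parse_swaps(text):
    """Parse the contents of /proc/swaps into a list of dicts.

    Assumes the first line is a header and each following line has
    five whitespace-separated fields: filename, type, size, used,
    priority (size and used in KB).
    """
    entries = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore anything that does not look like an entry
        entries.append({
            "device": fields[0],
            "type": fields[1],
            "size_kb": int(fields[2]),
            "used_kb": int(fields[3]),
            "priority": int(fields[4]),
        })
    return entries


def read_swaps(path="/proc/swaps"):
    """Read and parse the swap table from the given path."""
    with open(path) as f:
        return parse_swaps(f.read())
```

Calling read_swaps() on each machine and comparing the used_kb values
would show whether the swap area is active but simply never touched,
or missing from the table altogether.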
Sometimes the crashes happen when users are using vmware, other times
during the night when syslog restarts.
I assume it is a hardware problem.
I can get my supplier to fix this machine, but I wanted to have
something more informative to say than "it's broken".
Any ideas?
TIA
Christopher
Jan 3 15:58:18 data kernel: <4>probable hardware bug: clock timer
configuration lost - probably a VIA686a motherboard.
Jan 3 15:58:18 data kernel: probable hardware bug: restoring chip
configuration.
Jan 3 15:58:20 data kernel: ------------[ cut here ]------------
Jan 3 15:58:20 data kernel: kernel BUG at page_alloc.c:85!
Jan 3 15:58:20 data kernel: invalid operand: 0000
Jan 3 15:58:20 data kernel: CPU: 0
Jan 3 15:58:20 data kernel: EIP: 0010:[__free_pages_ok+43/928]
Not tainted
Jan 3 15:58:20 data kernel: EIP: 0010:[<c013624b>] Not tainted
Jan 3 15:58:20 data kernel: EFLAGS: 00013282
Jan 3 15:58:20 data kernel: eax: 0000001f ebx: c208de28 ecx:
c02fd544 edx: 00011fd6
Jan 3 15:58:20 data kernel: esi: ed53f3c0 edi: 00000004 ebp:
c208de28 esp: ed5b9e60
Jan 3 15:58:20 data kernel: ds: 0018 es: 0018 ss: 0018
Jan 3 15:58:20 data kernel: Process X (pid: 6743, stackpage=ed5b9000)
Jan 3 15:58:20 data kernel: Stack: c023999e 00000055 fe2cb000 00078b00
00000000 c20e8fa4 c01413a2 00000000
Jan 3 15:58:20 data kernel: c208de28 c208de28 00000004 0000004b
c0137425 c01307ec f7678de0 c0361020
Jan 3 15:58:20 data kernel: f781a8a0 c208de28 c036115c c0129a47
c208de28 000000dc 00152000 00000040
Jan 3 15:58:20 data kernel: Call Trace: [copyrite+26078/27691] copyrite
[kernel] 0x65de
Jan 3 15:58:20 data kernel: Call Trace: [<c023999e>]
copyrite [kernel] 0x65de
Jan 3 15:58:20 data kernel: [generic_commit_write+146/160]
generic_commit_write [kernel] 0x92
Jan 3 15:58:20 data kernel: [<c01413a2>] generic_commit_write [kernel]
0x92
Jan 3 15:58:20 data kernel: [free_page_and_swap_cache+197/208]
free_page_and_swap_cache [kernel] 0xc5
Jan 3 15:58:20 data kernel: [<c0137425>] free_page_and_swap_cache
[kernel] 0xc5
Jan 3 15:58:20 data kernel: [generic_file_write+1052/1568]
generic_file_write [kernel] 0x41c
Jan 3 15:58:20 data kernel: [<c01307ec>] generic_file_write [kernel] 0x41c
Jan 3 15:58:20 data kernel: [zap_page_range+1143/1232] zap_page_range
[kernel] 0x477
Jan 3 15:58:20 data kernel: [<c0129a47>] zap_page_range [kernel] 0x477
Jan 3 15:58:20 data kernel: [dput+28/384] dput [kernel] 0x1c
Jan 3 15:58:20 data kernel: [<c0150c8c>] dput [kernel] 0x1c
Jan 3 15:58:20 data kernel: [exit_mmap+201/304] exit_mmap [kernel] 0xc9
Jan 3 15:58:20 data kernel: [<c012c579>] exit_mmap [kernel] 0xc9
Jan 3 15:58:20 data kernel: [mmput+91/128] mmput [kernel] 0x5b
Jan 3 15:58:20 data kernel: [<c0119bfb>] mmput [kernel] 0x5b
Jan 3 15:58:20 data kernel: [do_exit+230/624] do_exit [kernel] 0xe6
Jan 3 15:58:20 data kernel: [<c011e446>] do_exit [kernel] 0xe6
Jan 3 15:58:20 data kernel: [filp_close+158/176] filp_close [kernel] 0x9e
Jan 3 15:58:20 data kernel: [<c013d3be>] filp_close [kernel] 0x9e
Jan 3 15:58:20 data kernel: [system_call+51/56] system_call [kernel] 0x33
Jan 3 15:58:20 data kernel: [<c010719b>] system_call [kernel] 0x33
Jan 3 15:58:20 data kernel:
Jan 3 15:58:20 data kernel:
Jan 3 15:58:20 data kernel: Code: 0f 0b 59 5b 8b 55 08 85 d2 74 10 6a
57 68 9e 99 23 c0 e8 ce
Jan 3 15:58:28 data kernel: ------------[ cut here ]------------
Jan 3 15:58:28 data kernel: kernel BUG at page_alloc.c:85!
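To give the supplier a summary of the crash pattern rather than raw
logs, the BUG lines could be tallied by source file and line number.
This is a rough sketch: it assumes every report has the
"kernel BUG at file:line!" form seen in the log above, and the function
name tally_bugs is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "kernel BUG at page_alloc.c:85!"
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")


def tally_bugs(log_text):
    """Count kernel BUG reports in syslog text, keyed by 'file:line'."""
    counts = Counter()
    for match in BUG_RE.finditer(log_text):
        counts["%s:%s" % (match.group(1), match.group(2))] += 1
    return counts
```

Feeding it the contents of the syslog file from before and after the
kernel upgrade would show whether the same BUG location keeps
recurring or whether the failures are scattered.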