Hi,
I recently upgraded to kernel 2.4.20 (from 2.2.24) and I'm getting rather
frequent crashes.
I BELIEVE I've traced the EIP to the function xprt_timer.
The oops reports the EIP as c02ceeb2, and System.map puts the function
xprt_timer at c02cee5c.
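The arithmetic behind that can be written out. This is only a sketch: the first two constants are the addresses quoted above, the third is where xprt_timer starts in the objdump listing of xprt.o further down, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses quoted in this post. */
#define OOPS_EIP       0xc02ceeb2u  /* EIP from the oops */
#define SYSMAP_SYMBOL  0xc02cee5cu  /* xprt_timer in System.map */
#define OBJ_SYMBOL     0x134cu      /* xprt_timer in the xprt.o listing */

/* Offset of the faulting instruction from the start of the function
 * (hypothetical helper name). */
static uint32_t fault_offset(uint32_t eip, uint32_t sym)
{
    assert(eip >= sym);  /* the EIP must lie at or after the symbol */
    return eip - sym;
}

/* Where that offset lands in the disassembly of the unlinked object. */
static uint32_t object_address(uint32_t obj_sym, uint32_t offset)
{
    return obj_sym + offset;
}
```

With these values fault_offset(OOPS_EIP, SYSMAP_SYMBOL) comes out as 0x56, and object_address(OBJ_SYMBOL, 0x56) as 0x13a2, so the line to look at in the listing is the one labelled 13a2.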
I ran objdump --disassemble on xprt.o and tried to read it, but it's way
beyond my knowledge of Intel assembler. The machine code it prints, though,
exactly matches the code that the kernel oops prints out.
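The byte-for-byte match can also be checked mechanically rather than by eye. A rough sketch: the first array is the Code: line from the oops, the second is the bytes from the objdump listing starting at 13a2, the line that corresponds to the oops EIP. The function name is just for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "Code:" line from the oops (20 bytes). */
static const unsigned char oops_code[] = {
    0x8b, 0x40, 0x2c, 0x89, 0x7d, 0xf4, 0x83, 0xf8, 0x09, 0x0f,
    0x4c, 0xc8, 0xb8, 0x01, 0x00, 0x00, 0x00, 0x89, 0xc2, 0xd3
};

/* Bytes from the objdump listing, starting at offset 13a2. */
static const unsigned char listing_code[] = {
    0x8b, 0x40, 0x2c,             /* 13a2: mov   0x2c(%eax),%eax */
    0x89, 0x7d, 0xf4,             /* 13a5: mov   %edi,-0xc(%ebp) */
    0x83, 0xf8, 0x09,             /* 13a8: cmp   $0x9,%eax       */
    0x0f, 0x4c, 0xc8,             /* 13ab: cmovl %eax,%ecx       */
    0xb8, 0x01, 0x00, 0x00, 0x00, /* 13ae: mov   $0x1,%eax       */
    0x89, 0xc2,                   /* 13b3: mov   %eax,%edx       */
    0xd3, 0xe2                    /* 13b5: shl   %cl,%edx        */
};

/* Returns 1 if the first n bytes of both sequences agree. */
static int code_matches(const unsigned char *a, const unsigned char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, n) == 0;
}
```

Calling code_matches(oops_code, listing_code, sizeof oops_code) compares all 20 bytes the oops printed; the listing array is one byte longer because the last instruction is cut off in the oops.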
It says that it's a NULL pointer dereference. I could obviously
just put NULL pointer checks everywhere in the function, but I don't know
what that would do to performance. (Of course, if the choice is between
performance degradation and crashing all the time at 120 mph, I'll take
performance degradation any day of the week.)
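For a sense of scale, here is roughly what one such check looks like. This is a user-space sketch with made-up stand-in types (mock_task, mock_client and mock_timer are all hypothetical, not the kernel's structs), and which pointer actually needs guarding is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct layouts. */
struct mock_client { int ntimeouts; };
struct mock_task   { struct mock_client *client; int status; };

#define MOCK_ETIMEDOUT 110

/* Returns 0 on the normal path, -1 if the guard tripped. */
static int mock_timer(struct mock_task *task)
{
    assert(task != NULL);
    if (task->client == NULL) {  /* the guard: one compare and one branch */
        task->status = -MOCK_ETIMEDOUT;
        return -1;
    }
    task->client->ntimeouts++;   /* only reached with a valid client */
    task->status = -MOCK_ETIMEDOUT;
    return 0;
}
```

Each guard is one compare and one branch, which is probably small next to the spin_lock the function already takes, but that is an expectation rather than a measurement. A guard would also only paper over whatever left the pointer NULL in the first place.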
I'm going to post the disassembly, xprt_timer, and my TYPED copy of the
kernel oops (because it doesn't get written to any log),
and hope some kind soul can help me debug it.
(I'll apologize for the length of this post in advance.)
Here's the kernel oops text:
<oops>
Unable to handle kernel NULL pointer dereference at virtual address 00000058
printing eip:
c02ceeb2
*pde = 00000000
Oops: 0000
CPU: 1
EIP: 0010 [<c02ceeb2>] Not tainted
EFLAGS: 00010202
eax: 0000002c ebx: e0cdf3d8 ecx: 00000008 edx: 00000001
esi: 00000000 edi: 00000001 ebp: d817be6c esp: d817be54
ds: 0018 es: 0018 ss: 0018
Process cjpeg (pid: 7256, stackpage=d817b000)
Stack: d91d4080 c02cee5c 00000001 f7b7ad40 00000000 e0cdf000 d817be84 c02d00da
d91d4080 d91d4080 c02d0050 00000000 d817bebc c012239c d91d4080 00000000
00000020 00000000 c221de20 00000001 00000020 c0403b2c c0403b2c d817bebc
Call Trace:
[<c02cee5c>] [<c02d00da>] [<c02d0050>] [<c012239c>] [<c011f016>]
[<c011eee3>] [<c011ec5d>] [<c010a66d>] [<c02d74d0>] [<c0142ce4>]
[<c0139c68>] [<c0108c3f>]
Code: 8b 40 2c 89 7d f4 83 f8 09 0f 4c c8 b8 01 00 00 00 89 c2 d3
</oops>
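The "Oops: 0000" line is the x86 page-fault error code, and its low three bits can be decoded. A small sketch; the struct and function names are invented for illustration, while the bit meanings are the standard i386 ones.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of the low three bits of an x86 page-fault error code. */
struct fault_info {
    int protection;  /* bit 0: 1 = protection violation, 0 = page not present */
    int write;       /* bit 1: 1 = write access, 0 = read access */
    int user;        /* bit 2: 1 = fault in user mode, 0 = kernel mode */
};

static struct fault_info decode_fault(uint32_t error_code)
{
    struct fault_info f;

    assert((error_code & ~0x7u) == 0);  /* only the low three bits are handled */
    f.protection = (error_code & 0x1u) != 0;
    f.write      = (error_code & 0x2u) != 0;
    f.user       = (error_code & 0x4u) != 0;
    return f;
}
```

For the value in this oops, decode_fault(0x0000) leaves all three fields zero: a read from a not-present page while in kernel mode, which fits a load through a pointer that is NULL or close to it.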
Here is the C code for xprt_timer (it's unaltered from 2.4.20):
<xprt_timer>
/*
 * RPC receive timeout handler.
 */
static void
xprt_timer(struct rpc_task *task)
{
	struct rpc_rqst *req = task->tk_rqstp;
	struct rpc_xprt *xprt = req->rq_xprt;

	spin_lock(&xprt->sock_lock);
	if (req->rq_received)
		goto out;

	if (!xprt->nocong) {
		if (xprt_expbackoff(task, req)) {
			rpc_add_timer(task, xprt_timer);
			goto out_unlock;
		}
		rpc_inc_timeo(&task->tk_client->cl_rtt);
		xprt_adjust_cwnd(req->rq_xprt, -ETIMEDOUT);
	}
	req->rq_nresend++;

	dprintk("RPC: %4d xprt_timer (%s request)\n",
		task->tk_pid, req ? "pending" : "backlogged");

	task->tk_status = -ETIMEDOUT;
out:
	task->tk_timeout = 0;
	rpc_wake_up_task(task);
out_unlock:
	spin_unlock(&xprt->sock_lock);
}
</xprt_timer>
Here is the disassembly of JUST xprt_timer (I have posted the entire
objdump output at
http://www.hex21.com/~binesh/xprt.s):
<assembler>
0000134c <xprt_timer>:
134c: 55 push %ebp
134d: 89 e5 mov %esp,%ebp
134f: 83 ec 0c sub $0xc,%esp
1352: 57 push %edi
1353: 56 push %esi
1354: 53 push %ebx
1355: 8b 45 08 mov 0x8(%ebp),%eax
1358: 8b 58 18 mov 0x18(%eax),%ebx
135b: 8b 13 mov (%ebx),%edx
135d: 89 55 fc mov %edx,0xfffffffc(%ebp)
1360: f0 fe 8a 98 09 00 00 lock decb 0x998(%edx)
1367: 0f 88 ef 10 00 00 js 245c <.text.lock.xprt+0x107>
136d: 8b 43 6c mov 0x6c(%ebx),%eax
1370: 89 45 f8 mov %eax,0xfffffff8(%ebp)
1373: 85 c0 test %eax,%eax
1375: 0f 85 af 00 00 00 jne 142a <xprt_timer+0xde>
137b: f6 82 7c 09 00 00 02 testb $0x2,0x97c(%edx)
1382: 75 5f jne 13e3 <xprt_timer+0x97>
1384: 8b bb 88 00 00 00 mov 0x88(%ebx),%edi
138a: 8b 45 08 mov 0x8(%ebp),%eax
138d: 8d 57 01 lea 0x1(%edi),%edx
1390: 89 93 88 00 00 00 mov %edx,0x88(%ebx)
1396: b9 08 00 00 00 mov $0x8,%ecx
139b: 47 inc %edi
139c: 8b 70 14 mov 0x14(%eax),%esi
139f: 8d 46 2c lea 0x2c(%esi),%eax
13a2: 8b 40 2c mov 0x2c(%eax),%eax
13a5: 89 7d f4 mov %edi,0xfffffff4(%ebp)
13a8: 83 f8 09 cmp $0x9,%eax
13ab: 0f 4c c8 cmovl %eax,%ecx
13ae: b8 01 00 00 00 mov $0x1,%eax
13b3: 89 c2 mov %eax,%edx
13b5: d3 e2 shl %cl,%edx
13b7: 39 55 f4 cmp %edx,0xfffffff4(%ebp)
13ba: 0f 4d 45 f8 cmovge 0xfffffff8(%ebp),%eax
13be: 85 c0 test %eax,%eax
13c0: 74 10 je 13d2 <xprt_timer+0x86>
13c2: 68 4c 13 00 00 push $0x134c
13c7: 8b 55 08 mov 0x8(%ebp),%edx
13ca: 52 push %edx
13cb: e8 fc ff ff ff call 13cc <xprt_timer+0x80>
13d0: eb 68 jmp 143a <xprt_timer+0xee>
13d2: f0 ff 46 58 lock incl 0x58(%esi)
13d6: 6a 92 push $0xffffff92
13d8: 8b 03 mov (%ebx),%eax
13da: 50 push %eax
13db: e8 f4 ed ff ff call 1d4 <xprt_adjust_cwnd>
13e0: 83 c4 08 add $0x8,%esp
13e3: ff 83 8c 00 00 00 incl 0x8c(%ebx)
13e9: f6 05 00 00 00 00 01 testb $0x1,0x0
13f0: 74 2e je 1420 <xprt_timer+0xd4>
13f2: b8 36 07 00 00 mov $0x736,%eax
13f7: ba 2e 07 00 00 mov $0x72e,%edx
13fc: 85 db test %ebx,%ebx
13fe: 0f 45 c2 cmovne %edx,%eax
1401: 50 push %eax
1402: 8b 55 08 mov 0x8(%ebp),%edx
1405: 0f b7 82 80 00 00 00 movzwl 0x80(%edx),%eax
140c: 50 push %eax
140d: 68 60 07 00 00 push $0x760
1412: e8 fc ff ff ff call 1413 <xprt_timer+0xc7>
1417: 83 c4 0c add $0xc,%esp
141a: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
1420: 8b 45 08 mov 0x8(%ebp),%eax
1423: c7 40 1c 92 ff ff ff movl $0xffffff92,0x1c(%eax)
142a: 8b 55 08 mov 0x8(%ebp),%edx
142d: c7 42 74 00 00 00 00 movl $0x0,0x74(%edx)
1434: 52 push %edx
1435: e8 fc ff ff ff call 1436 <xprt_timer+0xea>
143a: 8b 55 fc mov 0xfffffffc(%ebp),%edx
143d: b0 01 mov $0x1,%al
143f: 86 82 98 09 00 00 xchg %al,0x998(%edx)
1445: 8d 65 e8 lea 0xffffffe8(%ebp),%esp
1448: 5b pop %ebx
1449: 5e pop %esi
144a: 5f pop %edi
144b: 89 ec mov %ebp,%esp
144d: 5d pop %ebp
144e: c3 ret
144f: 90 nop
</assembler>
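Putting the register dump next to the listing: the oops EIP is 0x56 bytes past the System.map address of xprt_timer, which is line 13a2 above. Below is a rough sketch of the address arithmetic in that instruction and the one before it, fed with the register values from the oops. The function names are illustrative only, and nothing beyond the arithmetic is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the oops register dump. */
#define OOPS_ESI        0x00000000u
#define OOPS_EAX        0x0000002cu
#define OOPS_FAULT_ADDR 0x00000058u

/* 139f: lea 0x2c(%esi),%eax  computes esi + 0x2c; no memory is touched. */
static uint32_t lea_139f(uint32_t esi)
{
    return esi + 0x2cu;
}

/* 13a2: mov 0x2c(%eax),%eax  loads from eax + 0x2c; return that address. */
static uint32_t load_address_13a2(uint32_t eax)
{
    return eax + 0x2cu;
}

/* Aborts if the modelled arithmetic disagrees with the oops. */
static void check_against_oops(void)
{
    uint32_t eax = lea_139f(OOPS_ESI);

    assert(eax == OOPS_EAX);
    assert(load_address_13a2(eax) == OOPS_FAULT_ADDR);
}
```

Both checks hold: esi = 0 gives eax = 0x2c, and the load then touches 0x58, the address in the oops. esi itself was loaded at 139c from 0x14(%eax), with eax holding the function's argument (task), so the pointer stored at offset 0x14 of the task appears to have been NULL. Going by the C source that is plausibly task->tk_client, though the struct layout is not shown here to confirm it.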
Thanks!
Binesh Bannerjee
--
"Some say the world will end in fire,
Some say in ice.
From what I've tasted of desire
I hold with those who favor fire.
But if it had to perish twice
I think I know enough of hate
To say that for destruction ice
Is also great
And would suffice."
-- "Fire and Ice" by Robert Frost
PGP Key: http://www.hex21.com/~binesh/binesh-public.asc
Key fingerprint = 421D B4C2 2E96 B8EE 7190 A0CF B42F E71C 7FC3 AD96
SSH2 Key: http://www.hex21.com/~binesh/binesh-ssh2.pub
SSH1 Key: http://www.hex21.com/~binesh/binesh-ssh1.pub
OpenSSH Key: http://www.hex21.com/~binesh/binesh-openssh.pub