Quite frequently, when attaching to a process or examining a core dump
with gdb, I find that some of the stack frames have been trashed. They
usually show up as a call to end() or something equally random.
More often than not, in fact, the frame that gets trashed is the one
that was interrupted by a signal (typically because the code running
in that frame did something bogus). This can make it a real pain in
the ass to figure out what went wrong, especially if the trashed frame
belonged to a large function.
Here's an example of what I'm talking about:
#0 0x60018cf9 in end ()
#1 0x12c0 in mark_device (obj={...}, markobj=0x6) at device.c:114
#2 0xa5fd3 in fatal_error_signal (sig=6) at emacs.c:171
#3 0xbffff4dc in end ()
#4 0x6001d22a in end ()
#5 0x60007963 in end ()
#6 0xa7fa1 in assert_failed (file=0x1102f7 "eval.c", line=1709,
expr=0x112c77 "abort()") at emacs.c:1589
#7 0x113a84 in signal_1 (sig={...}, data={...}) at eval.c:1709
#8 0x114161 in Fsignal (sig={...}, data={...}) at eval.c:1855
#9 0x1141c1 in signal_error (sig={...}, data={...}) at eval.c:1863
#10 0x104a79 in arith_error (signo=8) at data.c:1437
#11 0xbffff658 in end ()
#12 0x1bb24c in xlw_update_one_widget (instance=0x3a72e0, widget=0x38d600,
val=0x361f80, deep_p=0 '\000') at lwlib-Xlw.c:376
#13 0x1b925d in set_one_value (instance=0x3a72e0, val=0x361f80,
deep_p=0 '\000') at lwlib.c:662
#14 0x1b92f3 in update_one_widget_instance (instance=0x3a72e0, deep_p=0 '\000')
at lwlib.c:686
#15 0x1b9346 in update_all_widget_values (info=0x31c980, deep_p=0 '\000')
at lwlib.c:696
#16 0x1b9586 in lw_modify_all_widgets (id=65541, val=0x344080, deep_p=0 '\000')
at lwlib.c:747
#17 0xa1b95 in x_update_scrollbar_instance_status (w=0x2c8600, active=1,
size=15, instance=0x2beec0) at scrollbar-x.c:267
#18 0x1a7ac6 in update_window_scrollbars (w=0x2c8600, mirror=0x321280,
active=1, horiz_only=0) at scrollbar.c:264
#19 0x3a7d5 in redisplay_output_window (w=0x2c8600) at redisplay-output.c:1312
#20 0x2aa8f in redisplay_window (window={...}, skip_selected=0)
at redisplay.c:4709
#21 0x2b0b0 in redisplay_frame (f=0x275700) at redisplay.c:4815
#22 0x2b7f1 in redisplay_device (d=0x2fb900) at redisplay.c:4918
#23 0x2bb91 in redisplay_without_hooks () at redisplay.c:4982
#24 0x2bdf0 in redisplay () at redisplay.c:5044
Stack frames 0, 1, 3, 4, 5, and 11 are trashed. The real stack
should look something like this:
----- process died -----
kill (MYPID, 6)
fatal_error_signal (sig=6)
----- signal handler called (6) -----
abort ()
assert_failed (...)
signal_1 (...)
Fsignal (...)
signal_error (...)
arith_error (signo=8)
----- signal handler called (8) -----
xlw_update_scrollbar (...) at line XXX
xlw_update_one_widget (...)
etc.
The only frame I care about is #11 (in xlw_update_scrollbar()), and this
is the most important one in the whole stack trace because it
says where the real problem lies.
(BTW, I'm using GDB 4.14.)
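For anyone who wants to try this without building the whole program, here
is a minimal sketch of the same shape of stack: one signal handler that
raises a second signal from inside itself. It is not the real code.
on_fpe, on_abort, run_nested_signals and seen_signal are made-up names
standing in for arith_error and fatal_error_signal, and raise() is used
instead of a real arithmetic fault or abort() so that the program survives.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction() under strict -std= modes */
#endif
#include <signal.h>
#include <string.h>

#define MAX_SEEN 4

/* Signals in the order their handlers started running. */
static volatile sig_atomic_t seen[MAX_SEEN];
static volatile sig_atomic_t nseen = 0;

/* Hypothetical stand-in for fatal_error_signal(): just records the signal. */
static void on_abort(int sig)
{
    if (nseen < MAX_SEEN)
        seen[nseen++] = sig;
}

/* Hypothetical stand-in for arith_error(): records the signal, then
   raises SIGABRT while still inside the SIGFPE handler, so the second
   handler runs on top of the first one's signal frame. */
static void on_fpe(int sig)
{
    if (nseen < MAX_SEEN)
        seen[nseen++] = sig;
    raise(SIGABRT);
}

/* Installs both handlers and triggers the chain.
   Returns the number of handlers that ran, or -1 on setup failure. */
int run_nested_signals(void)
{
    struct sigaction sa;

    nseen = 0;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);     /* do not block SIGABRT inside on_fpe */

    sa.sa_handler = on_fpe;
    if (sigaction(SIGFPE, &sa, NULL) != 0)
        return -1;

    sa.sa_handler = on_abort;
    if (sigaction(SIGABRT, &sa, NULL) != 0)
        return -1;

    raise(SIGFPE);                /* returns after both handlers finish */
    return (int)nseen;
}

/* Returns the i-th recorded signal number, or 0 if out of range. */
int seen_signal(int i)
{
    return (i >= 0 && i < (int)nseen) ? (int)seen[i] : 0;
}
```

Call run_nested_signals() from a main(), build with -g, set a breakpoint
in on_abort and ask for a backtrace there. If the debugger can unwind
through signal frames, both handler invocations and the caller should be
visible; if it cannot, this should be a much smaller test case for the
trashed frames described above.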
What is the cause of this behavior? When debugging the same program
under Solaris, I don't see the problem -- the stack trace correctly
shows all signal invocations, the calls to abort() and kill(), etc.
That is with DBX, though, not GDB.
Is this a bug in GDB or Linux?
ben
--
"... then the day came when the risk to remain tight in a bud was
more painful than the risk it took to blossom." -- Anais Nin