Stack backtrace from signal handler

Post by Joerg Bruehe » Sun, 05 May 2002 00:49:17



Dear developer colleagues,

Some years ago, I posted a code fragment that I use to
generate a stack backtrace for SIGSEGV etc.
We needed that because the IBM debugger (at least at that time)
would not analyze our 'core' file due to some non-standard
stack frame linkage manipulations we had done (for our own
"internal tasking", comparable to user-level threads with
our own scheduling).

As that posting was incomplete, I got and still get some
questions about the missing parts. So there seems to be a need
for this, and I am re-posting it now (complete this time).

Below, you will find three parts:
1) a C code fragment that follows the chain of stack frames
   backwards from the point of failure (= fatal signal) towards
   'main'; it writes each address as a hex value to a diagnostic file;
2) a piece of assembler code that helps to check the validity of
   the start point of that chain (the first address must be
   numerically greater than the current stack address);
3) some shell script code that analyzes the file written by the
   C code and merges those (numeric) addresses with the namelist.
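
The backward walk of part 1, together with the start-point check of part 2, can be tried out without an RS/6000 by building a fake chain of frames in an array. The sketch below assumes a frame trimmed to the two fields the walk actually reads; `demo_frame`, `walk_chain`, `floor`, and the `max` limit are names invented here for illustration and are not part of the posted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed frame: only the fields the walk reads. */
struct demo_frame {
    struct demo_frame *sp;  /* back chain: frame of the caller */
    unsigned long      lr;  /* saved return address */
};

/*
 * Follow the back chain from 'start' and store each saved return
 * address in 'out'.  The walk stops as soon as the next frame is not
 * at a numerically higher address than the previous one (the stack
 * grows from high to low addresses, so callers live higher up), or
 * when 'max' entries have been stored.  'floor' plays the role of
 * the handler's own stack pointer.
 */
size_t walk_chain(const struct demo_frame *floor,
                  const struct demo_frame *start,
                  unsigned long *out, size_t max)
{
    const struct demo_frame *prev = floor;
    const struct demo_frame *cur  = start;
    size_t n = 0;

    assert(out != NULL);
    while ((uintptr_t) cur > (uintptr_t) prev && n < max) {
        out[n++] = cur->lr;
        prev = cur;
        cur  = cur->sp;
    }
    return n;
}
```

The `max` limit is an addition for the sketch: it bounds the output when a damaged chain happens to keep pointing at ever higher addresses.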

I hope it may be useful for some of you.

Regards,
Joerg Bruehe

=-=-=-= cut here =-=-=-= C code fragments =-=-=-=

#include <signal.h>
#include <stdio.h>
#include <unistd.h>     /* getpid(), pause() */

/* This type does not come from any AIX "include" file, we defined
 * it ourselves (based on the information about "Subroutine Linkage
 * Convention", stack frames etc. in an IBM book "Assembly Language
 * Reference", 1990 edition, IBM order number SC23-2197).
 *
 * This version comes from the 32 bit world -
 * information about the 64 bit stack frames would be very welcome.
 */

struct STACK_TYPE {
      long sp ; /* stack pointer */
      long cr ; /* "condition register" */
      long lr ; /* "link register" == return address */
      long d1 ; /* ... ? ... */
      long d2 ; /* ... ? ... */
      long toc ; /* "table of contents" */
      };

typedef void SIGNALFUNCTYPE ; /* "int" for some older Unix variants */

/* imported functions */

#ifdef _IBMR2
extern  struct STACK_TYPE  * e71a_own_frame (void);
#endif

/* local functions */

#ifdef _IBMR2
static  SIGNALFUNCTYPE  e81_core_handler
        ( int sig , int dummy , struct sigcontext * SCP );
#endif

/* ===== */

int     main ( int argc , char ** argv )

{
#ifdef _IBMR2
    struct sigaction                force_full_dump ;
#endif

    /*
     *  Initialize signal handling.
     *  These should not stop the program.
     */
    (void) signal ( SIGHUP  , SIG_IGN );
    (void) signal ( SIGINT  , SIG_IGN );
#ifndef LINUX
    (void) signal ( SIGSYS  , SIG_IGN );
#endif
    (void) signal ( SIGPIPE , SIG_IGN );
#ifdef SIGWINCH
    (void) signal ( SIGWINCH , SIG_IGN );
#endif
    /*
     *  These should crash the program.
     */
    (void) signal ( SIGFPE  , SIG_DFL );
    (void) signal ( SIGBUS  , SIG_DFL );
    (void) signal ( SIGSEGV , SIG_DFL );

#ifdef _IBMR2
    /* establish the routine to catch and analyze
     * these fatal signals.
     */
    force_full_dump.sa_handler = e81_core_handler ;
        /* gives a type mismatch warning, because the signal
         * handler function type definition has only one
         * parameter, but AIX implementation has three
         * and we really need the third one!
         */
    SIGINITSET ( force_full_dump.sa_mask );
    force_full_dump.sa_flags = SA_FULLDUMP | SA_OLDSTYLE ;
      /* 'OLDSTYLE' is needed so that the handler may kill the process. */
    (void) sigaction ( SIGILL  , &force_full_dump , NULL );
    (void) sigaction ( SIGFPE  , &force_full_dump , NULL );
    (void) sigaction ( SIGBUS  , &force_full_dump , NULL );
    (void) sigaction ( SIGSEGV , &force_full_dump , NULL );
#endif

    /*
     *  Code deleted ...  Here follows the real work of the program ...
     */
.....

}

/* ===== */

/*
 *  This function can be used to write the stack backtrace of a
 *  dying process into the diag file to evade debugger
 *  incompetencies (needed on RS/6000)
 *  and to do any other operations while the process is still alive,
 *  e.g. to move to another directory to save the core file (inactive).
 */

#ifdef _IBMR2
static  SIGNALFUNCTYPE  e81_core_handler
        ( int sig , int dummy , struct sigcontext * SCP )

{
    struct STACK_TYPE           *that_frame ;
    struct STACK_TYPE           *prev_frame ;

    printf ( "e81_core_handler: ABORTING due to signal %d \n", sig );

    prev_frame = e71a_own_frame () ;
    that_frame = (struct STACK_TYPE *)
                 SCP->sc_jmpbuf.jmp_context.gpr [ 1 ] ;

    /* "stdout" must be the diagnostic file */
    printf ( "e81_core_handler: current inst.addr.reg 0x%08lx\n",
                    SCP->sc_jmpbuf.jmp_context.iar ); /* 'iar' saved */

    while ( that_frame > prev_frame ) /* stack grows from high to low addr */
    {
        printf ( "e81_core_handler: called from code addr 0x%08lx\n",
                    that_frame->lr ); /* 'link register' saved */
        prev_frame = that_frame ;
        that_frame = (struct STACK_TYPE *) that_frame->sp ; /* previous level */
    }

    /*
     *  Signal is reset to SIG_DFL on entry of this function ('SA_OLDSTYLE').
     *  The following reissues the signal and creates a core.
     *  SIGIOT should force a core if 'sig' is ignored (_IBMR2),
     *  SIGKILL is a last resort to ensure termination.
     */
    (void) kill ( getpid() , sig );
    (void) kill ( getpid() , SIGIOT  );
    (void) kill ( getpid() , SIGKILL );
    pause ();

    /*NOTREACHED*/

}

#endif /*_IBMR2*/

=-=-=-= cut here =-=-=-= the assembler code giving the start point: =-=-=-=

# This is the routine to return the own stack pointer as a result.

                .globl          .e71a_own_frame[pr]
                .csect          .e71a_own_frame[pr]
# routine is extremely primitive: no stack frame etc.
.e71a_own_frame:
# first, clear register (GPR) 3 (to be sure)
                cal         3,0(0)
# now, add GPR1 (= the stack pointer) to this (empty) reg.
                a           3,1,3
# finally, return (GPR 3 is the result register, see Assembler docu)
                br

=-=-=-= cut here =-=-=-= the shell code merging with the namelist: =-=-=-=

# Here comes part of the shell script to analyze that diagnostic file:
# PROG     the filename of the program that crashed
# DIAGFIL  the file that got the 'printf' in the C "core handler" routine
# TMP, TMP1, TMP2  temporary files
# OUT      the analysis output file

###

_check_namelist_valid ()
# sets VALID to 1 if a valid namelist was found, to 0 otherwise
{
SUM=`sum $PROG`

NAMELIST=`basename $PROG`.nm
VALID=0
if [ -f $NAMELIST ]
then # we have a name list file for $PROG - is it valid ?
    FILSUM=`head -1 $NAMELIST | cut -c32-`
    if [ "$FILSUM" = "$SUM" ]
    then
        VALID=1
    fi
fi

}

###

_make_valid_namelist ()
# namelist missing or invalid, create new, attention: checksum format !
{
echo "0x00000000|...  UNIX checksum: $SUM" > $NAMELIST
echo "0x00000000|... `ls -l $PROG`" >> $NAMELIST
if [ ! -w $NAMELIST ]
then
    echo "$IAM : Cannot write namelist file '$NAMELIST' - ABORT"
    exit 9
fi

case "`uname -srv`" in
"AIX 2 3" )
    nm -vxT $PROG | fgrep '.text' | cut -c1-32 > $TMP
    cut -c22-32 $TMP > $TMP1
    cut -c1-20  $TMP > $TMP2
    paste -d'\0' $TMP1 $TMP2 | uniq >> $NAMELIST
    rm -f  $TMP $TMP1 $TMP2 ;;
"AIX 1 4" | "AIX 2 4" | "AIX 3 4" )
    nm -vxT $PROG | grep ' [tT] ' | cut -c2-41 > $TMP
    cut -c23-32 $TMP | sed 's/0x1/0x0/' > $TMP1
    cut -c1-20,34-  $TMP > $TMP2
    paste -d'|' $TMP1 $TMP2 | uniq >> $NAMELIST
    rm -f  $TMP $TMP1 $TMP2 ;;
* )
    echo "nm output format not yet analyzed" ;;
esac

}

###

_generate_backtrace ()
{
fgrep 'e81_core_handler' $DIAGFIL | fgrep ' 0x' | \
   sed 's/^.* 0x/0x/' > $TMP
for ADDR in `head -35 $TMP`
do
    echo ' ' >> $OUT

    case "$ADDR" in
    0x10* ) # an address in the text segment - work on it
            ADDR_NM=`echo $ADDR | sed 's/0x1/0x0/'`
            echo "$ADDR_NM|ADDRESS IN BACKTRACE" > $TMP1
            sort $NAMELIST $TMP1 > $TMP2
            ed - $TMP2 >> $OUT << +
H
/ADDRESS IN BACKTRACE/-3,/ADDRESS IN BACKTRACE/+2p
q
+
            ;;
    * )     # obviously invalid
            echo '   (not in text segment, invalid stack)' >> $OUT
            ;;
    esac
done
echo ' ' >> $OUT

}

###

=-=-=-= cut here =-=-=-= end of code fragments =-=-=-=
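
The lookup that `_generate_backtrace` performs with `sort` and `ed` comes down to finding the last symbol whose start address is not above the backtrace address. A minimal C sketch of that idea follows, assuming a symbol table that is already sorted by ascending address; `struct sym` and `find_symbol` are hypothetical names and not part of the script above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol table entry: start address and name. */
struct sym {
    unsigned long addr;
    const char   *name;
};

/*
 * Return the entry with the highest start address that is still
 * <= 'addr', or NULL if 'addr' lies below the first symbol.
 * 'tab' must be sorted by ascending address.
 */
const struct sym *find_symbol(const struct sym *tab, size_t n,
                              unsigned long addr)
{
    size_t lo = 0;
    size_t hi = n;

    assert(tab != NULL || n == 0);
    /* Binary search for the first entry that starts above 'addr'. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == 0 ? NULL : &tab[lo - 1];
}
```

The shell version prints a few namelist lines around the address and leaves the choice to the reader; the sketch commits to one entry, which is only right if the table covers every function in the text segment.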

--
Joerg Bruehe, SQL Datenbanksysteme GmbH, Berlin, Germany
     (speaking only for himself)
mailto: jo...@sql.de

 
 
 

1. Signal handlers inside signal handlers

Greetings Netters,

Unfortunately, the project I'm working on requires I mess with nested signal
handlers, and I've checked out obvious manuals and the POSIX std for clues,
but I'm having no luck.

What I'm trying to do is install another handler from within a signal
handler. For example:

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void foo2( int signo )
{
        printf("Caught the second sigalrm\n");
}

void foo1( int signo )
{
        printf("Caught the first sigalrm\n");
        signal( SIGALRM, foo2 );
        alarm(1);

        /* Never returns, so foo2 would have to run nested inside foo1 */
        for (;;)
                ;
}

int main()
{
        signal( SIGALRM, foo1 );
        alarm(3);

        /* Wait for the first alarm */
        sleep(10);
        return 0;
}

The above program, when run, prints the message from function foo1
but never reaches foo2.  Note that my project dictates that I can't
exit foo1 until foo2 has run.
I've tried messing with POSIX signals (sigaction etc.), but have the
same problem.

Has anyone tried to do this sort of thing before??

Thanks in advance,
Scott Wallace
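
One likely reason foo2 never runs: while a handler installed with sigaction() is executing, the signal that triggered it stays blocked unless SA_NODEFER is set, and signal() behaves the same way on many systems. The following is a minimal sketch under that assumption, not a tested fix for the program above. It uses raise() in place of alarm() so the effect shows immediately, and `outer`, `inner`, `install`, and `nested_demo` are illustrative names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t inner_ran = 0;

/* Second-level handler: only records that it was reached. */
static void inner(int signo)
{
    (void) signo;
    inner_ran = 1;
}

/* Install 'fn' for SIGALRM without blocking SIGALRM during the handler. */
static int install(void (*fn)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fn;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NODEFER;
    return sigaction(SIGALRM, &sa, NULL);
}

/* First-level handler: plants 'inner' and triggers the signal again. */
static void outer(int signo)
{
    (void) signo;
    (void) install(inner);
    raise(SIGALRM);     /* runs 'inner' nested, before 'outer' returns */
}

/* Returns 1 if the nested handler ran while 'outer' was still active. */
int nested_demo(void)
{
    int rc;

    inner_ran = 0;
    rc = install(outer);
    assert(rc == 0);
    (void) rc;
    raise(SIGALRM);
    return inner_ran;
}
```

Without SA_NODEFER, the second SIGALRM would stay pending until the first handler returned, which matches the behavior described in the post.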

2. Trouble with SMC Network Card

3. Stack pointer and signal handlers

4. problem with `date` and sleep mode

5. Stack dump from signal handler

6. need help with mac68k install]

7. Print Call Stack Signal Handler

8. permissions on mounted vfat files

9. Signal Handler Stack Standard?

10. Stacking up signal handlers

11. getting stack trace from signal handler U_STACK_TRACE()

12. Threads performance - allow signal handler to not call handler

13. Signal handlers are not reset after signal delivery