Multi-threaded application thread stops receiving signals

Post by Willam Lew » Fri, 05 Apr 2002 06:45:08



We are currently running a multi-threaded application under 64-bit
Solaris 8 on an E420 server.  These threads manage client connections to
the server.  We keep a single thread to manage the log files and the
connection threads.  It wakes up periodically and checks whether any
connection thread has been idle for more than 5 minutes.  If so, it
sends a signal to close the connection.
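
A rough sketch of the periodic sweep described above, in C with POSIX
threads.  The struct conn record, the function names, and the choice of
SIGUSR1 are assumptions made for illustration; the post does not show the
application's code or name the signal it uses.

```c
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

#define IDLE_LIMIT_SECONDS (5 * 60)  /* the 5-minute limit from the post */
#define CLOSE_SIGNAL SIGUSR1         /* assumption: the signal is not named */

/* Hypothetical per-connection record; not from the original application. */
struct conn {
    pthread_t tid;         /* thread serving this connection */
    time_t last_activity;  /* time of the last client request */
    int active;            /* nonzero while the connection is open */
};

/* Returns nonzero if an open connection has been idle past the limit. */
int conn_is_idle(const struct conn *c, time_t now)
{
    return c->active && difftime(now, c->last_activity) > IDLE_LIMIT_SECONDS;
}

/* One pass of the management thread: signal every idle connection thread.
 * Returns the number of threads that were signalled successfully. */
size_t sweep_idle(struct conn *conns, size_t n, time_t now)
{
    size_t signalled = 0;
    for (size_t i = 0; i < n; i++) {
        if (conn_is_idle(&conns[i], now) &&
            pthread_kill(conns[i].tid, CLOSE_SIGNAL) == 0) {
            signalled++;
        }
    }
    return signalled;
}
```

The management thread would call sweep_idle(conns, n, time(NULL)) each
time it wakes up.  A real implementation would also need a mutex around
last_activity, since the connection threads update it concurrently.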

The problem we are seeing is that this management thread runs fine
for a while but eventually reaches a state in which signals sent to it
never arrive.  Because the signals are never processed, the thread
stops managing the connection threads and the log files once it is in
this state.

This application appears to run fine when compiled and run on Solaris
2.6.  Has anyone seen any problems with threaded apps on Solaris 8 or
on E420s?  We have just applied the most recent patch sets to see
whether they solve the problem.

thanks,
Bill Lewis

 
 
 

Post by Casper H.S. Di » Fri, 05 Apr 2002 17:19:45



>We are currently running a multi-threaded application under 64-bit
>Solaris 8 on an E420 server.  These threads manage client connections to
>the server.  We keep a single thread to manage the log files and the
>connection threads.  It wakes up periodically and checks whether any
>connection thread has been idle for more than 5 minutes.  If so, it
>sends a signal to close the connection.

Check the signal state with psig.
Try running with the alternate thread library (/usr/lib/lwp/libthread.so.1).

Try the libthread patch (108827-20) that fixes the following bugs:

(from 108827-10)

4368163 ypserv starts hundreds of ypserv processes all in defunct-status
4300228 threaded process grows tired of receiving signals

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

Post by Willam Lew » Fri, 12 Apr 2002 02:50:40



*snip*

Thanks for the info!

> Check the signal state with psig.
> Try running with the alternate thread library (/usr/lib/lwp/libthread.so.1).

This is the only thing we haven't tried yet.

> Try the libthread patch (108827-20) that fixes the following bugs:

We have already applied the recent patch set that includes this fix.
The problem still persists.

This process does not exhibit this behaviour under Solaris 2.6.  It
emerged when we moved to Solaris 8.  We have tried recompiling under
both operating systems and we still have this issue.  I will recommend
we try the alternate thread library.  Was there a difference in the
thread libraries between Solaris 2.6 and Solaris 8?

thanks,
Bill Lewis

 
 
 

Post by Casper H.S. Di » Fri, 12 Apr 2002 06:31:40



>We have already applied the recent patch set that includes this fix.
>The problem still persists.

What does "psig" say about the processes with these problems?

Did you try running against liblwp?

>This process does not exhibit this behaviour under Solaris 2.6.  It
>emerged when we moved to Solaris 8.  We have tried recompiling under
>both operating systems and we still have this issue.  I will recommend
>we try the alternate thread library.  Was there a difference in the
>thread libraries between Solaris 2.6 and Solaris 8?

Lots of bug fixes.

You're not using signal() by chance?

Casper

 
 
 

Post by Willam Lew » Sun, 14 Apr 2002 03:46:49




> Did you try running against liblwp?

We are now running the process against this library and everything
seems fine.  We have run for several days and it looks as if the
problem has cleared up.

Thanks again,
Bill Lewis

 
 
 

1. malloc()/free() hangs in a multi-threaded application

Our multi-threaded program is written in C and runs on Solaris 2.5.

It runs well most of the time, but occasionally (after a day, with >= 50
active threads) it hangs.  dbx shows most of the threads waiting for the
same system mutex lock in malloc() or free().

We do use mutex locks in our program, but they are apparently different
from the one malloc() and free() are waiting for.  I really can't figure
out what keeps the system mutex lock from being released.  If there is
not enough memory, why doesn't malloc() return?  Does anybody have a clue?
Thank you very much.

Ying Zhao

(dbx) where

=>[1] _lwp_sema_wait(0xe430dea0, 0xeea0be08, 0xef7467d8, 0x105, 0xeea0be08, 0xe720be08), at 0xef5b9a4c
  [2] _park(0xe430de08, 0x1, 0xe430de44, 0xe430dea0, 0xe430de38, 0x0), at 0xef726804
  [3] _swtch(0x1000, 0x8000, 0xe430de34, 0xe430de08, 0xe430de44, 0xe430de38), at 0xef7266a4
  [4] _mutex_suspend_lock(0xef613b40, 0x1, 0xfffeffff, 0xef613b4f, 0xef613b4e, 0x1), at 0xef727718
  [5] pthread_mutex_lock(0xef613b40, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xef727570
  [6] malloc(0x1388, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xef5cb1b4
      ....  our application functions ...

(dbx) where

=>[1] _lwp_sema_wait(0xe3d01ea0, 0xe720be08, 0xef7467d8, 0x105, 0xe720be08, 0xe3101e08), at 0xef5b9a4c
  [2] _park(0xe3d01e08, 0x1, 0xe3d01e44, 0xe3d01ea0, 0xe3d01e38, 0x0), at 0xef726804
  [3] _swtch(0x1000, 0x8000, 0xe3d01e34, 0xe3d01e08, 0xe3d01e44, 0xe3d01e38), at 0xef7266a4
  [4] _mutex_suspend_lock(0xef613b40, 0x1, 0xfffeffff, 0xef613b4f, 0xef613b4e, 0x1), at 0xef727718
  [5] pthread_mutex_lock(0xef613b40, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xef727570
  [6] free(0x15850e8, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xef5cc034
      ....  our own functions ...
