Application takes 100% CPU on multiprocessor when Leaving a critical section and throwing an exception

Post by Zaltman Blero » Fri, 01 Nov 2002 02:30:33



I was wondering if anyone has ever seen this problem on a multiprocessor (NT
Server 4):
- One thread is in a critical section (EnterCriticalSection was called).
- It checks whatever needs the protected variable.
- It calls LeaveCriticalSection.
- It throws an exception (that is caught by the caller).

On a multiprocessor, the call to LeaveCriticalSection sometimes takes 4-5
seconds to complete, and while it's completing, the process uses about
100% of the CPU.  The funny thing is that if I add a Sleep(1) between the
call to LeaveCriticalSection and the throw, everything works fine.

Any idea what could cause that?  Are there known issues with exceptions and
LeaveCriticalSection?

OK thanks,
  Zaltman

 
 
 

Post by Ken Wickes [MS » Fri, 01 Nov 2002 06:44:23


Have you broken in with the debugger and seen what is going on?

--
This posting is provided "AS IS" with no warranties, and confers no rights.


Post by Zaltman Blero » Fri, 01 Nov 2002 22:07:50


I couldn't run a debugger on that machine (because of political/client
issues), so I had to use traces to pinpoint the problem.  One thing that I
forgot to mention is that the process is set to run on the first CPU only
(AffinityMask).  The problem can only be reproduced under a heavy load (with
many clients communicating over TCP).  Here's the pseudo code of the two
problematic threads:

Thread 1
void AlertPause( DWORD dwNbSec )
{
    while ( dwNbSec_not_reached() )
    {
        EnterCriticalSection( &gCS );  // gCS is a global CS in this case
        if ( gbIsAlerted )  // gbIsAlerted is a global var in this case
        {
            LeaveCriticalSection( &gCS );  // (#1)
            // The Sleep(1) was added here
            throw Alerted();  // Caught by the caller in a try/catch
        }
        LeaveCriticalSection( &gCS );  // (#2)
        Sleep_a_bit();
    }
}

Thread 2
void Alert()
{
    EnterCriticalSection( &gCS );
    gbIsAlerted = true;
    LeaveCriticalSection( &gCS );  // (#3)
}

Under heavy load, the call to LeaveCriticalSection (#3) can take 4-5
seconds.  The OS is NT Server 4 SP6.
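
For comparison, the flag check can be written so that there is a single release path and the lock is never held when the exception is thrown. This is only a minimal sketch under stated assumptions: std::mutex is used as a portable stand-in for the CRITICAL_SECTION gCS, and gLock, IsAlerted and ThrowIfAlerted are hypothetical names that do not appear in the real code. It tidies up the locking pattern; it is not claimed to remove the delay seen in the traces.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical exception type standing in for the Alerted() thrown above.
struct Alerted : std::runtime_error {
    Alerted() : std::runtime_error("alerted") {}
};

std::mutex gLock;          // assumption: portable stand-in for gCS
bool gbIsAlerted = false;  // same role as the global flag in the pseudo code

// Reads the flag under the lock. The guard's destructor releases the
// lock on every exit path, so there is no separate (#1)/(#2) release.
bool IsAlerted() {
    std::lock_guard<std::mutex> guard(gLock);
    return gbIsAlerted;
}

// Thread 2: sets the flag under the lock.
void Alert() {
    std::lock_guard<std::mutex> guard(gLock);
    gbIsAlerted = true;
}

// Thread 1: the lock has already been released when the throw happens.
void ThrowIfAlerted() {
    if (IsAlerted()) {
        throw Alerted();
    }
}
```

In the polling loop, ThrowIfAlerted() would take the place of the Enter/check/Leave/throw sequence, followed by the existing wait.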

  Zaltman



Post by Ken Wickes [MS » Sat, 02 Nov 2002 05:11:29


I almost wonder if the loop in AlertPause is spinning so fast that it is
starving other threads. What you really want here is
SetEvent()/WaitForSingleObject().

Another possibility is that when you call LeaveCriticalSection while another
thread is waiting on that critsec, the system switches to that thread, so it
may be a while before you see the trace.
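
The event-based suggestion can be sketched roughly as follows. To keep the example portable, it assumes std::condition_variable as a stand-in for a manual-reset Win32 event: Set() plays the role of SetEvent() and WaitFor() the role of WaitForSingleObject() with a timeout. AlertEvent and its members are hypothetical names, not part of the code discussed in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <stdexcept>

// Hypothetical exception type standing in for the Alerted() from the thread.
struct Alerted : std::runtime_error {
    Alerted() : std::runtime_error("alerted") {}
};

// Assumption: portable stand-in for a manual-reset event.
class AlertEvent {
public:
    // Analogous to SetEvent(): record the alert and wake every waiter.
    void Set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            alerted_ = true;
        }
        cv_.notify_all();
    }

    // Analogous to WaitForSingleObject(hEvent, timeout):
    // returns true if the event was set, false if the timeout expired.
    bool WaitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return alerted_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool alerted_ = false;
};

// Blocks for up to `pause` without polling; throws Alerted if Set()
// was called before or during the wait.
void AlertPause(AlertEvent& ev, std::chrono::milliseconds pause) {
    assert(pause.count() >= 0);
    if (ev.WaitFor(pause)) {
        throw Alerted();  // WaitFor has returned, so the lock is released
    }
}
```

With this shape the waiting thread sleeps inside the wait instead of looping over Enter/Leave, and the alerting thread only has to call Set().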

Post by Zaltman Blero » Sat, 02 Nov 2002 23:16:49


OK, the second comment could explain why the trace tells me that
LeaveCriticalSection took so long.  In AlertPause, we already do a
WaitForSingleObject on a semaphore (not an event); the Sleep_a_bit()
function represented that.  I'm just wondering why the Sleep() call between
LeaveCriticalSection() #1 and the throw seems to fix the problem.

OK, thanks for the help,
  Zaltman



Post by Ken Wickes [MS » Sun, 03 Nov 2002 04:38:07


Well, Sleep() would cause an immediate thread switch to the next ready
thread, so that might have an effect.


1. MessageBox() does not show and takes 100% cpu

I'm running into some strange behavior with a message box and I was hoping
someone could help me out.
I have a simple Win32 GUI application -- no MFC or other libraries.  In my
WinMain function, I register my main window and create it (hidden).  I then
open a non-modal dialog box, check some values, and possibly call
MessageBox.  Finally, I return and start my main loop.  The dialog box shows
up fine, but if I ever have to call MessageBox before starting my main
loop, the message box is never displayed.  (I have not tried calling it after
my main loop).  I've tried calling it with a parent handle of NULL, the
dialog box's handle, and the main window's handle, all with the same
result.  The MessageBox function sounds the bell and blocks, and the
dialog box loses focus, but I never see the message box.  I've tried running
PeekMessage() loops -- didn't help.  Also, looking at the process monitor,
the program is consuming 100% of the CPU.

    What's even stranger is that one time, while I was clicking between
windows on my desktop, the message box showed up all of a sudden.  That only
happened once.

    This is driving me nuts.  Any help would be greatly appreciated.  Below
is some pseudo code to paint a clearer picture.

WinMain()
{
    parseCmdLine();
    RegisterClass();
    hParent = InitInstance(SW_HIDE);
    StartProcess(cmdLine);

    while (GetMessage()) {
        DispatchMessage();
    }
}

InitInstance(cmdShow) {
    return CreateWindow(SW_HIDE);
}

StartProcess(cmdLine) {
    CreateDialog(hParent);

    if (error) {
        /* Does not display the MessageBox! */
        MessageBox(hParent, "Error", "Error", MB_OK);
        return;
    }
}

2. how to reset eeprom in ipx?

3. fclose() taking 2 or more seconds - cpu at 100%

4. Looking for author of tapeBIOS

5. TAPI 2.1 taking up 70-100% of CPU even with hotfix

6. Sendmail 8.12.8-1.80 on RH8.0 (DNS??)

7. Application uses 100% of CPU

8. help on vectorization

9. HELP - recieve exception because system critical section was not init

10. Exceptions using Critical Sections on NT 3.51

11. Throwing exceptions in dtors called by thrown exceptions

12. 100% Processor Time Taken

13. CPU @ 100% even if filter graph stopped ?