Help!!!--The Master Daemon Died without any dump files.

Post by caro » Sun, 11 Nov 2001 00:52:05



Hi Everyone,

We recently upgraded our OS from Solaris 2.6 (32-bit) to 2.8 (64-bit).
Informix was also upgraded from 7.31.UC3 to 7.31.FC6. Since the
upgrade, our server has crashed many times. The log file shows the
following at the time of each crash:

19:01:33  Checkpoint Completed:  duration was 0 seconds.
19:05:05  The Master Daemon Died
19:05:05  PANIC: Attempting to bring system down

No dump file was generated. The Informix engineers said that they
cannot analyze the case without dump files.

I know that this happens when the main 'oninit' process dies or is
killed, but I don't know what could cause the main 'oninit' to go
down. Does anyone have any idea about the problem?

Thanks in advance.

 
 
 


Post by DJW » Tue, 13 Nov 2001 00:04:10


Perhaps the machine is being shut down without first shutting down Informix?
If so, you need to add an onmode -yuck to the machine's shutdown scripts.

Perhaps something or someone is killing it?

What does /var/adm/messages say around that time?
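
One way to answer the /var/adm/messages question is to pull out every syslog line stamped within a few minutes of the crash time shown in the Informix log. A minimal Python sketch, assuming the usual "Mon DD HH:MM:SS" syslog prefix; the function name and the five-minute window are illustrative choices, not anything from the thread:

```python
from datetime import datetime, timedelta


def lines_near(lines, crash_time, window_minutes=5):
    """Return syslog lines stamped within window_minutes of crash_time.

    Assumes each line starts with a 'Mon DD HH:MM:SS' prefix (15 characters).
    Syslog omits the year, so the year of crash_time is assumed.
    """
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        try:
            stamp = datetime.strptime(
                f"{crash_time.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # continuation line or unexpected format: skip it
        if abs(stamp - crash_time) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

It could be fed the lines of /var/adm/messages together with the date and time of the "Master Daemon Died" entry; an empty result would at least confirm that the OS logged nothing around the crash.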




 
 
 


Post by Alpac.. » Tue, 13 Nov 2001 11:42:04



Carol,

I faced a similar problem with IDS 7.31 on SCO Unix 5.0.5 on a Compaq
Proliant-5000 machine. It would happen every three weeks or so. All
three vendors (Informix, SCO, Compaq) pointed fingers at each other and
it never got resolved!!!

Good Luck.



 
 
 


Post by caro » Wed, 14 Nov 2001 00:03:38


No. The server was up and running at that time, and there is no
strange message in /var/adm/messages. I also don't think that it was
killed by anyone.

This never happened before the OS upgrade. Since the upgrade, the
server has crashed many times, sometimes as often as three times a
day. Because there were no dump files and no alerts, we couldn't even
get paged. It's one of our highest-severity production servers, and we
really don't know what to do now.

For reference, here is the /etc/system file:
 set rlim_fd_max=65536
 set enable_sm_wa = 1
 set shmsys:shminfo_shmmax=536870912
 set semsys:seminfo_semmap=64
 set semsys:seminfo_semmni=4096
 set semsys:seminfo_semmns=4096
 set semsys:seminfo_semmnu=4096
 set semsys:seminfo_semume=100
 set semsys:seminfo_semmsl=500
 set shmsys:shminfo_shmmin=100
 set shmsys:shminfo_shmmni=500
 set shmsys:shminfo_shmseg=100
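
To make these settings easier to compare against the release-note minimums, they can be parsed into a table. A minimal Python sketch, assuming only the "set name=value" form used in the listing; the function name is made up for illustration:

```python
# Settings copied from the /etc/system listing in this post.
ETC_SYSTEM = """\
set rlim_fd_max=65536
set enable_sm_wa = 1
set shmsys:shminfo_shmmax=536870912
set semsys:seminfo_semmap=64
set semsys:seminfo_semmni=4096
set semsys:seminfo_semmns=4096
set semsys:seminfo_semmnu=4096
set semsys:seminfo_semume=100
set semsys:seminfo_semmsl=500
set shmsys:shminfo_shmmin=100
set shmsys:shminfo_shmmni=500
set shmsys:shminfo_shmseg=100
"""


def parse_etc_system(text):
    """Parse 'set name=value' lines into a dict of integers.

    Blank lines, '*' comment lines, and non-'set' lines are skipped.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*") or not line.startswith("set "):
            continue
        name, _, value = line[4:].partition("=")
        settings[name.strip()] = int(value.strip(), 0)  # base 0 also accepts 0x...
    return settings


settings = parse_etc_system(ETC_SYSTEM)
# shmmax is given in bytes; convert to MB for readability.
print(settings["shmsys:shminfo_shmmax"] // 2**20)  # prints 512
```

So the largest single shared memory segment allowed here is 512 MB.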

Any information from you will be much appreciated.



 
 
 

1. Master Daemon Died, then setting SHMTOTAL hung instance

Hi folks,

This little gem, "Master Daemon Died", has recently afflicted one of
our Linux machines. It appears that when this happens it does not dump
any analysis files. The only clue I have is that for the preceding hour
the instance grabbed more and more shared memory, until it grabbed one
segment too many (at least, the crash happened 1 second after the last
grab). The final grab meant that it was trying to operate with approx.
160 MB.

I've trawled the archive, and although others have had similar problems,
no one seems to have posted a reason or a solution.

The machine is a Compaq with 128 MB memory, 135 MB swap, a single CPU and
a 20 GB disk. We are running SuSE 8.1 Linux and IDS 7.31.UD2. The
kernel parameters are all greater than or equal to the release note
requirements.

I have tried to re-create the scenario but have failed to do so,
in spite of running a series of applications in various combinations.

Does anyone know what is happening and/or have any advice on how
to avert a recurrence?

In an attempt to limit the amount of memory that could be grabbed, I
tried setting SHMTOTAL to allow only 2 extra virtual segments to be
grabbed (the calculation included the initial R, V and M class sizes).
This had the unhappy effect of hanging the whole instance, to the
extent that onmode -ky did not take effect. I was eventually forced to
kill the oninit(s) to bring the system off-line.
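
For anyone wanting to repeat the SHMTOTAL calculation described above, here is a minimal Python sketch. It assumes all sizes are in KB and that each additional virtual segment is SHMADD in size; the function name and the numbers in the example are hypothetical and are not taken from this instance:

```python
def shmtotal_kb(resident_kb, virtual_kb, message_kb, shmadd_kb, extra_segments=2):
    """Cap = initial R, V and M segment sizes plus N additional virtual segments.

    All arguments are sizes in KB (an assumption of this sketch).
    """
    return resident_kb + virtual_kb + message_kb + extra_segments * shmadd_kb


# Hypothetical sizes, for illustration only.
print(shmtotal_kb(resident_kb=20000, virtual_kb=8192, message_kb=512, shmadd_kb=8192))
# prints 45088
```

The real initial segment sizes would have to be read from the running instance (for example from onstat -g seg) rather than guessed.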

Is this 'hanging' a known issue with setting SHMTOTAL on Linux?

TIA, Tim

2. VB6 select with a function

3. Master Daemon died ??

4. Iif (OLAP)

5. Help: DB Daemon died abnormally...

6. TableDef.RecordCount problem

7. Help, PLEASE: Runtime Error 3173 (Die, Die, Die)

8. forcing a query to use a index from VBA

9. d4gl - html daemon dying

10. Why does the Informix OnLine daemon die?

11. IDS Network Daemon apparently "died"

12. Universe - rpc daemon dying

13. Sql-Database restore without a dump-file