crash dump

Post by Phillip Helbig---remove CLOTHES to rep » Thu, 26 Jun 2003 07:54:33



MACHINECHK, Machine check while in kernel mode

What is a possible cause of this?

One of my ALPHAs crashed on me.  I was 500 km away, logged in remotely.
It came back up automatically, after just the normal reboot time; the
TCPIP cluster alias failed over to a VAX in the cluster and so I could
reconnect right away. (Nice to see that something I have never tested
live---a hard crash like this, with automatic restart of all software I
want running, TCPIP failover etc---actually works as planned!)

This is a 255/233 running 7.2-1, soon to be upgraded.  (Before I do the
upgrade, I want to bring a third voting member into the cluster to
eliminate downtime (and learn some new things), thus the other thread on
DSSI allocation classes.)

The machine has been running almost continuously for over 6 years; I
think this is the first "internal" crash I've had (as opposed to reboots
caused by power failures etc).

The machine is back to normal now.

Another question: with the cluster alias pointing to the ALPHA, FTP,
both normal and anonymous, works fine.  With it pointing to the VAX,
normal FTP works fine, but anonymous doesn't.  The SYSUAF is not on the
system disk; a logical points to it on both systems.  (Presumably, this
has nothing to do with the cluster alias, but is some difference between
the machines, which I just noticed when the cluster alias shifted from
ALPHA to VAX.)

I get

   %RMS-E-PRV, insufficient privilege or file protection violation

in DISK$USER:[ANONYMOUS]TCPIP$FTP_SERVER.LOG.

The account itself, file protections etc seem OK.  They ARE the same as
when the login goes via the ALPHA.

The LOGIN.COM DOES get executed, and the login gets recorded (last login
in AUTHORIZE, for example).

At the other end, I get:

331 Guest login OK, send ident as password.
Password:
530 Login incorrect.
%TCPIP-E-FTP_LOGREJ, login request rejected
425 Session is disconnected.

Any ideas?
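
For anyone reading the client-side transcript above: the three-digit codes follow the standard FTP convention (RFC 959), where the first digit gives the class of the reply. Below is a minimal Python sketch that sorts the quoted lines by class. The function name `classify_reply` and the wording of the class labels are illustrative assumptions, not part of any TCPIP Services tool.

```python
import re

# Meaning of the first digit of an FTP reply code (RFC 959, section 4.2).
# The label wording here is an illustrative paraphrase.
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative",
    "5": "permanent negative",
}


def classify_reply(line):
    """Return (code, reply_class, text) for an FTP reply line, else None."""
    match = re.match(r"^(\d{3})[ -](.*)$", line.strip())
    if match is None:
        # Not a server reply, e.g. a prompt or a client-side %TCPIP message.
        return None
    code, text = match.groups()
    return int(code), REPLY_CLASSES.get(code[0], "unknown"), text


# The lines quoted in the post above.
transcript = [
    "331 Guest login OK, send ident as password.",
    "Password:",
    "530 Login incorrect.",
    "%TCPIP-E-FTP_LOGREJ, login request rejected",
    "425 Session is disconnected.",
]

for line in transcript:
    print(line, "->", classify_reply(line))
```

Read this way, the 331 shows the server accepted the guest user name and asked for a password, and the 530 is a permanent refusal at the login step. That suggests the failure happens on the server side after the user name is recognised, which would fit the %RMS-E-PRV entry in the server log.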

 
 
 


Post by John Travel » Thu, 26 Jun 2003 09:08:10




> MACHINECHK, Machine check while in kernel mode

> What is a possible cause of this?

> One of my ALPHAs crashed on me.  I was 500 km away, logged in remotely.
> It came back up automatically, after just the normal reboot time; the
> TCPIP cluster alias failed over to a VAX in the cluster and so I could
> reconnect right away. (Nice to see that something I have never tested
> live---a hard crash like this, with automatic restart of all software I
> want running, TCPIP failover etc---actually works as planned!)

> This is a 255/233 running 7.2-1, soon to be upgraded.  (Before I do the
> upgrade, I want to bring a third voting member into the cluster to
> eliminate downtime (and learn some new things), thus the other thread on
> DSSI allocation classes.)

> The machine has been running almost continuously for over 6 years; I
> think this is the first "internal" crash I've had (as opposed to reboots
> caused by power failures etc).

> The machine is back to normal now.

A machine check is a HARDWARE detected error.
The evidence will be in the errorlog. I cannot remember offhand if the 255
is one of the machines supported in ERF, or whether it needs DECevent.
In any case the commands are similar...
$ analyse/error/include=(cpu,mac,mem) sys$errorlog:errlog.sys
$ diag/include=(cpu,mac,mem) sys$errorlog:errlog.sys

Read through the plain text comments on the errorlog entries. The cause of
the crash may, or may not, be obvious. If not, post the relevant errorlog
entries here. Someone, if not me, will give you an answer.

--
John Travell
VMS crashdump expertise for hire

http://www.travell.uk.net/


 
 
 


Post by Paul Stu » Fri, 27 Jun 2003 08:55:43



> MACHINECHK, Machine check while in kernel mode

> What is a possible cause of this?

> One of my ALPHAs crashed on me.  I was 500 km away, logged in remotely.

The hottest June in 250 years here. My car registered 34.5 degrees C most
of the way home. It normally drops a couple of degrees from the initial
reading once I get it moving, but not today.

Since we have been seeing Apple in the news this week with the "first
64 bit desktop" (anyone want a 5 year old piccie of me running a 64 bit
desktop?), it is also time to say that Macs were shutting themselves
off yesterday because of the heat.

Time to look at the hardware specs:

Mac max operating temp: 35 C  (all models we looked at)
Alpha DS10 max operating temp: 40 C

:-)

 
 
 


Post by Phillip Helbi » Fri, 27 Jun 2003 18:34:44


> > MACHINECHK, Machine check while in kernel mode

> > What is a possible cause of this?

> > One of my ALPHAs crashed on me.  I was 500 km away, logged in remotely.

> The hottest June in 250 years here. My car registered 34.5 degrees C most
> of the way home. It normally drops a couple of degrees from the initial
> reading once I get it moving, but not today.

That was my first thought.  If this were the cause, however, isn't it a
bit strange that it came back up immediately---and stayed up?

A VAXstation in the same cluster didn't reboot.

Time to upgrade VMS so that I can get the temperature of the CPU via a
lexical function and write that into my status logs.  :-)

 
 
 

1. Seen in one of my older crash dumps...

This one I've just seen in one of my (older) crash dumps

  Time of system crash: 24-MAY-**** 18:15:48.99<NUL>
  VMScluster node: MARS, a AlphaServer 2100 5/250
  CPU 00 reason for Bugcheck: LOCKMGRERR, Error detected by Lock Manager
  Process currently executing on this CPU:   None
  Abs time of last event   00000000    BUFIO byte count/limit          0/0

Note the date.

just curious

-Peter

PS: this was a crash of my bootserver forced by a fast ethernet switch going
nuts. I was quite surprised to see not only the satellites go to the nirvana
but also this (FDDI based) AlphaServer.
--
Peter "EPLAN" LANGSTOEGER           Tel.    +43 1 81111-2651
Network and OpenVMS system manager  Fax.    +43 1 81111-888

A-1121 VIENNA  AUSTRIA              "I'm not a pessimist, I'm a realist"
