OSF: System hang - No error messages

Post by Gary L. Burno » Sat, 09 Sep 1995 04:00:00



Has anyone experienced a total system hangup, including external TCP/IP
connections, with no error messages in OS5?

Only solution is to do a cold reboot. (Gaaaaak)

 
 
 

Post by Larry Phil » Tue, 12 Sep 1995 04:00:00



> Has anyone experienced a total system hangup, including external TCP/IP
> connections, with no error messages in OS5?

Only once.  It was caused by a hardware flaw in a graphics
card (had to track that one down with a logic analyzer).

> Only solution is to do a cold reboot. (Gaaaaak)

I hate to say it, but it is likely hardware.  One way to make that
diagnosis more solid ...

Edit the file /etc/conf/pack.d/pit/space.c, and change the value of the
variable "num_watchdog_ticks" from 0 to 0xff, then relink the kernel
and reboot.
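
For illustration, the edit described above might look like the following.
This is only a sketch: the actual declaration in space.c (its type and any
surrounding comments) is not shown in this thread, so the `int` type here is
an assumption.

```c
/* Excerpt from /etc/conf/pack.d/pit/space.c (declaration form is assumed). */

/* Was 0, which leaves the NMI failsafe (watchdog) timer off.
 * 0xff turns it on for the duration of the test. */
int num_watchdog_ticks = 0xff;
```

After saving the change, relink the kernel and reboot for it to take effect,
then restore the value to 0 once the test is done.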

If the system hangs in software (for any reason), you should get an NMI
failsafe timer interrupt about 60ms after the hang. This should produce
a panic dump, which should point a finger at the offending piece of
software.
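
The idea behind a failsafe timer can be modeled in a few lines of C. This is
a toy simulation, not SCO kernel code: the struct, the function names, and
the tick-by-tick loop are all hypothetical, and it makes no claim about how
real timer ticks map to the 60ms figure.

```c
#include <stdbool.h>

/* Hypothetical model of a failsafe (watchdog) timer. */
struct watchdog {
    unsigned reload;    /* ticks allowed between resets; 0 disables it */
    unsigned remaining; /* ticks left before the NMI would fire */
};

void watchdog_init(struct watchdog *w, unsigned reload)
{
    w->reload = reload;
    w->remaining = reload;
}

/* Called by healthy software, e.g. from a periodic clock handler. */
void watchdog_reset(struct watchdog *w)
{
    w->remaining = w->reload;
}

/* Advance one hardware tick; returns true when the NMI would fire. */
bool watchdog_tick(struct watchdog *w)
{
    if (w->reload == 0)
        return false;           /* watchdog disabled */
    if (w->remaining > 0)
        w->remaining--;
    return w->remaining == 0;
}

/* Run up to `ticks` ticks. Software resets the watchdog on every tick
 * before `hang_at`, then stops (a simulated software hang).
 * Returns the tick at which the NMI fires, or -1 if it never does. */
int simulate(unsigned reload, int hang_at, int ticks)
{
    struct watchdog w;
    watchdog_init(&w, reload);
    for (int t = 0; t < ticks; t++) {
        if (t < hang_at)
            watchdog_reset(&w);
        if (watchdog_tick(&w))
            return t;
    }
    return -1;
}
```

With a nonzero reload value, `simulate` reports the tick at which the
watchdog would fire once the software stops resetting it; with a reload of 0
it never fires. A hardware hang is outside this model.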

If it is a hardware problem (in my case the graphics card botched the
bus protocol and hung the system bus), then the system will (again) just
sit there.  In this case, all you can do is start swapping peripherals
until the problem goes away :-(.

Larry

P.S.  Remember to set the num_watchdog_ticks variable back after this
      test.  Turning it on will cost you a little bit of performance.

 
 
 

Post by Gary L. Burno » Wed, 13 Sep 1995 04:00:00


{ > Only solution is to do a cold reboot. (Gaaaaak)

{ I hate to say it, but it is likely hardware.  One way to make that
{ diagnosis more solid ...

Are you saying OS5 is more touchy than 3.2.4.2?  This _NEVER_ happened
before OS5.  (I used to use 3.2.4.2 and ODT3)

{ Edit the file /etc/conf/pack.d/pit/space.c, and change the value of the
{ variable "num_watchdog_ticks" from 0 to 0xff, then relink the kernel
{ and reboot.

Well I'll try it. :)

--

----------------------------------------------------------------------------
                  Finger me and I'll break your finger.
----------------------------------------------------------------------------
Gary L. Burnore
DataBasix
Santa Clara, CA
============================================================================

 
 
 

Post by Larry Phil » Thu, 14 Sep 1995 04:00:00




> { > Only solution is to do a cold reboot. (Gaaaaak)

> { I hate to say it, but it is likely hardware.  One way to make that
> { diagnosis more solid ...

> Are you saying OS5 is more touchy than 3.2.4.2?  This _NEVER_ happened
> before OS5.  (I used to use 3.2.4.2 and ODT3)

I don't know if "touchy" is the right word.  The graphics card I had a
problem with worked fine under 3.2v4.2 and hung the system under 5.0.
The reason for that was that the X server in 5.0 is faster, and drove
the graphics card harder.  3.2v4.2 wasn't able to push the card hard
enough to break it, 5.0 was.

Larry

 
 
 

Post by Gary L. Burno » Thu, 14 Sep 1995 04:00:00


{ I don't know if "touchy" is the right word.  The graphics card I had a
{ problem with worked fine under 3.2v4.2 and hung the system under 5.0.
{ The reason for that was that the X server in 5.0 is faster, and drove
{ the graphics card harder.  3.2v4.2 wasn't able to push the card hard
{ enough to break it, 5.0 was.

Well, Larry, when you're right, you're right (sort of, I think).  Before I
bothered to change the kernel parameter you suggested, and after one of
the many hangs, I removed the system case and reseated all of the cards
and chips.  It hasn't crashed since.

Thanks!

Gary [hopes it doesn't crash now that he's said it didn't crash] Burnore

--

----------------------------------------------------------------------------
                   How you look depends on where you go.
----------------------------------------------------------------------------
Gary L. Burnore
DataBasix
Santa Clara, CA
============================================================================

 
 
 

1. OSF: System hang - No error messages

Well, I have a machine which is locking up in this described fashion,
plus when it locks up like this, sometimes pressing the 'enter' key can
cause a reboot!

No warnings, nothing in /usr/adm/messages.

However, I did get this out of crash:

dumpfile = /dev/swap, namelist = /unix, outfile = stdout

mem: total = 16000k, kernel = 2760k, user = 13240k

Panic String: swapseg - Swap %s error %d on swapdev %s (%u/%u)

Unfortunately, I'm no wiser now than I was before getting this information...

        /                           *  Senior Engineer, SESI
     * /_  _  _  _  * .             *  (904)884-2442 or DSN 579-2442
    /_/ (_(/_/<_/<_/_/)_            *  The person who won't read has

2. Brother HL-10h printer

3. NFS, lockd error messages and system hang

4. kwintv in gnome

5. System starting hanging, no error messages, 5.0.4 SCO

6. /lib/modules/2.0.34/misc/lp.o: init_module: Device or resource busy

7. System Hangs - no error in messages file

8. libc / libg++ upgrade troubles

9. Mklinux/Syjet: Recoverable error messages hang system

10. Hanging OSF/1 System

11. Error message continuous error messages

12. System hangs with "NOTICE: ledma1: E_ERR_PEND" message

13. linux hangs at ide0 without an error message