SunOS 5.6 crashes

Post by Graham C. Wellin » Tue, 21 Sep 1999 04:00:00



We are setting up Network Management Systems using Sun Ultra 5s
with Solaris 2.6 (SunOS 5.6) and HP OpenView 5.1, and have found that
the machines will occasionally lock up. This has occurred on two
separate machines which are due to be sent to our customer next week.
We have set up several similar machines in the past but this is the
first time we have experienced this problem, although because of other
delays this is also the first time we've done extensive long-term testing.
One other piece of info is that we have to install a memory upgrade
(purchased from Sun) and a SCSI card (also purchased from Sun) before
we install the OS (the Ultra 5s come with 2.7), HP OpenView, patches
and our application.

The "crash" appears to occur as a window is being moved across the
desktop or is being closed. After the crash the system does not
respond to any keyboard or mouse input. We can telnet to the machine
and can observe that the Xsun process is hogging the CPU (approx 99%).
From the telnet session we can usually perform an orderly shutdown
using sync; sync; reboot, but on a few occasions we've had to power
cycle, although this is usually after the machine has been in the
locked-up state for several hours. We thought the culprits might be:
1) a bad install of the RAM DIMM and/or SCSI card,
2) running the "top" utility to observe process activity,
3) something to do with HP OpenView.

We think (1) is unlikely since it's happening on both machines. We thought (2)
was it until we had a crash when top wasn't running. It could be (3), as the
HPOV background processes are running all the time,
but the question remains: why does the Xsun process's CPU utilisation
go through the roof? Xsun normally occupies less than 0.5% of the CPU.
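
For anyone wanting to catch the runaway Xsun from a telnet session without keeping top running, here is a minimal Python sketch. It assumes the process list has been captured as text with three columns (PID, %CPU, command), for example from something like `ps -e -o pid,pcpu,comm`. The function name, the 90% threshold and the sample listing are all made up for illustration and are not taken from the affected machines.

```python
def find_cpu_hogs(ps_output, threshold=90.0):
    """Return (pid, pcpu, command) tuples whose %CPU is at or above threshold.

    Assumes three whitespace-separated columns: PID, %CPU, COMMAND,
    with a single header row at the top.
    """
    hogs = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # ignore malformed or truncated rows
        try:
            pid = int(fields[0])
            usage = float(fields[1])
        except ValueError:
            continue  # non-numeric PID or %CPU column
        if usage >= threshold:
            hogs.append((pid, usage, fields[2]))
    return hogs


# Hypothetical sample listing, shaped like the symptom described above.
SAMPLE_PS = """\
  PID %CPU COMMAND
  312 99.1 /usr/openwin/bin/Xsun
  401  0.3 /opt/OV/bin/ovspmd
  527  0.1 top
"""

if __name__ == "__main__":
    for pid, usage, command in find_cpu_hogs(SAMPLE_PS):
        print(pid, usage, command)
```

Run periodically from a remote session, a check like this would at least timestamp when Xsun starts spinning, which may help correlate the lockup with window activity or with HPOV events.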

Anyone else out there seen this/know what it is?
Thanks,

 
 
 

SunOS 5.6 crashes

Post by Rick Stikker » Tue, 21 Sep 1999 04:00:00




We had a very similar problem.  It disappeared when we installed patch
105362-19.

    Rick
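
To check whether a machine already has the patch level mentioned above, the installed patch list can be scanned for 105362 at revision 19 or later. The Python sketch below assumes the list was saved as text in the usual `showrev -p` layout, one `Patch: NNNNNN-RR ...` entry per line. The function names and the sample listing (including the placeholder patch 999999-01 and package name) are hypothetical.

```python
import re

# Matches the start of a line such as "Patch: 105362-19 Obsoletes: ..."
PATCH_RE = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b")


def installed_revision(showrev_output, patch_id):
    """Return the highest installed revision of patch_id, or None if absent."""
    best = None
    for line in showrev_output.splitlines():
        match = PATCH_RE.match(line.strip())
        if match and match.group(1) == patch_id:
            rev = int(match.group(2))
            if best is None or rev > best:
                best = rev
    return best


def has_patch(showrev_output, patch_id, min_rev):
    """True if patch_id is installed at revision min_rev or later."""
    rev = installed_revision(showrev_output, patch_id)
    return rev is not None and rev >= min_rev


# Hypothetical sample listing; not taken from a real system.
SAMPLE_SHOWREV = """\
Patch: 999999-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWexample
Patch: 105362-07 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWexample
"""

if __name__ == "__main__":
    print(has_patch(SAMPLE_SHOWREV, "105362", 19))
```

In this made-up listing the patch is present only at revision 07, so the check reports that the machine is below the level Rick describes.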

 
 
 

1. SunOS 5.6/SUNW,Ultra-2: fails to core dump after crash

I have a SUNW,Ultra-2 that keeps crashing. Here is the console
transaction:

{1} ok go
Fast Data Access MMU Miss
{1} ok sync
panic[cpu1]/thread=0x30043e80: zero
syncing file systems...BAD TRAP: cpu=1 type=0x31 rp=0xfffb18d8 addr=0x10 mmu_fsr=0x0
sched: trap type = 0x31
addr=0x10
pid=0, pc=0x10058208, sp=0xfffb1968, tstate=0x4480001e06, context=0x0
g1-g7: 300, 1042ff94, 850000000ea56636, 22, f004052c, fffeff68, 30043e80
panic[cpu1]/thread=0x30043e80: trap
10841 static and sysmap kernel pages
   65 dynamic kernel data pages
  716 kernel-pageable pages
    0 segkmap kernel pages
    0 segvn kernel pages
    0 current user process pages
11622 total pages (11622 chunks)

dumping to vp 601b690c, offset 863744
BAD TRAP: cpu=1 type=0x31 rp=0xfffb1110 addr=0x0 mmu_fsr=0x0
BAD TRAP occurred in module "sd" due to an illegal access to a user address.
sched: trap type = 0x31
pid=0, pc=0x601c3118, sp=0xfffb11a0, tstate=0x1e05, context=0x0
g1-g7: 602252d0, 6006e9c8, c3, 3, f004052c, fffeff68, 30043e80
panic[cpu1]/thread=0x30043e80: trap
Dump Aborted.
{1} boot
....
Sun Ultra 2 UPA/SBus (2 X UltraSPARC-II 296MHz), No Keyboard
OpenBoot 3.7, 768 MB memory installed, Serial #9430213.
Ethernet address 8:0:20:8f:e4:c5, Host ID: 808fe4c5.
....

The boot succeeds, but now savecore has nothing to grab...

Any clues appreciated...

   _____________________________________________________________________

  | Systems Programmer          University of California, Davis         |
  | Unix Specialist             BCH Technical Services                  |
