Hello,
We are experiencing problems with one of our SCO systems and I am hoping
someone here has some ideas.
This description is kinda long so please bear with me :).
System: Compaq Proliant 4000, 2 x 586/66's, 196MB RAM
Proliant Storage system with two mirrored 2GB SCSI drives.
Computone Intelliport II EX
SCO Open Server Network 3.0
SCO MPX
Compaq EFS 1.9.1
Background:
The server is being used to run Medical Manager (a medical database/management
program with two datasets) and WordPerfect 5.1. We have 8 remote sites
which use PCs and TCP/IP to connect (~45 users).
None of the system resources (tables, memory, streams, etc...) are even close
to being used up.
The system had been running for 3-4 months. We occasionally got a locked PC
when trying to FTP. This is a known bug and is supposedly fixed with
SLS NET382E. We then started to have system problems, due perhaps to
increased usage?
System Lockups:
The system froze, totally locked, no errors. Hoping that it was just an
isolated incident, we rebooted. Within a day, the system went down again,
this time with a PANIC: lock timeout. I immediately found a reference to
'lock timeout panic' on CompuServe. This is a known bug with fast
machines and MPX. I found SLS UOD393C which is supposed to fix/upgrade MPX.
I brought the system down and applied NET382E (to fix PC FTP lockups) and
UOD393C (MPX fix). After bringing the system back up, it started working
erratically. The second Medical Manager dataset kept locking. Users in
the first dataset and WP5.1 were unaffected. No errors were found, but
the second dataset continued to lock (every 15-20 minutes).
To eliminate possible hardware failure, I swapped out the Proliant with an
identical Proliant (didn't swap hard drives). The problems continued.
I brought down the system and removed NET382E and UOD393C. The system started
working normally again.
I crossed my fingers and prayed :).
The system ran OK for a couple of weeks and then we had another
'lock timeout panic', followed later by a total system freeze. Keep in mind
that this is still the 2nd Proliant, which rules out any hardware failures.
I again researched this 'lock timeout panic' and found a manual fix
(IT script). I followed the instructions and using _fst, patched a kernel
driver. I relinked the kernel and cautiously relaxed.
The system ran several days and then another 'lock timeout panic' occurred :(.
This time there was also an error indicated by the Compaq health daemon:
! NMI - Automatic Server Recovery timer expiration - Hour 13 - 7/7/95
! ASR detected by ROM - Hour 13 - 7/7/95
The UOD393C sounds like it should fix the problems, but I am afraid to try it
again. I am not certain that I installed it in the correct order. Here is
the order used:
SCO -> MPX -> Compaq EFS 1.9.1 -> Computone Driver -> UOD393C
Unfortunately, the damn system is becoming critical and I can't afford to
experiment too much.
I posted a similar message on CompuServe without much success.
I am not sure where to turn next, perhaps a call to SCO support :(.
Any ideas????
Thanks, Mike