TruCluster Member Boot Failure

Post by dort » Sat, 28 Jun 2003 03:02:27



Hi All,
I really need help with this.  I have set up the first node of a Tru64
5.1A cluster without any issue at all.  There are to be only two nodes,
so I set the jumpers on the Memory Channel adapters to VH0 on node 1 and
VH1 on node 2.  I have also successfully set up a quorum disk.  The
systems are two 8400s.  One is populated with 8x625s and 16GB RAM
(node 1), the other with 4x440s and 5GB RAM (node 2).

I ran clu_add_member on the first node (after booting it from its
node-specific boot disk).  It went through the config wizard and told me
to boot my member 2 node with boot -file genvmunix <node 2 boot disk>.
The boot starts, and then after:

Starting CFS daemons
Registering CFS Services
Initializing CFSREC ICS Service
Registering CFSMSFS remote syscall interface
Registering CMS Services

I get:

cpu 2 halted
halt code =7
machine check while in PAL mode
PC =1c898

cpu 0 not halted
cpu 1 not halted
cpu 3 not halted

CPU 02 unexpected machine check through vector 0670
Processor machine check

I've tried:
Replacing the PCI shelf.
Replacing SCSI cables on the shared buses.
Updating firmware.
Switching PCI locations of the Memory Channel.
Patching the OS (and rebuilding the cluster).
Replacing memory/CPU/terminator boards.
Praying to all known and unknown religious beings.
Cursing.

I CANNOT get this second node up regardless of my actions.  What am I
doing wrong?  What causes this error always at the exact same
location?

I am now going to try to remove EVERYTHING from the PCI shelf except
the memory channels and try yet once again.....

Thank you

 
 
 

Post by dort » Wed, 02 Jul 2003 00:31:44


Thanks again to Tom Smith.

We were using MC 1.5 boards, one of which was defective.  We simply
swapped in two MC2 boards and the cluster came up with no problem at
all.
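
For anyone chasing a similar halt, it can help to confirm that every boot
attempt really dies on the same CPU, halt code and PC before swapping more
hardware.  The following is only a rough sketch, not a DEC/HP tool: the
names parse_halt, same_failure and SAMPLE are made up for illustration,
and the patterns assume the console text looks like the output quoted in
the original post.

```python
import re

# Console text copied from the original post (used here as sample input).
SAMPLE = """cpu 2 halted
halt code =7
machine check while in PAL mode
PC =1c898

cpu 0 not halted
cpu 1 not halted
cpu 3 not halted

CPU 02 unexpected machine check through vector 0670
Processor machine check
"""


def parse_halt(console_text):
    """Pull the halted CPU, halt code, PC and vector out of console text.

    Hypothetical helper: the regexes only assume the layout shown in SAMPLE.
    Fields that are not found are simply left out of the result.
    """
    info = {}
    # "cpu 2 halted" matches; "cpu 0 not halted" does not.
    m = re.search(r"cpu\s+(\d+)\s+halted", console_text, re.IGNORECASE)
    if m:
        info["halted_cpu"] = int(m.group(1))
    m = re.search(r"halt code\s*=\s*(\d+)", console_text, re.IGNORECASE)
    if m:
        info["halt_code"] = int(m.group(1))
    m = re.search(r"\bPC\s*=\s*([0-9a-fA-F]+)", console_text)
    if m:
        info["pc"] = m.group(1).lower()
    m = re.search(r"through vector\s+([0-9a-fA-F]+)", console_text)
    if m:
        info["vector"] = m.group(1).lower()
    return info


def same_failure(attempts):
    """Return True if every captured boot attempt parsed to the same fields."""
    parsed = [parse_halt(text) for text in attempts]
    return len(parsed) > 0 and all(p == parsed[0] for p in parsed)


if __name__ == "__main__":
    print(parse_halt(SAMPLE))
    print(same_failure([SAMPLE, SAMPLE]))
```

Feed it one captured console log per boot attempt.  If same_failure stays
True after boards and cables have been swapped, the fault is probably in
something that was never replaced, which in this thread turned out to be
the Memory Channel adapter.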




 
 
 
