I have an ancient Sun 386i which is used to booting off SCSI target 2,
partition a (/dev/sd2a). Up until recently, a similarly ancient 91 MB
Wren III half-height SCSI hard disk performed this function. I hang
my auxiliary (non-system) disk off /dev/sd0c. The OS is SunOS 4.0.2.
Yeah .. I know .. bleeech. But that was the last OS Sun got out the door
for the 386i before they closed the line down.
The 386i is prone to accumulate internal dust, due to an engineering
screwup relating to the fans. _Especially_ in my apartment, full of dust
and cat hairs as it is. So every once in a while, I take apart the system
box and "peripheral expansion box" (marketroidish for "Sun official
shoebox") and clean as much crud out as possible. I did this on Friday
of last week, May 30.
When I put things back together, the system failed to boot. After about 10
hours of swapping parts and praying, I got the system to boot with a
_borrowed_ system disk /dev/sd2. I had tar tapes of the divergences of my
own (non-booting) system disk from its newly-installed state (i.e., links
to the non-system disk, customized initialization files, etc.), so I put my
old boot disk at the _end_ of the SCSI chain (rather than at its beginning,
where /dev/sd2 traditionally is in the 386i), and reloaded the OS there
from the original boot tape after rejumpering the old boot disk to /dev/sd0
(so SunOS would not confuse it with the _borrowed_ /dev/sd2 I was booting
from).
On booting from the install tape, it was clear that the SCSI controller on
the motherboard _WAS_ finding the old, rejumpered boot disk. I formatted
it again (finding 8 new defects not in the original defect list), and
successfully reinstalled the 4.0.1 version of the OS (4.0.2 needs a later
set of patches, but 4.0.1 should boot OK).
Several things then became obvious. The system would boot from the
_borrowed_ boot disk, in the traditional location at the start of the SCSI
chain, just downstream of the controller on the motherboard. During boot,
the OS and SCSI controller _did_ see the rejumpered old boot disk, and
correctly reported its label during bootup. Once booted, partitions of
the old disk _could_ be mounted on empty directories of the borrowed boot
disk, which was now at the front of the SCSI chain, with the old boot disk
(reformatted and with 4.0.1 reinstalled) at its end, with the tape drive
in the middle.
Next experiment .. rejumper the old disk again, from /dev/sd0 to /dev/sd2,
put it back in the traditional boot-disk position at the beginning of the
SCSI chain, put a disk jumpered to /dev/sd0 at the end (where the old boot
disk resided while being reformatted), and try to boot.
No soap. Not only did the SCSI controller fail to find /dev/sd2, but the
LEDs on the tape drive downstream from /dev/sd2 failed to light.
Apparently, _everything_ past /dev/sd2 was unavailable to the SCSI
controller. Tried to boot from the boot tape _explicitly_ ("b st(0,4,0)")
.. no soap.
Here's the paradox: The old disk is _perfectly_ accessible to the SCSI
controller (booting aside) at the _end_ of the SCSI chain. Its label, disk
type, and target are correctly identified as /dev/sd0. But rejumpered to
/dev/sd2 and placed at the _beginning_ of the SCSI chain, the SCSI controller
finds neither it nor anything downstream of it. Is that weird or what ??
I know what you're thinking .. "what if this guy jumpered the disk to
/dev/sd2 wrong?". Well, not only was I working from a copy of the original
Sun tech sheet about this specific drive, but I also had dead specimens of
the same exact drive to examine .. all used as boot drives, and all jumpered
_EXACTLY_ as my disk was when the controller failed to find it.
In addition, I _checked_ the fit of the internal SCSI cable _carefully_ each
time I swapped drives. It fit into the 50-pin male connector of the drive's
embedded SCSI adapter _tightly_, both with the working borrowed boot drive
and with my own. I made the same checks with the power supply fitting.
Can some kind soul tell me what the _HELL_ is going on here ?!?!? If the
disk or adapter were simply toast, either the controller would _never_
find it, or format would fail, or both. I've seen this before, on
verifiably dead drives. It seems to me I have a potentially workable boot
disk here, if I could only figure out _WHAT_ has gone wrong.