$ Notice: aha: no controller response reading SCSI adapter dev 1/104
$(ha=0 id=0 lun=0) block=58
$
$ Sdsk: unrecoverable error reading SCSI disk
$ Sdsk: 1 dev 1/104 (ha=0 id=1 lun=0) block 56
$
$3 HDU are installed with this controller (id 0,1 and 3). The "unrecoverable
$reading" error is random on any of these HDU. We checked the jumpers (id=) on
$the HDU and a proper SCSI terminator is in place.
$
$a) We thought the Controller was defective; we did change it.
I'll bet you didn't change the controller about which the complaint
was lodged. You probably changed the controller at ID 7 instead,
by replacing the host adapter.
In SCSI parlance, "controller" doesn't mean "host adapter"; each ID
needs a controller, which is basically a CPU of some sort which
interfaces the device(s) at that SCSI ID to the SCSI bus. Since the
message points to the controller at ID 0, it's complaining about your
hard drive, not your host adapter.
If you want the exact definition of "controller" in the context
of SCSI, and what LUNs and IDs mean, lurking in comp.periphs.scsi
would probably fill you in.
Anyway, my guess is that this means that the host adapter's driver
has detected that it sent a command to the CPU which runs the hard
drive at ID 0, and that it didn't get a response back. It could
be a communication problem, or a bug in the firmware on your hard
drive, or possibly even a bug in the firmware on your 1540CF or
a bug in the ad driver. Since you say that the error moves from
one drive to another, I doubt it's a physical problem with one
particular drive. And since I've used a variety of 154x host
adapters (including the 154xCF) over the years with a number of
drivers for a number of OSes, including several SCO ad drivers,
I doubt it's the host adapter's firmware or the driver, either.
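If you want to confirm that the errors really are spread across
drives rather than clustered on one ID, a quick tally of the console
messages would show it. Here's a minimal sketch in Python, assuming
you've captured the messages as text and that each one carries the
"(ha=N id=N lun=N)" group seen in your post. The name tally_by_id is
my own invention, not part of any SCO tool.

```python
import re
from collections import Counter

# Matches the "(ha=0 id=1 lun=0)" address group seen in the quoted
# messages. Assumption: every error of interest includes this group.
ADDR = re.compile(r"\(ha=(\d+)\s+id=(\d+)\s+lun=(\d+)\)")


def tally_by_id(log_text):
    """Count error messages per (host adapter, SCSI ID) pair."""
    counts = Counter()
    for ha, scsi_id, _lun in ADDR.findall(log_text):
        counts[(int(ha), int(scsi_id))] += 1
    return counts


# The two messages from the original post, used as sample input.
sample = """\
Notice: aha: no controller response reading SCSI adapter dev 1/104
(ha=0 id=0 lun=0) block=58
Sdsk: unrecoverable error reading SCSI disk
Sdsk: 1 dev 1/104 (ha=0 id=1 lun=0) block 56
"""

counts = tally_by_id(sample)
for (ha, scsi_id), n in sorted(counts.items()):
    print(f"ha={ha} id={scsi_id}: {n} error(s)")
```

If one ID dominates the tally over a longer capture, suspect that
drive or its cabling; an even spread points back at something shared,
such as termination or the cable itself.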
I know you already checked termination. Check it again, just
to be sure. Try changing which device is at the end of the bus
(I'm serious; I've seen simply swapping two devices, and altering
termination accordingly, cure seemingly random errors). Power
problems can cause all kinds of odd errors in any part of the
system, though my gut feeling here is that power isn't the most
likely cause of this problem.
Do you know the rules about termination power? No, I don't
either. Try adjusting it anyway, to see if that makes a difference.
I know there's been a treatise on it posted here once or twice
before but it just won't sink into my brain.
--
----------------------------------------------------------------------------
Stephen M. Dunn, CNE, ACE, Sr. Systems Analyst, United System Solutions Inc.
104 Carnforth Road, Toronto, ON, Canada M4A 2K7 (416) 750-7946 x251