SCSI on Sol8 - read: "not enough space"

Post by BJ Chippindal » Fri, 04 Jul 2003 00:13:20



I am writing data to a tape using a real-time OS.  Not
Solaris.  I am attempting to read it back in Solaris.
There's more than one issue involved and I am pursuing
the writing side closely as there must be some difference
THERE to cause this, but there is also this difference
between the Solaris and Windows implementations, and
it is not well understood.

Details are bizarre so follow closely.

1.  The application is writing scan data to a tape in
     real-time at 4.2 MB/Sec.  The tape is a
     DLT.  Each scan is about 338 KBytes in size.

2.  Because the tape speeds up and slows down and
     occasionally reverses direction the application
     must queue scans and issue write commands for
     the queue contents at irregular intervals.

3.  There are 2 versions of this code, one built 2
     years ago and another more recent.  Both appear
     to function identically, call them '02 and '03.

4.  The resulting tapes are ( or appear to be so far )
     identical when read under Windows.  When read under
     Solaris the Solaris driver appears to break down at
     the scans where one fwrite ends and the next begins.
     This affects dd and the custom scan reader, but NOT
     the mt utility.  mt -f /dev/xxx fsf 1 shows exactly
     the files and EOF marks we expect to see.  The scan
     sizes are 5.5 x the Solaris 61440 transfer block
     size.

5.  The scan breakdown appears to be related to the ibs
     in the Solaris implementation, which defaults to the
     61440 maximum.  Data appear normal up to the failure.

6.  This failure only occurs in the '03 code, not the '02
     code.  There are no errors in writing.  Both versions
     are readable under Windows, the '03 code has this
     trouble in Solaris.
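
For reference, the "5.5 x" figure in item 4 can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 338652-byte scan size is the exact struct size quoted later in the thread, and 61440 is the Solaris transfer block size from item 4. Whether the writer actually splits a scan into 61440-byte records is not stated anywhere in the thread.

```python
SCAN_BYTES = 338652   # size of one scan struct, as quoted later in the thread
BLOCK_BYTES = 61440   # Solaris transfer block size from item 4

# How many whole blocks fit in one scan, and how many bytes are left over
full_blocks, remainder = divmod(SCAN_BYTES, BLOCK_BYTES)
ratio = SCAN_BYTES / BLOCK_BYTES

print(f"{full_blocks} full blocks + {remainder} bytes per scan ({ratio:.2f} blocks)")
# prints "5 full blocks + 31452 bytes per scan (5.51 blocks)"
```

So every scan would end with a partial block of 31452 bytes if it were written in 61440-byte pieces, which fits the "last partial block" suspicion above.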

I am currently going through the drivers and some possible
alignment differences in the code, but thought it useful
to ask this board if there's anyone with any idea what is
going on here that can add to what I know so far.  I
suspect that there is something funny happening at the end
of the last partial block written and the beginning of
the first block in the next write, but I haven't got ANY
way to understand that at present.

Peculiarities related to read and alignment are of
special interest.

... and YES I know I should be using a disk.  I have the
disk.  I have not received Management support to fix this
problem though it has cost (wasted) almost 2 man-years to

it has given ).

It is used on AVIRIS

www.aviris.jpl.nasa.gov

respectfully
BJ

 
 
 

Post by Anthony Mandi » Fri, 04 Jul 2003 15:16:55



> I am currently going through the drivers and some possible
> alignment differences in the code, but thought it useful
> to ask this board if there's anyone with any idea what is
> going on here that can add to what I know so far.  I
> suspect that there is something funny happening at the end
> of the last partial block written and the beginning of
> the first block in the next write, but I haven't got ANY
> way to understand that at present.

> Peculiarities related to read and alignment are of
> special interest.

        I would suggest using tcopy to read both the '02 and '03
        flavours of the tapes and see if it reports any differences
        in the number of records or record sizes. If it gets read
        errors, then it's a physical tape issue.

-am     © 2003

 
 
 

Post by Stefaan A Eeckel » Fri, 04 Jul 2003 05:40:23


On Wed, 02 Jul 2003 08:13:20 -0700


>      When read under
>      Solaris the Solaris driver appears to break down at
>      the scans where one fwrite ends and the next begins.

What exactly do you mean by "break down"? Please
give the exact symptoms and the possible error codes
you're getting. Code fragments of the writer and
reader processes would also be welcome, as there is
obviously a difference in the output of '02 and '03, even
though the tapes seem identical when read on a Windows
machine.

I wouldn't think the "not enough space" is EX_NOSPACE from
userdefs.h...

--
Stefaan
--
"What is stated clearly conceives easily."  -- Inspired sales droid

 
 
 

Post by Casper H.S. Di » Fri, 04 Jul 2003 16:57:11



>4.  The resulting tapes are ( or appear to be so far )
>     identical when read under Windows.  When read under
>     Solaris the Solaris driver appears to break down at
>     the scans where one fwrite ends and the next begins.
>     This affects dd and the custom scan reader, but NOT
>     the mt utility.  mt -f /dev/xxx fsf 1 shows exactly
>     the files and EOF marks we expect to see.  The scan
>     sizes are 5.5 x the Solaris 61440 transfer block
>     size.

If tape blocks are read using a buffer that's too small, Solaris
will fail to read the block and will return an error code.

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

Post by BJ Chippindal » Sat, 05 Jul 2003 00:35:03


Anthony

Thanks... I had been working with the RTOS for so long I
forgot that Solaris had tcopy.  It isn't a complete solution
but it reads across the error points.  It HESITATES there, but
it reads, throws no errors, and claims to have found the
correctly sized 61440 block at the point where dd and my
program fail.

Since the tapes go to 35 GB, which would end up in a single
file, I don't think I can use this as a permanent solution,
but it indicates that the tape is ok.

respectfully
BJ



>>I am currently going through the drivers and some possible
>>alignment differences in the code, but thought it useful
>>to ask this board if there's anyone with any idea what is
>>going on here that can add to what I know so far.  I
>>suspect that there is something funny happening at the end
>>of the last partial block written and the beginning of
>>the first block in the next write, but I haven't got ANY
>>way to understand that at present.

>>Peculiarities related to read and alignment are of
>>special interest.

>    I would suggest using tcopy to read both the '02 and '03
>    flavours of the tapes and see if it reports any differences
>    in the number of records or record sizes. If it gets read
>    errors, then it's a physical tape issue.

> -am        © 2003

 
 
 

Post by BJ Chippindal » Sat, 05 Jul 2003 00:50:19


Stefaan

Using dd on the tape as follows:

dd ibs=338652 obs=512 if=/dev/nrAvTape1 of=test1

results in 1608 blocks read in followed by
the message
"read: not enough space"

I just tried tcopy and it appears to have read across
the 1608 boundary.

338652 is the size of a data scan... a complete c struct.

The writing takes place on the RTOS, the code is not, as
far as I can detect, changed.  It is running from a
hard disk with different access conditions, but runs
entirely from memory so should not be an issue.  I KNOW
it has to be different, but I have not been able to detect
the difference yet.  It appears to be related to the OS
patchlevels internally and may be a bug of theirs.
SCSI tape support in any RTOS is pretty thin
and this tape drive is peculiarly ill-suited to
its task.  We are actually  "breaking" it in
terms of the original Quantum design, to
get it to work at all.  Delicate.

diff tells me that the app code for reading and writing
is the same.  I have tried mods on the reader ( it uses
the raw tape ), but haven't come up with one that works.

respectfully
BJ


> On Wed, 02 Jul 2003 08:13:20 -0700

>>     When read under
>>     Solaris the Solaris driver appears to break down at
>>     the scans where one fwrite ends and the next begins.

> What exactly do you mean by "break down"? Please
> give the exact symptoms and the possible error codes
> you're getting. Code fragments of the writer and
> reader processes would also be welcome, as there is
> obviously a difference in the output of '02 and '03, even
> though the tapes seem identical when read on a Windows
> machine.

> I wouldn't think the "not enough space" is EX_NOSPACE from
> userdefs.h...

 
 
 

Post by Stefaan A Eeckel » Sat, 05 Jul 2003 07:57:47


On Thu, 03 Jul 2003 08:50:19 -0700


> Using dd on the tape as follows:

> dd ibs=338652 obs=512 if=/dev/nrAvTape1 of=test1

> results in 1608 blocks read in followed by
> the message
> "read: not enough space"

You mean you read 1608 * 338652 bytes and then it comes
up with this message? As Casper says, the st(7D) driver
will return an error when the block size on the tape is
larger than the input buffer (from man st):

ENOMEM
      This indicates that the record size on the tape  drive
      is more than the requested size during read operation.
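
To illustrate the behaviour in the man page excerpt, here is a minimal Python sketch of a reader that recognises this case. It is not the poster's reader (that code is not shown in the thread). `read_record` is a hypothetical helper, and its `read` parameter exists only so the function can be exercised without a tape drive. On Solaris the message text for ENOMEM is "Not enough space", which would match what dd printed.

```python
import errno
import os

def read_record(fd, bufsize, read=os.read):
    """Read one tape record into a buffer of bufsize bytes.

    Returns (data, None) on success, or (None, reason) when the driver
    reports ENOMEM, i.e. the record on tape is larger than the buffer.
    Any other error is re-raised unchanged.
    """
    try:
        return read(fd, bufsize), None
    except OSError as exc:
        if exc.errno == errno.ENOMEM:
            return None, f"record on tape is larger than the {bufsize}-byte buffer"
        raise
```

A caller would open the no-rewind device once with `os.open(path, os.O_RDONLY)` and call `read_record` in a loop, so that the failing record number is known exactly.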

> I just tried tcopy and it appears to have read across
> the 1608 boundary.

> 338652 is the size of a data scan... a complete c struct.

Now the fact that you read 1608 blocks of 338652 bytes
correctly, and then encounter a block that's larger
indicates you might have a bug in the writer program,
or in the tape driver of the RTOS system. You mention
below that it seems to be related to the RTOS patchlevels,
so a driver bug seems entirely plausible.

> The writing takes place on the RTOS, the code is not, as
> far as I can detect, changed.  It is running from a
> hard disk with different access conditions, but runs
> entirely from memory so should not be an issue.  I KNOW
> it has to be different, but I have not been able to detect
> the difference yet.  It appears to be related to the OS
> patchlevels internally and may be a bug of theirs,
> SCSI tape support in any RTOS is pretty thin
> and this tape drive is peculiarly ill-suited to
> its task.  We are actually  "breaking" it in
> terms of the original Quantum design, to
> get it to work at all.  Delicate.

I remember having a discussion with my management a long
time ago about the use of a tape to ensure the integrity
of data written to disk. They firmly believed tape was
more reliable than disk, so the system was designed not to
proceed until a block of transactions (it was a wagering
system) was written to tape, even though it was written
to two disk packs. Yuck.

> diff tells me that the app code for reading and writing
> is the same.  I have tried mods on the reader ( it uses
> the raw tape ), but haven't come up with one that works.

Try to read in slightly larger block sizes; the driver
should come back with the actual bytes read if the block
on tape is smaller than the read size. This might not
be possible with dd, but a custom read program might
show you that block 1609 is one or a few bytes larger.
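
A custom read program along these lines might look like the sketch below. It assumes the drive is in variable-block mode, so that each read() returns exactly one tape record and a zero-length read marks a filemark. The function names, the buffer size, and the device path in the usage comment are illustrative only, and the driver may cap the largest transfer it accepts.

```python
import os

def survey_record_sizes(fd, bufsize, read=os.read):
    """Return the length of every record up to the next filemark.

    bufsize should be comfortably larger than the expected record size,
    so that a slightly oversized record is reported, not rejected.
    """
    sizes = []
    while True:
        data = read(fd, bufsize)
        if not data:          # zero-length read: filemark (or end of data)
            break
        sizes.append(len(data))
    return sizes

def odd_records(sizes, expected):
    """List (record_number, size) for records whose size is not `expected`."""
    return [(n, size) for n, size in enumerate(sizes, 1) if size != expected]

# Hypothetical usage against the device named in the thread:
#   fd = os.open("/dev/nrAvTape1", os.O_RDONLY)
#   sizes = survey_record_sizes(fd, 512 * 1024)
#   print(odd_records(sizes, 338652))
```

If record 1609 really is a few bytes larger than the rest, it should show up in the `odd_records` output with its actual length.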

Take care,

--
Stefaan
--
"What is stated clearly conceives easily."  -- Inspired sales droid

 
 
 

Post by Anthony Mandi » Sat, 05 Jul 2003 16:28:05



> Thanks... I had been working with the RTOS for so long I
> forgot that Solaris had tcopy.  It isn't a complete solution
> but it reads across the error points.  It HESITATES there, but
> it reads, throws no errors, and claims to have found the
> correctly sized 61440 block at the point where dd and my
> program fail.

        Does it only hesitate on the '03 tape? It sounds like
        it's getting a soft error that's retryable. Something you
        mentioned in your original post about it being DLT and
        the way it did the writing reminded me of something.
        DLT works best when the data is streaming to it, otherwise
        it has to rewind and reseek to find its position. From
        your description, it's doing this. I don't know why this
        would cause the problem though but I would look at the
        program that writes the '03 tapes to see if it's writing
        the data fast enough to ensure that it's streaming.

-am     © 2003

 
 
 

Post by Stefaan A Eeckel » Sat, 05 Jul 2003 22:05:51


On Fri, 04 Jul 2003 17:28:05 +1000


>       I don't know why this
>    would cause the problem though but I would look at the
>    program that writes the '03 tapes to see if it's writing
>    the data fast enough to ensure that it's streaming.

Correct me if I'm wrong, but why would a problem
with the tape streaming on writing cause a problem on
reading? Not writing at optimal speed isn't an
error, so the tape should be fine, and read without
problems.

--
Stefaan
--
"What is stated clearly conceives easily."  -- Inspired sales droid

 
 
 

Post by Anthony Mandi » Sun, 06 Jul 2003 13:52:10



> Correct me if I'm wrong, but why would a problem
> with the tape streaming on writing cause a problem on
> reading? Not writing at optimal speed isn't an
> error, so the tape should be fine, and read without
> problems.

        Yes, I know. Ordinarily, it should not be an issue and
        a read shouldn't notice anything different. However, from
        his descriptions of the read "hesitating", it sounds like
        there might be a slight alignment error caused during the
        write. I think trying a different tape drive
        might be an idea at this point. If anything, it would rule
        this out as an issue.

-am     © 2003

 
 
 

Post by BJ Chippindal » Wed, 09 Jul 2003 07:17:32


Not exactly.  It reports 1608 x 61440.  It works out to about 290
of the full structs.  Since this corresponds ( in terms of timing )
with the point at which the tape finally manages to empty its buffer
and stop, for a split second, before it restarts I am pretty
sure you are right.
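
As a sanity check on those numbers, here is the arithmetic as a short Python sketch. It assumes "4.2 MB/sec" means 4.2 * 10**6 bytes per second, which the thread does not spell out.

```python
BLOCKS_READ = 1608
BLOCK_BYTES = 61440
SCAN_BYTES = 338652
RATE_BYTES_PER_SEC = 4.2e6   # assumption: MB taken as 10**6 bytes

total = BLOCKS_READ * BLOCK_BYTES            # bytes read before the failure
scans, leftover = divmod(total, SCAN_BYTES)  # whole scans plus a partial one
seconds = total / RATE_BYTES_PER_SEC         # recording time at the fixed rate

print(total, scans, leftover, round(seconds, 1))
# prints "98795520 291 247788 23.5"
```

That is 291 complete scans plus part of another, consistent with "about 290", and roughly 23.5 seconds of recording before the point where the read fails.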

The bad part is that while tcopy claimed to have worked, it
actually returned a loopback of some sort which contained a
repeated run of ( about 50-60... I didn't count them, I was
too annoyed ) 338652-byte structs.  I can tell because the contents
of the struct contain millisecond-accurate time... and it
steps back after this period.  tcopy doesn't identify the
actual filemarks on the tape as it's copying; they are lost,
and while it appears to be working it really is not.  There is
no error thrown.  Ugghh...

I am going back to the RTOS side.  It is slightly more
accessible to me and I might actually manage to "fix" it.  If
I can figure out how it can create this situation in the
first instance.  I will try the "slightly larger" trick to
get some idea perhaps, of the problem detail.   I haven't had
much success with this on the Solaris side at all.

Thanks for all your help.

respectfully
BJ


> On Thu, 03 Jul 2003 08:50:19 -0700

>>Using dd on the tape as follows:

>>dd ibs=338652 obs=512 if=/dev/nrAvTape1 of=test1

>>results in 1608 blocks read in followed by
>>the message
>>"read: not enough space"

> You mean you read 1608 * 338652 bytes and then it comes
> up with this message? As Casper says, the st(7D) driver
> will return an error when the block size on the tape is
> larger than the input buffer (from man st):

> ENOMEM
>       This indicates that the record size on the tape  drive
>       is more than the requested size during read operation.

>>I just tried tcopy and it appears to have read across
>>the 1608 boundary.

>>338652 is the size of a data scan... a complete c struct.

> Now the fact that you read 1608 blocks of 338652 bytes
> correctly, and then encounter a block that's larger
> indicates you might have a bug in the writer program,
> or in the tape driver of the RTOS system. You mention
> below that it seems to be related to the RTOS patchlevels,
> so a driver bug seems entirely plausible.

>>The writing takes place on the RTOS, the code is not, as
>>far as I can detect, changed.  It is running from a
>>hard disk with different access conditions, but runs
>>entirely from memory so should not be an issue.  I KNOW
>>it has to be different, but I have not been able to detect
>>the difference yet.  It appears to be related to the OS
>>patchlevels internally and may be a bug of theirs,
>>SCSI tape support in any RTOS is pretty thin
>>and this tape drive is peculiarly ill-suited to
>>its task.  We are actually  "breaking" it in
>>terms of the original Quantum design, to
>>get it to work at all.  Delicate.

> I remember having a discussion with my management a long
> time ago about the use of a tape to ensure the integrity
> of data written to disk. They firmly believed tape was
> more reliable than disk, so the system was designed not to
> proceed until a block of transactions (it was a wagering
> system) was written to tape, even though it was written
> to two disk packs. Yuck.

>>diff tells me that the app code for reading and writing
>>is the same.  I have tried mods on the reader ( it uses
>>the raw tape ), but haven't come up with one that works.

> Try to read in slightly larger block sizes; the driver
> should come back with the actual bytes read if the block
> on tape is smaller than the read size. This might not
> be possible with dd, but a custom read program might
> show you that block 1609 is one or a few bytes larger.

> Take care,

 
 
 

Post by BJ Chippindal » Wed, 09 Jul 2003 07:25:23


I have never tried tcopy on an '02 tape.  Not sure what it
might do as the regular tape reader works.  Good idea to
give it a go.

The data is a fixed rate 4.2 MB/sec and the tape streams
pretty well.  It doesn't rewind/reseek, but it DOES stop
momentarily as its raw speed of roughly 5 MB/Sec overcomes
the 4.2 coming in.  The stop causes the data to go to buffer
for a few moments as it restarts ( normally ) and causes no
problems.  This is the same behaviour in '03 and '02.  There
is no visible or perceptible difference in that writing
behaviour.  Whatever is wrong is pretty subtle.

The tcopy didn't actually work either.  It only claimed to
work, copied the tape data to the EOD marks and quit... except
that it didn't really.  It lost all the intervening filemarks
and on examination of the internal structure of the data I
discovered that the time started jumping backwards and the
data repeated over and over to get the right size.  This
was pretty bizarre.  I will have to try it again with some
additional instrumentation to try to determine if there's
something I can do about it.
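
One simple form of that instrumentation is to flag every scan whose timestamp is earlier than the one before it. The sketch below assumes the millisecond timestamps have already been extracted from each 338652-byte struct into a list; the struct layout is not given in the thread, so the extraction step is left out and `find_time_regressions` is a hypothetical helper.

```python
def find_time_regressions(timestamps_ms):
    """Return the indices where the time steps backwards.

    An index i is reported when timestamps_ms[i] is smaller than
    timestamps_ms[i - 1], i.e. scan i starts a repeated stretch of data.
    """
    return [
        i
        for i in range(1, len(timestamps_ms))
        if timestamps_ms[i] < timestamps_ms[i - 1]
    ]

# Made-up values: the time runs forward, jumps back once, then runs forward again
print(find_time_regressions([0, 80, 160, 80, 160, 240]))   # prints "[3]"
```

Multiplying each reported index by the struct size gives the byte offset in the copied file where the repeat begins, which can then be compared with the 1608-block boundary.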

Thanks
BJ



>>Thanks... I had been working with the RTOS for so long I
>>forgot that Solaris had tcopy.  It isn't a complete solution
>>but it reads across the error points.  It HESITATES there, but
>>it reads, throws no errors, and claims to have found the
>>correctly sized 61440 block at the point where dd and my
>>program fail.

>    Does it only hesitate on the '03 tape? It sounds like
>    it's getting a soft error that's retryable. Something you
>    mentioned in your original post about it being DLT and
>    the way it did the writing reminded me of something.
>    DLT works best when the data is streaming to it, otherwise
>    it has to rewind and reseek to find its position. From
>    your description, it's doing this. I don't know why this
>    would cause the problem though but I would look at the
>    program that writes the '03 tapes to see if it's writing
>    the data fast enough to ensure that it's streaming.

> -am        © 2003

 
 
 

Post by Anthony Mandi » Wed, 09 Jul 2003 18:52:33



> I have never tried tcopy on an '02 tape.  Not sure what it
> might do as the regular tape reader works.  Good idea to
> give it a go.

        I was going to suggest using it as a comparison against an
        '03 tape (since they should be identical) but, in light of
        your other comments, looks as though the exercise would be
        pointless.

> The data is a fixed rate 4.2 MB/sec and the tape streams
> pretty well.  It doesn't rewind/reseek, but it DOES stop
> momentarily as its raw speed of roughly 5 MB/Sec overcomes
> the 4.2 coming in.  The stop causes the data to go to buffer
> for a few moments as it restarts ( normally ) and causes no
> problems.  This is the same behaviour in '03 and '02.  There
> is no visible or perceptible difference in that writing
> behaviour.  Whatever is wrong is pretty subtle.

        Yes, I was under the impression that DLTs had to reseek if
        the data wasn't streaming in quickly enough to keep up.

> The tcopy didn't actually work either.  It only claimed to
> work, copied the tape data to the EOD marks and quit... except
> that it didn't really.  It lost all the intervening filemarks
> and on examination of the internal structure of the data I
> discovered that the time started jumping backwards and the
> data repeated over and over to get the right size.  This
> was pretty bizarre.  I will have to try it again with some
> additional instrumentation to try to determine if there's
> something I can do about it.

        This would suggest that it's a hardware related problem.
        Somehow or other, the writes for the '03 tapes are causing
        the tape drive to mess up an interrecord gap. This would
        affect both tcopy and dd and may explain why mt isn't affected.
        It's just looking for an EOF mark when you use fsf. But, if you
        use fsr to advance thru records it would look at the gap and
        may experience the same problem. Can you try that?

        One other question, are you using a different tape drive to
        read the tapes on Solaris? I expect you are and the one you
        are using to read with may be more "sensitive" to the size
        of the gap.

-am     © 2003

 
 
 
