Wanted: rcmd host tar x_?_vqf file.tar file1 ... fileN | tar xvf -

Post by Radek Tomis » Thu, 04 Dec 1997 04:00:00



[X-posted to c.u.sco.misc & c.sys.sun.misc, no Followup-To set, I'll
check'em both]

Situation:

SCO v4.2 box as digital voice recorder, archiving VOX-files to Network
Jukebox Controller (basically HP MO Jukebox with 16 slots) running SunOS
4.1.3 sparc (sun4c). NJC is used for general backup/archive for whole
network. Size of each archive tar-file is 300 MB, containing about 2500
VOX-files. So far so good. But when it comes to user requesting old
calls to be replayed, he/she must wait a bit. At the moment, the
restoring process (say, of 100 VOX-files) is done with the following:

rcmd NJC cat archiveN.tar | tar xvqf - ${files_to_restore}

which wastes CPU time and network bandwidth if the files are not stored
at the beginning of the archive ('tar ..q' quits immediately once all
requested files are extracted). On average, the NJC must read 150 MB
from MO or cache and send it to the SCO box over the network, even if
only one file/call is requested. The network and the SCO box are very
busy, so it may even take about half an hour. It isn't really bad (users
seldom complain about it), but it could be much faster.

I'm about to work around it with something like this:

rcmd NJC "(
  taroffs archive.tar ${files_to_extract} | while read offset nbytes
  do
    dd < archive.tar iseek=$((offset/512)) count=$((nbytes/512))
    # OK, If SunOS 4.1.3's 'dd' does not support
    # 'iseek' or smthg like that, I will use my tool:
    # fildes archive.tar,${offset} : dd count=$((nbytes/512))
  done
)" | tar xvf -

Yes, this needs a utility `taroffs`, which would print out offsets and
sizes for the specified files in the given archive. No problem, but:

The question is:

Before I code it (`taroffs`), I'd like to know whether anyone knows of
some tar-like app which has the option missing in the ${subject}:
instead of restoring the files to disk, it should extract them to
standard output as a smaller tar archive. Note that it must also support
the 'q' option mentioned above, and of course it must be able to
*lseek()* (in the archive) over the files which are not of interest, not
*read()* them. Otherwise, such an option would be useless in this
situation.
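For illustration, a `taroffs` along these lines could be sketched as
follows (hypothetical code, not an existing tool; it assumes plain ustar
headers, with the member name at byte 0 and the octal size field at byte
124, and it prints each wanted member's header offset plus the
block-padded length of header+data, which is exactly what the dd loop
above consumes):

```python
BLK = 512  # tar block/header size

def taroffs(path, wanted):
    """Print 'offset nbytes' for each wanted archive member, where offset
    is the byte offset of the member's tar header and nbytes the length of
    header+data rounded up to full 512-byte blocks."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            hdr = f.read(BLK)
            if len(hdr) < BLK or hdr == b"\0" * BLK:
                break  # all-zero header (or EOF): end of archive
            name = hdr[0:100].split(b"\0", 1)[0].decode()
            size = int(hdr[124:136].split(b"\0")[0] or b"0", 8)
            padded = ((size + BLK - 1) // BLK) * BLK
            if name in wanted:
                print(offset, BLK + padded)
            f.seek(padded, 1)  # skip the member's data without reading it
            offset += BLK + padded
```

Each printed pair can then feed the `dd iseek=... count=...` loop, since
the emitted ranges start at the tar header and are multiples of 512.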

I'm looking either for a SunOS 4.1.3 sparc binary or sources which I
would be able to compile on SunOS 4.1.3 (sun4c) or 5.[34] (sun4m/sparc).

Any other solution?
No, NFS isn't part of SCO v4.2 Open Desktop Lite 3.0.0.

Thanks for reading this far and for your time.

--
Radek Tomis

 
 
 

Wanted: rcmd host tar x_?_vqf file.tar file1 ... fileN | tar xvf -

Post by Bela Lubkin » Thu, 04 Dec 1997 04:00:00



> Situation:

> SCO v4.2 box as digital voice recorder, archiving VOX-files to Network
> Jukebox Controller (basically HP MO Jukebox with 16 slots) running SunOS
> 4.1.3 sparc (sun4c). NJC is used for general backup/archive for whole
> network. Size of each archive tar-file is 300 MB, containing about 2500
> VOX-files. So far so good. But when it comes to user requesting old
> calls to be replayed, he/she must wait a bit. At the moment, the
> restoring process (say, of 100 VOX-files) is done with the following:

> rcmd NJC cat archiveN.tar | tar xvqf - ${files_to_restore}

> which wastes CPU time and network bandwidth if the files are not stored
> at the beginning of the archive ('tar ..q' quits immediately once all
> requested files are extracted). On average, the NJC must read 150 MB
> from MO or cache and send it to the SCO box over the network, even if
> only one file/call is requested. The network and the SCO box are very
> busy, so it may even take about half an hour. It isn't really bad (users
> seldom complain about it), but it could be much faster.

The NJC apparently supports a filesystem on the MO cartridges.  Why
don't you use it?  Instead of storing huge tarchives with 2500 files in
them, store individual files.  Objection: that will cost a lot more
transactions across the net, possibly more wasted space on the MO
cartridges (if its block size is larger than tar's 512 bytes).  Ok, so
keep doing what you're doing, but use smaller archives.  Store 250 VOX
files in a 30MB archive.  The added network overhead and wasted space
won't be noticeable.  Extracting an old file will average 10x faster.

> I'm about to work around it with something like this:

> rcmd NJC "(
>   taroffs archive.tar ${files_to_extract} | while read offset nbytes
>   do
>     dd < archive.tar iseek=$((offset/512)) count=$((nbytes/512))
>     # OK, If SunOS 4.1.3's 'dd' does not support
>     # 'iseek' or smthg like that, I will use my tool:
>     # fildes archive.tar,${offset} : dd count=$((nbytes/512))
>   done
> )" | tar xvf -

This sequence shows that you think you can run reasonably sophisticated
Unix commands on the NJC.  Why don't you just extract the
files_to_extract *there*, then repackage them and ship only the desired
files across the net?  This will only work if you have some local
storage on the NJC (presumably you don't want to store the temporary
extracted files on the MO cartridges).  If necessary, add a local hard
disk ("small" 1GB drive ;-)

SCO's tar(C) has an "n" flag that tells it to seek from one tar header
to the next, instead of reading the intervening data.  The input must be
seekable, of course (won't work in a pipeline `rcmd NJC cat tarfile |
tar xfn - files`).  Check whether the SunOS 4.1.3 `tar` has a flag to
the same effect.  Or you could add it to GNU tar, if it doesn't already
do it automatically.

> Yes, this needs a utility `taroffs`, which would print out offsets and sizes
> for the specified files in the given archive. No problem, but:

> The question is:

> before I will code it (`taroffs`), I'd like to know, whether anyone knows
> of some tar-like app, which has the option, which is missing in the
> ${subject}. Instead of restoring the files to disk, it should extract the
> files to std. output as a smaller tar-archive. Note that it must support
> also the 'q' option mentioned above and of course it must be able to
> *lseek()* (in the archive) over files (not *read()* them), which are not
> of interest. Otherwise, such option would be useless in this situation.

tar format is quite simple; an extract-as-tar program would be easy to
cobble together, if the existing tools don't already do it.  The
extract-as-tar program will be just as easy as the `taroffs` you
propose, and more suitable to the task.

Either write your own, or start with GNU tar or `pax` (etc., there are
many tar programs out there).  tar format is documented in tar(F) on
your ODT Lite box.  I think it is sufficient to do:

  open inputfile
  repeat
    read a tar header from inputfile
    if it's one of the files we're looking for,
      write the header to stdout
      loop reading blocks from inputfile, writing to stdout,
      for the size given in the header
    else
      lseek inputfile past the data of this unwanted file
  until find an all-0s tar header (end-of-tarchive)
  write that to stdout
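That pseudocode translates almost line-for-line into a small program.
A hypothetical sketch, not a tested tool, assuming plain ustar headers
(name at byte 0, octal size at byte 124) as documented in tar(F):

```python
import os

BLK = 512  # tar block/header size

def extract_as_tar(inpath, wanted, out):
    """Write a smaller tar archive to `out` containing only the members
    named in `wanted`, lseek()ing over everything else (never reading it)."""
    fd = os.open(inpath, os.O_RDONLY)
    try:
        while True:
            hdr = os.read(fd, BLK)
            if len(hdr) < BLK or hdr == b"\0" * BLK:
                break  # all-zero header (or EOF): end of archive
            name = hdr[0:100].split(b"\0", 1)[0].decode()
            size = int(hdr[124:136].split(b"\0")[0] or b"0", 8)
            nblocks = (size + BLK - 1) // BLK  # data is padded to full blocks
            if name in wanted:
                out.write(hdr)  # copy the header verbatim, checksum intact
                for _ in range(nblocks):
                    out.write(os.read(fd, BLK))
            else:
                os.lseek(fd, nblocks * BLK, os.SEEK_CUR)  # seek, don't read
        out.write(b"\0" * (2 * BLK))  # end-of-archive marker
    finally:
        os.close(fd)
```

The output is itself a valid tar archive, so the receiving end can keep
using plain `tar xvf -`.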

>Bela<

--
Sandy and Bela Lubkin are traveling around the world for a year!  Now posting
from Watford UK; touring UK next.        +Please do not Cc: me on news posts!
Our stories and pictures are at: http://www.armory.com/~alexia/trip/trip.html

 
 
 

Wanted: rcmd host tar x_?_vqf file.tar file1 ... fileN | tar xvf -

Post by Radek Tomis » Fri, 05 Dec 1997 04:00:00




> > Situation:

> > SCO v4.2 box as digital voice recorder, archiving VOX-files to Network
> > Jukebox Controller (basically HP MO Jukebox with 16 slots) running
> > SunOS 4.1.3 sparc (sun4c). NJC is used for general backup/archive for
> > whole network. Size of each archive tar-file is 300 MB, containing
> > about 2500 VOX-files. So far so good. But when it comes to user
> > requesting old calls to be replayed, he/she must wait a bit. At the
> > moment, the restoring process (say, of 100 VOX-files) is done with the
> > following:

> > rcmd NJC cat archiveN.tar | tar xvqf - ${files_to_restore}

> > which wastes CPU time and network bandwidth if the files are not
> > stored at the beginning of the archive ('tar ..q' quits immediately
> > once all requested files are extracted). On average, the NJC must read
> > 150 MB from MO or cache and send it to the SCO box over the network,
> > even if only one file/call is requested. The network and the SCO box
> > are very busy, so it may even take about half an hour. It isn't really
> > bad (users seldom complain about it), but it could be much faster.

> The NJC apparently supports a filesystem on the MO cartridges.  Why
> don't you use it?  Instead of storing huge tarchives with 2500 files in
> them, store individual files.  Objection: that will cost a lot more

Right, there are virtual FSes made up of MO cartridges for different
archives ("/vox", "/informix", etc.). At the beginning, we were storing
individual files, until we brought the NJC software to its knees. It
couldn't really handle so many files within a virtual FS (not within one
directory, as files are hierarchically structured into subdirectories:
VOX-ID=1234567 => VOX-FNAME="1/234/567"). You know, all that index
stuff. It was too slow (access, reindexing in case of failure, ...) and
if memory serves me well, we even had problems getting some files off
the NJC. We suspected the NJC SW wasn't well-written and tried to
complain to the reseller, but to no avail. These days, it would have to
handle 750,000 files, as we've reached three hundred 300 MB archives.

> transactions across the net, possibly more wasted space on the MO
> cartridges (if its block size is larger than tar's 512 bytes).  Ok, so
> keep doing what you're doing, but use smaller archives.  Store 250 VOX
> files in a 30MB archive.  The added network overhead and wasted space
> won't be noticeable.  Extracting an old file will average 10x faster.

Right, that would be the easiest solution. In addition, the archive size
(now 300 MB) is a configurable item, so it would only be a matter of a
cfg-file change. But all those 300 existing archives would still sit
here, expiring sometime in the future, of course. Above all, 30 MB
archives would probably bring the too-many-files problem with the NJC
back in the future:

5 (years of archive lifetime) * 365 (days) * 800 MB (avg VOX-data/day)
= 1,460,000 MB

Ending up with about 48,700 files at 30 MB each. I doubt (after the
experiences mentioned above) this NJC would handle such a pile.

:-O
Can you imagine 548x 2.6 GB MO cartridges?
I cannot, but who cares..

> > I'm about to work around it with something like this:

> > rcmd NJC "(
> >   taroffs archive.tar ${files_to_extract} | while read offset nbytes
> >   do
> >     dd < archive.tar iseek=$((offset/512)) count=$((nbytes/512))
> >     # OK, If SunOS 4.1.3's 'dd' does not support
> >     # 'iseek' or smthg like that, I will use my tool:
> >     # fildes archive.tar,${offset} : dd count=$((nbytes/512))
> >   done
> > )" | tar xvf -

> This sequence shows that you think you can run reasonably sophisticated
> Unix commands on the NJC.  Why don't you just extract the
> files_to_extract *there*, then repackage them and ship only the desired
> files across the net?  This will only work if you have some local
> storage on the NJC (presumably you don't want to store the temporary
> extracted files on the MO cartridges).  If necessary, add a local hard
> disk ("small" 1GB drive ;-)

There is about 1 GB of HDD space for the NJC read/write cache. I think I
would be able to use some part of it. However, in case the user doesn't
know the exact date/time, he/she may want to restore several hours,
yielding, say, 100 MB. If there were less room on the NJC than on the
SCO box, I would have to split the process into several smaller chunks
(extract 10 MB, send, remove, extract ... and so on). That wouldn't be
straightforward. Furthermore, compared to an extract-as-tar feature, it
would additionally have to write 100 MB to the NJC's HDD (tar xv archive
files) and then read those 100 MB back from the NJC's HDD to send them
over the net (tar cvf - files | ..), yielding an additional 200 MB of
I/O :(

Maybe we could add another 1 GB HDD to smooth the process, but a SW
solution is much cheaper than HW here ;(

> SCO's tar(C) has an "n" flag that tells it to seek from one tar header
> to the next, instead of reading the intervening data.  The input must be

I thought that SCO's tar did it automatically on plain files, like on
floppy diskettes (tar tv6, /etc/default/tar: tape column = n), but
actually, it does not :-I

> seekable, of course (won't work in a pipeline `rcmd NJC cat tarfile |
> tar xfn - files`).  Check whether the SunOS 4.1.3 `tar` has a flag to
> the same effect.  Or you could add it to GNU tar, if it doesn't already
> do it automatically.

> > Yes, this needs a utility `taroffs`, which would print out offsets
> > and sizes for the specified files in the given archive. No problem,
> > but:

> > The question is:

> > Before I code it (`taroffs`), I'd like to know whether anyone knows
> > of some tar-like app which has the option missing in the ${subject}:
> > instead of restoring the files to disk, it should extract them to
> > standard output as a smaller tar archive. Note that it must also
> > support the 'q' option mentioned above, and of course it must be able
> > to *lseek()* (in the archive) over the files which are not of
> > interest, not *read()* them. Otherwise, such an option would be
> > useless in this situation.

> tar format is quite simple; an extract-as-tar program would be easy to
> cobble together, if the existing tools don't already do it.  The
> extract-as-tar program will be just as easy as the `taroffs` you
> propose, and more suitable to the task.

> Either write your own, or start with GNU tar or `pax` (etc., there are
> many tar programs out there).  tar format is documented in tar(F) on
> your ODT Lite box.  I think it is sufficient to do:

>   open inputfile
>   repeat
>     read a tar header from inputfile
>     if it's one of the files we're looking for,
>       write the header to stdout
>       loop reading blocks from inputfile, writing to stdout,
>       for the size given in the header
>     else
>       lseek inputfile past the data of this unwanted file
>   until find an all-0s tar header (end-of-tarchive)
>   write that to stdout

Exactly what I've been thinking about. I will merge both the
extract-as-tar and the `taroffs` functionality into one utility. It may
be useful in other situations too. Or maybe I will look at GNU tar and
incorporate these features into it, if they're not already there.

> >Bela<

Thanks Bela for your reply and suggestions.

--
Radek Tomis

 
 
 

Wanted: rcmd host tar x_?_vqf file.tar file1 ... fileN | tar xvf -

Post by Bela Lubkin » Mon, 08 Dec 1997 04:00:00



> X-Newsreader: Microsoft Internet News 4.70.1162

I can tell by the way your text is mis-wrapped.  Can you slap it around
a bit?

> > SCO's tar(C) has an "n" flag that tells it to seek from one tar
> > header to the next, instead of reading the intervening data.  The
> > input must be

> I thought that SCO's tar did it automatically on plain files, like on
> floppy diskettes (tar tv6, /etc/default/tar: tape column = n), but
> actually, it does not :-I

It does seeks (1) if the "n" flag is given, and (2) if you are using one
of the numeric archives from /etc/default/tar, and its entry has an "n"
in the tape column.  You would expect (3) that if stat(S) claims it's a
normal file (or block device), it would also use seeks.  I don't know
why it doesn't.  Perhaps because the code that makes the decision was
written in 1982 and 1984...

>Bela<

--
Sandy and Bela Lubkin are traveling around the world for a year!  Now posting
from Leicester UK; Edinburgh next.       +Please do not Cc: me on news posts!
Our stories and pictures are at: http://www.armory.com/~alexia/trip/trip.html
 
 
 
