reading files line-by-line?


Post by Albert » Tue, 25 Mar 2003 06:38:10



Is there a unix utility that will return a file line-by-line? I'm trying
to do a loop like:

for somevariable in `cat somefile`
do
    #do something with $somevariable
done

but that loop would iterate once for each white-space-delimited token in
"somefile". I want it to iterate once for each line of somefile. What
could I replace the cat utility with?

Thanks!
A.J.
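
A minimal sketch of the behaviour described in the question, assuming a POSIX shell. The sample file contents and the names for_count and line_count are made up for illustration. It counts how many times the for loop runs and compares that with the number of lines in the file.

```shell
#!/bin/sh
# Hypothetical sample file: two lines, five whitespace-separated words.
somefile=$(mktemp)
printf 'alpha beta\ngamma delta epsilon\n' > "$somefile"

# The loop from the question: the unquoted command substitution is split
# on whitespace, so the body runs once per word, not once per line.
for_count=0
for somevariable in `cat "$somefile"`
do
    for_count=$((for_count + 1))
done

# Reference value: the number of lines in the file.
line_count=$(wc -l < "$somefile")
line_count=$((line_count))   # drop any padding wc may print

echo "for-loop iterations: $for_count"
echo "lines in file:       $line_count"
rm -f "$somefile"
```

With this sample file the loop runs five times even though the file has only two lines.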

 
 
 


Post by Lurch » Tue, 25 Mar 2003 06:47:02



> Is there a unix utility that will return a file line-by-line? I'm trying
> to do a loop like:

> for somevariable in `cat somefile`
> do
>     #do something with $somevariable
> done

while read LINE; do
        echo "--${LINE}--"
done < file

or:

cat file | while read LINE; do
        echo "--${LINE}--"
done

HTH,
Lurch_          
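
A slightly more defensive variant of the loop above, as a sketch rather than a definitive form. It assumes a POSIX shell; the sample file contents are made up. IFS= keeps leading and trailing blanks, -r stops read from interpreting backslashes, and printf avoids echo treating a line such as "-n" as an option.

```shell
#!/bin/sh
# Hypothetical input with the awkward cases: leading blanks, a backslash,
# and a line that looks like an echo option.
file=$(mktemp)
printf '  indented line\nback\\slash\n-n\n' > "$file"

out=$(
    while IFS= read -r LINE; do
        # The line is passed as data, never as the printf format string.
        printf '%s\n' "--${LINE}--"
    done < "$file"
)
printf '%s\n' "$out"
rm -f "$file"
```

Each line comes out between the dashes exactly as it appears in the file.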

 
 
 


Post by David Means » Tue, 25 Mar 2003 13:43:59




>> Is there a unix utility that will return a file line-by-line? I'm
>> trying to do a loop like:

>> for somevariable in `cat somefile`
>> do
>>     #do something with $somevariable
>> done

> while read LINE; do
>         echo "--${LINE}--"
> done < file

> or:

> cat file | while read LINE; do
>         echo "--${LINE}--"
> done

Oooh, you're brave to post that example.  I'll be surprised if the
"useless use of cat" gods don't strike you down for that one!  :-)

> HTH,
> Lurch_

--
David Means

Never frobnicate without first grokking.

 
 
 


Post by Barry Kimelman » Tue, 25 Mar 2003 23:25:48


[This followup was posted to comp.unix.shell]



> Is there a unix utility that will return a file line-by-line? I'm trying
> to do a loop like:

> for somevariable in `cat somefile`
> do
>     #do something with $somevariable
> done

> but that loop would iterate once for each white-space-delimited token in
> "somefile". I want it to iterate once for each line of somefile. What
> could I replace the cat utility with?

> Thanks!
> A.J.

while read somevariable
do
    # do something
done <somefile

--
---------

Barry Kimelman
Winnipeg, Manitoba, Canada

 
 
 


Post by Corné Beers » Wed, 26 Mar 2003 02:06:50





> >> Is there a unix utility that will return a file line-by-line? I'm
> >> trying to do a loop like:

> >> for somevariable in `cat somefile`
> >> do
> >>     #do something with $somevariable
> >> done

> > while read LINE; do
> >         echo "--${LINE}--"
> > done < file

> > or:

> > cat file | while read LINE; do
> >         echo "--${LINE}--"
> > done

> Oooh, you're brave to post that example.  I'll be surprised if "useless
> uses of cat gods' don't strike you down for that one!  :-)

There should be no problem, it clearly shows the useless use of cat.

CBee

 
 
 


Post by Stephane CHAZELA » Wed, 26 Mar 2003 03:41:44


[...]

>> Oooh, you're brave to post that example.  I'll be surprised if "useless
>> uses of cat gods' don't strike you down for that one!  :-)

> There should be no problem, it clearly shows the useless use of cat.

Anyway, while-read loops are BUOS (bad use of shell), so nobody
would notice the UUOC (useless use of cat).

--
Stéphane

 
 
 


Post by William Par » Wed, 26 Mar 2003 04:42:09




> [...]
>>> Oooh, you're brave to post that example.  I'll be surprised if "useless
>>> uses of cat gods' don't strike you down for that one!  :-)

>> There should be no problem, it clearly shows the useless use of cat.

> Anyway while-read loops are BUOS (bad use of shell), so nobody
> would notice the UUOC.

How else can you read <stdin> ?

--

Linux solution for data management and processing.

 
 
 


Post by Stephane CHAZELA » Wed, 26 Mar 2003 05:30:51


[...]

> How else can you read <stdin> ?

Almost every text utility (read is one of them) can read its
standard output. "read" only reads one line at a time, one
character at a time, and performs further processing on it;
those are three good reasons for not using it in a shell script
for text processing.

Use, and pipe together, utilities that read _all_ the input in chunks.

Running one (or several) utilities per line of input is really
bad practice.

--
Stéphane

 
 
 


Post by William Par » Thu, 27 Mar 2003 04:52:45




> [...]
>> How else can you read <stdin> ?

> Almost every text utility (read is one of them) can read its
> standard output. "read" only reads one line at a time one
> character at a time and performs further processing on it,
> that's three good reasons for not using it in a shell script for
> text processing.

> Use and pipe utilities that read _all_ the input by chunks.

> Running one (or several) utilities per line of input is really
> bad practice.

Well, reading character by character is not efficient.  But, how would
you re-write

    while read a b c; do
        ...
    done < file

using other utilities?

--

Linux solution for data management and processing.

 
 
 


Post by Greg Andre » Thu, 27 Mar 2003 07:17:58




>[...]
>> How else can you read <stdin> ?

>Almost every text utility (read is one of them) can read its
>standard output.

You mean its standard INput, don't you?


>"read" only reads one line at a time one character at a time and

You have a good point when you discourage the practice of
executing an external command for every line of a file,
but the above just doesn't seem right.

Are you saying the shell doesn't use buffered file reads,
and the OS' VM system doesn't perform efficient large reads
from disk?

Tracing the /bin/sh on my Solaris 8 machine indicates otherwise.

  -Greg
--

     I have a map of the United States that's actual size.
                                -- Steven Wright

 
 
 


Post by Stephane CHAZELA » Thu, 27 Mar 2003 16:45:06


[...]

> Well, reading character by character is not efficient.  But, how would
> you re-write

>     while read a b c; do
>    ...
>     done < file

> using other utilities?

awk '...' < file

Or:

utility1 | ... | utilityn
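
As a sketch of what the awk form could look like for a three-field "while read a b c" loop: the sample data and the "|" output format are made up for illustration. The comparison only holds for lines with exactly three fields, because read puts everything after the second field into c, while awk keeps splitting.

```shell
#!/bin/sh
# Hypothetical three-column input.
file=$(mktemp)
printf 'one two three\nfour five six\n' > "$file"

# Shell version: read splits each line into a, b and c.
loop_out=$(
    while read -r a b c; do
        printf '%s|%s|%s\n' "$a" "$b" "$c"
    done < "$file"
)

# awk version: one process reads the whole file and splits it into fields.
awk_out=$(awk '{ printf "%s|%s|%s\n", $1, $2, $3 }' < "$file")

printf '%s\n' "$awk_out"
rm -f "$file"
```

Both variables end up with the same text for this input; the awk version does the work in a single process.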

You may find this book useful reading:
http://www.catb.org/~esr/writings/taoup/html/

--
Stéphane

 
 
 


Post by Stephane CHAZELA » Thu, 27 Mar 2003 16:45:10


[...]

>>Almost every text utility (read is one of them) can read its
>>standard output.

> You mean its standard INput, don't you?

Yes, typo, sorry.

>>"read" only reads one line at a time one character at a time and

> You have a good point when you discourage the practice of
> executing an external command for every line of a file,
> but the above just doesn't seem right.

> Are you saying the shell doesn't use buffered file reads,
> and the OS' VM system doesn't perform efficient large reads
> from disk?

> Tracing the /bin/sh on my Solaris 8 machine indicates otherwise.

I say that read(1) reads one character at a time (one read(2)
system call per character). That doesn't mean that only one
character is read from disk when input is on a hard drive.

read does this because it has to read only one line, so it must
ensure it doesn't eat any additional byte once it has read the
trailing '\n'. The shell can't buffer its input because
external utilities wouldn't have access to this "buffer".

~$ strace -e read sh -c 'read < a'
[...]
read(0, "#", 1)                         = 1
read(0, "!", 1)                         = 1
read(0, " ", 1)                         = 1
read(0, "/", 1)                         = 1
read(0, "b", 1)                         = 1
read(0, "i", 1)                         = 1
read(0, "n", 1)                         = 1
read(0, "/", 1)                         = 1
read(0, "s", 1)                         = 1
read(0, "h", 1)                         = 1
read(0, "\n", 1)                        = 1
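
The consequence of that rule can be seen without strace. In the sketch below (the input text is made up), read and cat share one pipe as standard input. Because read must stop right after the first newline, cat still finds every remaining line.

```shell
#!/bin/sh
# read takes exactly one line from the pipe; cat then gets whatever is left.
out=$(printf 'first\nsecond\nthird\n' | {
    IFS= read -r line
    echo "read got: $line"
    echo "cat got:"
    cat
})
printf '%s\n' "$out"
```

If read consumed a larger block from the pipe, "second" and "third" would be lost to cat.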

--
Stéphane

 
 
 


Post by Corné Beers » Sat, 29 Mar 2003 00:08:41




> [...]
> > How else can you read <stdin> ?

> Almost every text utility (read is one of them) can read its
> standard output. "read" only reads one line at a time one
> character at a time and performs further processing on it,
> that's three good reasons for not using it in a shell script for
> text processing.

You don't answer the question: How do you read <stdin>?
You will need it if the script will be started by inetd and such.


> Use and pipe utilities that read _all_ the input by chunks.

I kind of like chunks of record size, preferably subdivided into
fields. And with blanks or spaces as field separators and EOL as record
separator, read does a fairly good job...


> Running one (or several) utilities per line of input is really
> bad practice.

Some tools need to be run for every line. I frequently work with a list
of hostnames to do something on each host. It's hard to avoid running
the utility on each line...

I'd just like to say: yes, `cat` can be regarded as useless; however, it
adds some readability. I cannot follow you on the bad-use-of-shell
point... (it's the first time I've read about it)

CBee

 
 
 


Post by Stephane CHAZELA » Sat, 29 Mar 2003 00:51:48


[...]

> You don't answer the question: How do you read <stdin>?
> You will need it if the script will be started by inetd and such.

head -10
reads 10 lines (and maybe more as it may read by buffers) from
stdin, whatever stdin is.

c=0
while [ $c -lt 10 ]; do
  read line
  echo "$line"
  c=`expr $c + 1`
done

does the same, except that it may fork 30 processes, reads the
input one byte at a time, strips leading and trailing blanks,
processes \ sequences (both in read and echo), and may fail on
lines like "-n"...

That's what I call a BUOS.
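
For comparison, here is a sketch of the same loop with those particular problems addressed, assuming a POSIX shell. The sample input and the count of 3 (instead of 10) are made up to keep it short. It still reads its input through read, so the efficiency argument stands; only the correctness issues go away.

```shell
#!/bin/sh
# Hypothetical input containing the problem cases named above.
input=$(mktemp)
printf '  padded\na\\b\n-n\nlast\n' > "$input"

loop_out=$(
    c=0
    # IFS= keeps blanks, -r keeps backslashes, and the loop stops at EOF.
    while [ "$c" -lt 3 ] && IFS= read -r line; do
        printf '%s\n' "$line"   # printf does not treat "-n" as an option
        c=$((c + 1))            # shell arithmetic, no expr process
    done < "$input"
)

# The single-utility version, for comparison.
head_out=$(head -n 3 < "$input")

printf '%s\n' "$loop_out"
rm -f "$input"
```

Both variables hold the same three lines for this input.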

Similarly:
awk '{print NR ": " $0}'

is better than:

c=1
while read line; do
  echo "$c: $line"
  c=`expr $c + 1`
done

>> Use and pipe utilities that read _all_ the input by chunks.

> I kind of like chuncks in record size, preferably subdivided into
> fields. And with blank or spaces as field separator and EOL as record
> separator, read does a fairly good job...

awk does the same more reliably and more efficiently.

>> Running one (or several) utilities per line of input is really
>> bad practice.

> Some tools need to be run for every line. I frequently work with a list
> of hostnames to do something on each host. Hard to avoid to run the
> utility on each line...

[...]

Yes, that's why I say while-read shell loops are /generally/ bad
practice. In some cases, it's useful.

If it's for text processing, I consider it BUOS, as there are
tools to perform text processing, and a shell is a tool designed
to run tools more than a programming language.
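
For the per-host case, where a command really has to run once per line, a while-read loop could look like the sketch below. The file contents and the check_host function are hypothetical stand-ins for the real host list and command. The list is read on file descriptor 3 so that a command which reads standard input cannot swallow the rest of the list.

```shell
#!/bin/sh
# Hypothetical host list.
hosts=$(mktemp)
printf 'alpha.example.com\nbeta.example.com\n' > "$hosts"

# check_host is a made-up placeholder for the real per-host command.
check_host() {
    printf 'checking %s\n' "$1"
}

out=$(
    # The list arrives on fd 3; stdin stays free for the command itself.
    while IFS= read -r host <&3; do
        check_host "$host"
    done 3< "$hosts"
)
printf '%s\n' "$out"
rm -f "$hosts"
```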

--
Stéphane