beginner help with multiline records

Post by R Brit » Tue, 24 Jul 2001 06:43:27



I have thousands of multiline records like these:

JANICE UNDERWOOD
626 MURCHISON ROAD
FAYETTEVILLE NC 28303
DEBORAH M TYSON
RT 8 BOX 50348
ST. PAULS NC 28365
NICHOLAS J WALLACE
2010 MONAGON ST
FAYETTEVILLE NC 28302

I need to convert each 3-line record to a 1-line record with
tabs like this:
name[TAB]address[TAB]city state[TAB]ZIP
Tom Lee[TAB]101 Blue Rd[TAB]Cary NC[TAB]28456

Getting the tab before the ZIP has been tough. All the ZIPs
have been stripped down to 5 numerals.

The only way I can figure it is (using GNU awk 3.0.6):

awk '{print \
(NR % 3 == 0 ? \
substr($0,1,length($0)-5)"\t" \
substr($0,length($0)-4,5) : \
$0)}' datafile | \
paste - - -

Beginner question: any simpler way to do this, especially
the tab before the ZIP code?

Roge

 
 
 


Post by nos.. » Tue, 24 Jul 2001 07:18:06



> I have thousands of multiline records like these:

> JANICE UNDERWOOD
> 626 MURCHISON ROAD
> FAYETTEVILLE NC 28303
> DEBORAH M TYSON
> RT 8 BOX 50348
> ST. PAULS NC 28365
> NICHOLAS J WALLACE
> 2010 MONAGON ST
> FAYETTEVILLE NC 28302

> I need to convert each 3-line record to a 1-line record with
> tabs like this:
> name[TAB]address[TAB]city state[TAB]ZIP
> Tom Lee[TAB]101 Blue Rd[TAB]Cary NC[TAB]28456

One possibility, assuming the number of lines in the data file is a
multiple of three (i.e., no blank lines, etc.):

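#!/bin/ksh
# Read one record (three lines) per iteration and join the lines with tabs.
while IFS= read -r name; do
  IFS= read -r addr1
  IFS= read -r addr2
  printf '%s\t%s\t%s\n' "$name" "$addr1" "$addr2"
done < data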
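
The loop above joins the three lines but leaves the ZIP attached to the
city and state. A minimal sketch of one way to split it off with shell
parameter expansion, assuming the third line always ends with a single
space followed by the ZIP (join_records is a hypothetical helper name,
not something from the original post):

```shell
#!/bin/sh
# Sketch: join each 3-line record and put a tab before the ZIP.
# join_records is a hypothetical name; it reads records on stdin.
join_records() {
  while IFS= read -r name &&
        IFS= read -r addr1 &&
        IFS= read -r addr2
  do
    # ${addr2% *}  -> everything before the last space (city and state)
    # ${addr2##* } -> everything after the last space (the ZIP)
    printf '%s\t%s\t%s\t%s\n' "$name" "$addr1" "${addr2% *}" "${addr2##* }"
  done
}
```

It would be called as: join_records < data
An incomplete record at the end of the file is silently dropped.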

 
 
 


Post by Chris F.A. Johnso » Tue, 24 Jul 2001 08:27:48



> I have thousands of multiline records like these:

> JANICE UNDERWOOD
> 626 MURCHISON ROAD
> FAYETTEVILLE NC 28303
> DEBORAH M TYSON
> RT 8 BOX 50348
> ST. PAULS NC 28365
> NICHOLAS J WALLACE
> 2010 MONAGON ST
> FAYETTEVILLE NC 28302

> I need to convert each 3-line record to a 1-line record with
> tabs like this:
> name[TAB]address[TAB]city state[TAB]ZIP
> Tom Lee[TAB]101 Blue Rd[TAB]Cary NC[TAB]28456

> Getting the tab before the ZIP has been tough. All the ZIPs
> have been stripped down to 5 numerals.

> The only way I can figure it is (using GNU awk 3.0.6):

> awk '{print \
> > (NR % 3 == 0 ? \
> > substr($0,1,length($0)-5)"\t" \
> > substr($0,length($0)-4,5) : \
> > $0)}' datafile | \
> > paste - - -

> Beginner question: any simpler way to do this, especially
> the tab before the ZIP code?

awk '
NR % 3 { printf "%s\t", $0 }
NR % 3 == 0 { sub(" " $NF, "\t" $NF)
print }' < FILENAME ## or command | awk '.......
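
The sub() call above builds its regular expression from the ZIP itself
and replaces the leftmost match on the line. A slightly more defensive
sketch anchors the match to the end of the line instead, assuming the
ZIP is the last space-separated word (zip_tab is a hypothetical wrapper
name used only for illustration):

```shell
#!/bin/sh
# Sketch: same idea, with the substitution anchored to the end of line 3.
# zip_tab is a hypothetical name; it reads records on stdin.
zip_tab() {
  awk '
    NR % 3      { printf "%s\t", $0 }        # lines 1 and 2: tab instead of newline
    NR % 3 == 0 { sub(/ +[^ ]+$/, "\t" $NF)  # line 3: last run of spaces becomes a tab
                  print }
  '
}
```

It would be called as: zip_tab < FILENAME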

--
    Chris F.A. Johnson                        http://cfaj.freeshell.org
    ===================================================================
    My code (if any) in this post is copyright 2001, Chris F.A. Johnson
    and may be copied under the terms of the GNU General Public License
 
 
 


Post by Charles Dem » Tue, 24 Jul 2001 09:15:26




>I have thousands of multiline records like these:

>JANICE UNDERWOOD
>626 MURCHISON ROAD
>FAYETTEVILLE NC 28303
>DEBORAH M TYSON
>RT 8 BOX 50348
>ST. PAULS NC 28365
>NICHOLAS J WALLACE
>2010 MONAGON ST
>FAYETTEVILLE NC 28302

>I need to convert each 3-line record to a 1-line record with
>tabs like this:
>name[TAB]address[TAB]city state[TAB]ZIP
>Tom Lee[TAB]101 Blue Rd[TAB]Cary NC[TAB]28456

>Getting the tab before the ZIP has been tough. All the ZIPs
>have been stripped down to 5 numerals.

>The only way I can figure it is (using GNU awk 3.0.6):

>awk '{print \
>> (NR % 3 == 0 ? \
>> substr($0,1,length($0)-5)"\t" \
>> substr($0,length($0)-4,5) : \
>> $0)}' datafile | \
>> paste - - -

>Beginner question: any simpler way to do this, especially
>the tab before the ZIP code?

awk 'NR%3!=0 {a=a $0 "\t" }
     NR%3==0 {print a substr($0,1,length($0)-6) "\t" $NF; a=""}' infile

or

awk '{printf("%s\t", $0)} NR%3==0 {print "" }' infile |
sed 's/ *\(.....\)\(.\)$/\2\1/'

or

awk '{a=$0 "\t"; b= substr($0,1,length($0)-6) "\t" $NF "\n";
      printf("%s", NR%3!=0 ? a : b) }' infile

or

awk '{a=$0 "\t"; zip=$NF; sub(/ *.....$/, ""); b=$0 "\t" zip "\n";
      printf("%s", NR%3!=0 ? a : b) }' infile

or other variations
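
One such variation, as a sketch only: let paste do the joining, as in
the original question, and have awk split the third tab-separated field
at its last space. It assumes the ZIP is the last space-separated word
of every third line; join3 is a hypothetical wrapper name.

```shell
#!/bin/sh
# Sketch: paste joins each group of 3 lines with tabs, awk splits off the ZIP.
# join3 is a hypothetical name; it reads records on stdin.
join3() {
  paste - - - | awk -F'\t' -v OFS='\t' '
    {
      zip = $3
      sub(/.* /, "", zip)        # keep what follows the last space
      sub(/ +[^ ]*$/, "", $3)    # drop the ZIP and the spaces before it
      print $1, $2, $3, zip
    }'
}
```

It would be called as: join3 < infile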

Chuck Demas

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.

 
 
 


Post by Chris F.A. Johnso » Tue, 24 Jul 2001 11:26:56




> > I have thousands of multiline records like these:

> > JANICE UNDERWOOD
> > 626 MURCHISON ROAD
> > FAYETTEVILLE NC 28303
> > DEBORAH M TYSON
> > RT 8 BOX 50348
> > ST. PAULS NC 28365
> > NICHOLAS J WALLACE
> > 2010 MONAGON ST
> > FAYETTEVILLE NC 28302

> > I need to convert each 3-line record to a 1-line record with
> > tabs like this:
> > name[TAB]address[TAB]city state[TAB]ZIP
> > Tom Lee[TAB]101 Blue Rd[TAB]Cary NC[TAB]28456

> > Getting the tab before the ZIP has been tough. All the ZIPs
> > have been stripped down to 5 numerals.

> > The only way I can figure it is (using GNU awk 3.0.6):

> > awk '{print \
> > > (NR % 3 == 0 ? \
> > > substr($0,1,length($0)-5)"\t" \
> > > substr($0,length($0)-4,5) : \
> > > $0)}' datafile | \
> > > paste - - -

> > Beginner question: any simpler way to do this, especially
> > the tab before the ZIP code?

> awk '
> NR % 3 { printf "%s\t", $0 }
> NR %3 == 0 {sub(" " $NF, "\t" $NF)
> > print}' < FILENAME ## or command | awk '.......

s/> print/print/

--
    Chris F.A. Johnson                        http://cfaj.freeshell.org
    ===================================================================
    My code (if any) in this post is copyright 2001, Chris F.A. Johnson
    and may be copied under the terms of the GNU General Public License