Unix - Matching data string from 2 files

Post by base6 » Sun, 25 Jun 2006 08:23:27




> Help Please..

Either do your own homework or drop the class.

> I have file#1 - Here is what's in it:

> "00024","111111111A","111111111","DOE","JOHN","RECORD ERROR BLAH"
> "00024","222222222A","222222222","DOE","JANE","RECORD ERROR BLAH"
> "00024","333333333A","333333333","DOE","DICK","RECORD ERROR BLAH"
> "00024","444444444A","444444444","DOE","RON","RECORD ERROR BLAH"

> I have file#2 - Here is what's in it:

> 00024,111111111A,111111111,DOE,JOHN,COMM ERROR
> 00024,222222222A,222222222,DOE,JANE,COMM ERROR
> 00024,333333333A,333333333,DOE,DICK,COMM ERROR
> 00024,444444444A,444444444,DOE,RON,COMM ERROR

> I need to match up the records in file#1 using the 2nd, 4th and
> 5th delimited information with file#2 and write those results out to
> a separate file.

 
 
 

Post by John » Sun, 25 Jun 2006 16:18:24



You want to do a relational (as in relational database) join. Do you have
a relational database? You might be able to use the join command but
some pre-processing would be needed. Or it would be simple enough
to script in awk or perl or whatever (or a shell, given this is comp.unix.shell).

There are a couple of oddities. One is the quotes in the first file but not
in the second. Another is the word "blah" in the first which suggests this
might not be the real data.
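
The join route could look something like the sketch below. It is only an illustration under some assumptions: join matches on a single field, so the pre-processing glues fields 2, 4 and 5 into one key and strips the quotes from the first file. The scratch directory, the *.keyed file names and the add_key helper are made up for the example, and the data is assumed to contain no ":" or "|" characters.

```shell
#!/bin/sh
# Sketch only: the scratch directory, *.keyed names and add_key are made up.
export LC_ALL=C                      # sort and join must agree on ordering
work=$(mktemp -d)

# Sample data copied from the original post
cat > "$work/file1" <<'EOF'
"00024","111111111A","111111111","DOE","JOHN","RECORD ERROR BLAH"
"00024","222222222A","222222222","DOE","JANE","RECORD ERROR BLAH"
"00024","333333333A","333333333","DOE","DICK","RECORD ERROR BLAH"
"00024","444444444A","444444444","DOE","RON","RECORD ERROR BLAH"
EOF
cat > "$work/file2" <<'EOF'
00024,111111111A,111111111,DOE,JOHN,COMM ERROR
00024,222222222A,222222222,DOE,JANE,COMM ERROR
00024,333333333A,333333333,DOE,DICK,COMM ERROR
00024,444444444A,444444444,DOE,RON,COMM ERROR
EOF

# Prefix each record with a key built from fields 2, 4 and 5
add_key() { awk -F, '{ print $2 ":" $4 ":" $5 "|" $0 }'; }

# Strip the quotes from file1, key both files, and sort them on the key
sed 's/"//g' "$work/file1" | add_key | sort -t '|' -k 1,1 > "$work/file1.keyed"
add_key < "$work/file2" | sort -t '|' -k 1,1 > "$work/file2.keyed"

# Join on the key; each output line is key|file1 record|file2 record
join -t '|' "$work/file1.keyed" "$work/file2.keyed" > "$work/joined"
cat "$work/joined"
```

Records whose key appears in only one of the two files are dropped by join, which is the usual inner-join behaviour.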

--
John.

 
 
 

Post by ambroz » Tue, 27 Jun 2006 23:33:02


John,

Thanks for your help, unlike some. I figured out the code I needed. For
those curious:

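# Strip the quotes from file1 and save the result
# (file1.noquotes is just a scratch name)
sed 's/"//g' file1 > file1.noquotes

awk -F ',' '
        # Key on fields 2, 4 and 5, kept apart by the field separator
        { key = $2 FS $4 FS $5 }
        # First file read (file2): remember each record by its key
         FNR == NR { seen[key] = $0; next }
        # Second file: on a match, print the file2 record instead
         key in seen{ $0 = seen[key] } 1
        ' file2 file1.noquotes > newfile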
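
For anyone who wants only the records that match, a variation on the same two-pass awk idea might look like the sketch below. It is not the code that was actually run: it strips the quotes inside awk instead of with sed, and it prints each matching file1 record next to its file2 record. The scratch directory and the output name "matched" are made up, and file2 is cut down to two of the posted records so that not every line matches.

```shell
#!/bin/sh
# Sketch only: the scratch directory and the name "matched" are made up.
work=$(mktemp -d)

# file1 as posted (quoted fields)
cat > "$work/file1" <<'EOF'
"00024","111111111A","111111111","DOE","JOHN","RECORD ERROR BLAH"
"00024","222222222A","222222222","DOE","JANE","RECORD ERROR BLAH"
"00024","333333333A","333333333","DOE","DICK","RECORD ERROR BLAH"
"00024","444444444A","444444444","DOE","RON","RECORD ERROR BLAH"
EOF

# Only two of the posted file2 records, so that some lines do not match
cat > "$work/file2" <<'EOF'
00024,111111111A,111111111,DOE,JOHN,COMM ERROR
00024,333333333A,333333333,DOE,DICK,COMM ERROR
EOF

awk -F, '
    { gsub(/"/, "") }                       # drop quotes; awk re-splits $0
    { key = $2 FS $4 FS $5 }                # composite key with separators
    FNR == NR { seen[key] = $0; next }      # first file (file2): remember it
    key in seen { print $0 "|" seen[key] }  # second file: print matches only
' "$work/file2" "$work/file1" > "$work/matched"
cat "$work/matched"
```

With this input, only the JOHN and DICK records end up in the output file.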




 
 
 
