how to join two files using two columns as key

how to join two files using two columns as key

Post by Yong Li » Sat, 26 Apr 2003 04:27:05



Hi

I have two tab-delimited flat files. Both have column x coordinate and y
coordinate. I want to join them by using their coordinate as the key
(which should include both x and y coordinate). Is there way to do this
using shell commands sort and join, etc? Thanks a lot! (only those
coordinates in file1 will be appeared in final merged results). -Yong

File 1:
x y number1
1 2   343.5
1 3   444.2
0 1   33.4

File 2:

x y number2
0 1 333
1 2 444
1 3 555
0 8 666

Merge results:
x y number1 number2
1 2 343.5    444
1 3 444.2    555
0 1 33.4     333

 
 
 

how to join two files using two columns as key

Post by William Par » Sat, 26 Apr 2003 04:55:24



> Hi

> I have two tab-delimited flat files. Both have column x coordinate and y
> coordinate. I want to join them by using their coordinate as the key
> (which should include both x and y coordinate). Is there way to do this
> using shell commands sort and join, etc? Thanks a lot! (only those
> coordinates in file1 will be appeared in final merged results). -Yong

> File 1:
> x y number1
> 1 2   343.5
> 1 3   444.2
> 0 1   33.4

> File 2:

> x y number2
> 0 1 333
> 1 2 444
> 1 3 555
> 0 8 666

> Merge results:
> x y number1 number2
> 1 2 343.5    444
> 1 3 444.2    555
> 0 1 33.4     333

1. Shell:
    - read and print 'x', 'y', 'number1'
    - grep "^$x[ \t]+$y[ \t]" file2
    - print out 'number2'

2. Awk:
    - read file1 and store 'x,y':'number1' (key:value) in associative
      array 'var1'
    - read file2 and store 'x,y':'number2' (key:value) in associative
      array 'var2'
    - for each 'x,y' key in var1, print var1["x,y"] and var2["x,y"]

3. Python:
    - read file1 and file2, and store '(x,y)':'number' in dictionary
      'var1' and 'var2'.
    - for each '(x,y)' key in var1, print var1[(x,y)] and var2[(x,y)]

--

Linux solution for data management and processing.

 
 
 

how to join two files using two columns as key

Post by command » Sat, 26 Apr 2003 19:11:49



> Hi

> I have two tab-delimited flat files. Both have column x coordinate and y
> coordinate. I want to join them by using their coordinate as the key
> (which should include both x and y coordinate). Is there way to do this
> using shell commands sort and join, etc? Thanks a lot! (only those
> coordinates in file1 will be appeared in final merged results). -Yong

> File 1:
> x y number1
> 1 2   343.5
> 1 3   444.2
> 0 1   33.4

> File 2:

> x y number2
> 0 1 333
> 1 2 444
> 1 3 555
> 0 8 666

> Merge results:
> x y number1 number2
> 1 2 343.5    444
> 1 3 444.2    555
> 0 1 33.4     333

Assume 2 files are sorted, you can try
grep "^`cut -d' ' -f 1,2 FILE1`" | awk '{print $3}' > FILE3
paste FILE1 FILE3
 
 
 

how to join two files using two columns as key

Post by Stephane CHAZELA » Sat, 26 Apr 2003 20:12:14



> I have two tab-delimited flat files. Both have column x coordinate and y
> coordinate. I want to join them by using their coordinate as the key
> (which should include both x and y coordinate). Is there way to do this
> using shell commands sort and join, etc? Thanks a lot! (only those
> coordinates in file1 will be appeared in final merged results). -Yong

> File 1:       > File 2:         > Merge results:
> x y number1   >                 > x y number1 number2
> 1 2   343.5   > x y number2     > 1 2 343.5    444
> 1 3   444.2   > 0 1 333         > 1 3 444.2    555
> 0 1   33.4    > 1 2 444         > 0 1 33.4     333
>               > 1 3 555
>               > 0 8 666

You'll have to first merge the key fields

T=`printf '\t'`
join -t"$T" <(tail +2 $file1 | sed "s/$T/,/" | sort) \
            <(tail +2 $file2 | sed "s/$T/,/" | sort) \
  | sed "s/,/$T/"

if your shell doesn't have process substitution (<(...)), use
fifos or tempfiles.

--
Stphane

 
 
 

how to join two files using two columns as key

Post by rakesh shar » Wed, 30 Apr 2003 06:25:53



> Hi

> I have two tab-delimited flat files. Both have column x coordinate and y
> coordinate. I want to join them by using their coordinate as the key
> (which should include both x and y coordinate). Is there way to do this
> using shell commands sort and join, etc? Thanks a lot! (only those
> coordinates in file1 will be appeared in final merged results). -Yong

> File 1:
> x y number1
> 1 2   343.5
> 1 3   444.2
> 0 1   33.4

> File 2:

> x y number2
> 0 1 333
> 1 2 444
> 1 3 555
> 0 8 666

> Merge results:
> x y number1 number2
> 1 2 343.5    444
> 1 3 444.2    555
> 0 1 33.4     333

#!/usr/local/bin/perl

open(FA,"< fileA");
open(FB,"< fileB");



print map {$_,"\t",$f{join ",",(split)[0,1]},"\n"} grep {exists

 
 
 

how to join two files using two columns as key

Post by rakesh shar » Wed, 30 Apr 2003 11:07:51



> Hi

> I have two tab-delimited flat files. Both have column x coordinate and y
> coordinate. I want to join them by using their coordinate as the key
> (which should include both x and y coordinate). Is there way to do this
> using shell commands sort and join, etc? Thanks a lot! (only those
> coordinates in file1 will be appeared in final merged results). -Yong

> File 1:
> x y number1
> 1 2   343.5
> 1 3   444.2
> 0 1   33.4

> File 2:

> x y number2
> 0 1 333
> 1 2 444
> 1 3 555
> 0 8 666

> Merge results:
> x y number1 number2
> 1 2 343.5    444
> 1 3 444.2    555
> 0 1 33.4     333