File 1 has 5 fields, A B C D E, where field A is integer-valued.
File 2 has 3 fields, A F G.
File 1 has far more rows than File 2 (20^6 versus 5000).
All the entries of A in File 1 appear in field A of File 2.
I would like to merge the two files on field A and carry over F and G.
The desired output is A B C D E F G.
Example
File 1
A B C D E
4050 S00001 31228 3286 0
4050 S00012 31227 4251 0
4049 S00001 28342 3021 1
4048 S00001 46578 4210 0
4048 S00113 31221 4250 0
4047 S00122 31225 4249 0
4046 S00344 31322 4000 1
File 2
A F G
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6
Desired output
A B C D E F G
4050 S00001 31228 3286 0 12.1 23.6
4050 S00012 31227 4251 0 12.1 23.6
4049 S00001 28342 3021 1 14.4 47.8
4048 S00001 46578 4210 0 23.2 43.9
4048 S00113 31221 4250 0 23.2 43.9
4047 S00122 31225 4249 0 45.5 21.6
Thankfully, you don't need to write this at all. Unix has a join command to do this for you.
Here it is "in action":
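A minimal sketch of one way to run it on the sample data from the question, assuming the two files are saved as file1.txt and file2.txt (hypothetical names). join expects both inputs to be sorted on the join field, so the sketch strips the header lines and sorts first. As a consequence, the rows come out in ascending key order rather than in File 1's original order.

```shell
# Use the same byte-wise collation for sort and join.
export LC_ALL=C

# Sample data from the question (file names are assumptions).
cat > file1.txt <<'EOF'
A B C D E
4050 S00001 31228 3286 0
4050 S00012 31227 4251 0
4049 S00001 28342 3021 1
4048 S00001 46578 4210 0
4048 S00113 31221 4250 0
4047 S00122 31225 4249 0
4046 S00344 31322 4000 1
EOF
cat > file2.txt <<'EOF'
A F G
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6
EOF

# Drop the header line of each file and sort on the first field.
tail -n +2 file1.txt | sort -k1,1 > file1.sorted
tail -n +2 file2.txt | sort -k1,1 > file2.sorted

# Write the header by hand, then the inner join on field 1.
# Rows whose key has no partner (4046 here) are left out.
{ echo "A B C D E F G"; join file1.sorted file2.sorted; } > joined.txt
cat joined.txt
```

If the original row order matters, the awk approach avoids the sort entirely.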
You need to read the entries from File 2 into a pair of associative arrays in the BEGIN block. Assuming GNU Awk:
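As a rough sketch of that loading step on its own, assuming the lookup file is literally named "File 2" and using the hypothetical array names fval and gval: the BEGIN block reads the file line by line with getline and stores F and G keyed by A. The final print is only there to show that the arrays were filled. Nothing in it should be specific to GNU Awk.

```shell
# Sample lookup file from the question; the name "File 2" is an assumption.
cat > "File 2" <<'EOF'
A F G
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6
EOF

# With only a BEGIN block, awk reads no main input.
awk 'BEGIN {
    # getline returns 1 per line read, 0 at end of file, -1 on error.
    while ((getline line < "File 2") > 0) {
        split(line, f, " ")      # f[1]=A, f[2]=F, f[3]=G
        fval[f[1]] = f[2]        # A -> F
        gval[f[1]] = f[3]        # A -> G
    }
    close("File 2")
    print fval[4049], gval[4049] # spot check one key
}' > lookup_check.txt
cat lookup_check.txt
```

With only about 5000 rows in File 2, holding both arrays in memory is cheap.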
In the main processing block, you read the line from File 1 and print it with the correct data from the arrays created in the BEGIN block:
Supply File 1 as the filename argument to the program.
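Put together, the whole program might look like the sketch below. The file names "File 1" and "File 2" and the array names fval and gval are assumptions for illustration. Because the header line of File 2 is stored under the key "A", the header of File 1 gets F and G appended like any other row, and rows whose key is missing from File 2 (4046 in the sample) are skipped.

```shell
# Sample data from the question (file names are assumptions).
cat > "File 1" <<'EOF'
A B C D E
4050 S00001 31228 3286 0
4050 S00012 31227 4251 0
4049 S00001 28342 3021 1
4048 S00001 46578 4210 0
4048 S00113 31221 4250 0
4047 S00122 31225 4249 0
4046 S00344 31322 4000 1
EOF
cat > "File 2" <<'EOF'
A F G
4050 12.1 23.6
4049 14.4 47.8
4048 23.2 43.9
4047 45.5 21.6
EOF

awk 'BEGIN {
    # Load File 2 into two lookup arrays keyed by field A.
    while ((getline line < "File 2") > 0) {
        split(line, f, " ")
        fval[f[1]] = f[2]
        gval[f[1]] = f[3]
    }
    close("File 2")
}
# Main block: runs once per line of File 1; keep only keys seen in File 2.
($1 in fval) { print $0, fval[$1], gval[$1] }' "File 1" > merged.txt
cat merged.txt
```

File 1 is streamed one line at a time and never sorted, so the output keeps its original row order.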
The quotes around the file name argument are needed because of the spaces in the file name. You need the quotes around the getline filename even if it contained no spaces, as it would otherwise be a variable name.