Bash join command

Posted 2020-02-10 05:17

Question:

Infile1:

1 a
3 c
4 d
6 f

Infile2:

1 a 
2 b
5 e
6 f
7 g
8 h

How do I join these files with the Unix join command to get this output:

1 aa
2 b
3 c
4 d
5 e
6 ff
7 g 
8 h

dogbane's answer worked, but when I apply it to this file:

27  27
28  22
29  37
30  15
31  21
32  13
33  18
34  24

and this:

27  7
28  13
29  6
30  12
31  30
32  5
33  10
34  28

They don't join:

27  27
27  7
28  13
28  22
29  37
29  6
30  12
30  15
31  21
31  30
32  13
32  5
33  10
33  18
34  24
34  28

The second scenario is tab-delimited, so I used -t \t.

Answer 1:

First sort both files, then use join to join them on the first field. The -a 1 and -a 2 options also print the unpaired lines from each file. To remove the space between the paired values, turning "a a" into "aa", pipe the output through sed:

$ join -t " " -1 1 -2 1 -a 1 -a 2  <(sort file1) <(sort file2) | sed 's/ \([a-z]\) / \1/g'
1 aa
2 b
3 c
4 d
5 e
6 ff
7 g
8 h
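
For anyone who wants to try this without creating the files by hand, here is a minimal, self-contained sketch of the same pipeline. The scratch directory and the variable names (tmpdir, joined, merged) are placeholders of mine, the sample rows are typed without trailing whitespace, and the sorted copies are written to temporary files instead of using process substitution so that the script also runs under a plain POSIX sh.

```shell
# Use one collation for both sort and join so they agree on key order.
export LC_ALL=C

# Recreate the two sample inputs from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' '1 a' '3 c' '4 d' '6 f' > "$tmpdir/file1"
printf '%s\n' '1 a' '2 b' '5 e' '6 f' '7 g' '8 h' > "$tmpdir/file2"

# join expects both inputs sorted on the join field.
sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# Full outer join on field 1: -a 1 and -a 2 keep the unpaired lines.
joined=$(join -t ' ' -a 1 -a 2 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

# Paired lines look like "1 a a"; drop the space between the two values.
merged=$(printf '%s\n' "$joined" | sed 's/ \([a-z]\) / \1/g')

printf '%s\n' "$merged"
rm -rf "$tmpdir"
```

Note that the sed expression only matches single lowercase letters, so it fits the sample data but would need adjusting for longer values.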


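Answer 2:

Works for me (almost). You should specify -t $'\t' for the tab character, not just -t \t. Bash does not interpret \t as a tab unless it is inside $'' quotes; unquoted, the backslash is simply removed, so join receives a plain t as the delimiter, treats each whole line as the join field, and pairs nothing.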

join -t $'\t' -o 1.2,2.2 <(echo  $'27\t27
28\t22
29\t37
30\t15
31\t21
32\t13
33\t18
34\t24' | sort) <(echo $'27\t7
28\t13
29\t6
30\t12
31\t30
32\t5
33\t10
34\t28' | sort)
27      7
22      13
37      6
15      12
21      30
13      5
18      10
24      28
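
The quoting issue can be checked directly. The sketch below is only an illustration: it prints what the shell hands to a command for an unquoted \t, then runs join with a real tab on the first three rows of each file from the question. The names tmpdir, tab, unquoted and joined are placeholders of mine, and the tab is produced with printf so the script also runs in shells that lack $'' quoting; in Bash, $'\t' yields the same character.

```shell
export LC_ALL=C
tmpdir=$(mktemp -d)

# A literal tab character; in Bash, $'\t' is an equivalent spelling.
tab=$(printf '\t')

# Unquoted \t loses its backslash before the command ever sees it.
unquoted=$(printf '%s' \t)
printf 'unquoted \\t is passed as: [%s]\n' "$unquoted"

# Tab-delimited sample rows (first three of each file in the question).
printf '27\t27\n28\t22\n29\t37\n' > "$tmpdir/f1"
printf '27\t7\n28\t13\n29\t6\n' > "$tmpdir/f2"

# Without -o, join prints the join field followed by the remaining
# fields of each file, separated by the -t character.
joined=$(join -t "$tab" "$tmpdir/f1" "$tmpdir/f2")
printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

Dropping -o 1.2,2.2 as done here keeps the key column in the output, which may be closer to what the question asks for.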


Answer 3:

This should work for both of your cases:

awk 'NR==FNR{a[$1]=$2;next;} {a[$1]=($1 in a)?a[$1]$2:$2}END{for(x in a)print x,a[x]}' f1 f2|sort

Output for case one:

1 aa
2 b
3 c
4 d
5 e
6 ff
7 g
8 h

Output for case two:

27 277
28 2213
29 376
30 1512
31 2130
32 135
33 1810
34 2428
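
For readers who find the one-liner dense, here is the same awk program spread over several lines with comments, run against the case-one data. This is a sketch rather than a different method: the logic is unchanged, and the scratch directory and the result variable are placeholders of mine.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '1 a' '3 c' '4 d' '6 f' > "$tmpdir/f1"
printf '%s\n' '1 a' '2 b' '5 e' '6 f' '7 g' '8 h' > "$tmpdir/f2"

result=$(awk '
    # While reading the first file NR equals FNR: store value by key.
    NR == FNR { a[$1] = $2; next }
    # Second file: append to an existing value, or start a new entry.
    { a[$1] = ($1 in a) ? a[$1] $2 : $2 }
    # The order of a for-in loop is unspecified, hence the sort below.
    END { for (x in a) print x, a[x] }
' "$tmpdir/f1" "$tmpdir/f2" | sort)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Because awk splits on any whitespace by default, the same program reads the tab-delimited files of case two unchanged. The values are concatenated with no separator, which is why case two yields results such as 277 for key 27.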