How to merge two files with different fields based

I have two files as below and I want to merge them into one file based on common IDs:

File1:

ARS     8.0   8.0
BBL     1.1   1.2
CCL     1.9   1.8

File2:

ARS     2.3   2.4
ARS     2.6   2.4
ARS     2.5   2.3
BBL     1.9   1.8
EDE     1.4   1.6

Desired output:

ARS     8.0   8.0    ARS     2.3   2.4
ARS     8.0   8.0    ARS     2.6   2.4   
ARS     8.0   8.0    ARS     2.5   2.3
BBL     1.1   1.2    BBL     1.9   1.8
CCL     1.9   1.8    NA
NA                   EDE     1.4   1.6

标签： linux awk

3条回答

Root（大扎）

2楼-- · 2019-09-27 14:25

Could you please try following and let me know if this helps you.(I have kept the order of NA at last only let me know if this helps you). Adding a non-one liner form of solution too here.

awk '
FNR==NR{
  a[$1]=$0;
  next
}
($1 in a){
  print a[$1],$0;
  b[$1];
  next
}
{
  print "NA\t",$0       
}
END{
  for(i in b){
    delete a[i]
};
  for(j in a){
  print a[j],"\tNA"
}
}
' file1  file2

Output will be as follows.

ARS     8.0   8.0 ARS     2.3   2.4
ARS     8.0   8.0 ARS     2.6   2.4
ARS     8.0   8.0 ARS     2.5   2.3
BBL     1.1   1.2 BBL     1.9   1.8
NA   EDE     1.4   1.6
CCL     1.9   1.8   NA

0人赞添加讨论(0) 举报

劳资没心，怎么记你

3楼-- · 2019-09-27 14:32

There is a tool for merging files

join -a 1 -a 2 -e NA t24.in1 t24.in2  -o 1.1,1.2,1.3,2.1,2.2,2.3

output:

ARS     8.0     8.0     ARS     2.3     2.4
ARS     8.0     8.0     ARS     2.6     2.4
ARS     8.0     8.0     ARS     2.5     2.3
BBL     1.1     1.2     BBL     1.9     1.8
CCL     1.9     1.8     NA      NA      NA
NA      NA      NA      EDE     1.4     1.6

0人赞添加讨论(0) 举报

Bombasti

4楼-- · 2019-09-27 14:44

$ cat tst.awk
BEGIN { FS=OFS="\t" }
FNR==1 {
    na = $0
    gsub("[^"FS"]","",na)
    nas[++numFiles] = "NA" na
}
NR==FNR { file1[$1] = $0; next }
$1 in file1 { print file1[$1], $0 }
{ file2[$1] = $0 }
END {
    for (key in file1) {
        if ( !(key in file2) ) {
            print file1[key], nas[2]
        }
    }
    for (key in file2) {
        if ( !(key in file1) ) {
            print nas[1], file2[key]
        }
    }
}

$ awk -f tst.awk file1 file2
ARS     8.0     8.0     ARS     2.3     2.4
ARS     8.0     8.0     ARS     2.6     2.4
ARS     8.0     8.0     ARS     2.5     2.3
BBL     1.1     1.2     BBL     1.9     1.8
CCL     1.9     1.8     NA
NA                      EDE     1.4     1.6

There's various ways it could be optimized to not store the whole contents of both files in arrays but I like the simplicity and symmetry of the above and it's trivial to optimize later if that proves to be necessary due to massive input files.

0人赞添加讨论(0) 举报

How to merge two files with different fields based

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间