Efficient way to map ids

Posted 2019-02-27 02:55

I have two text files,

File 1 with data like

User game count

A Rugby 2
A Football 2
B Volleyball 1
C TT 2
...

File 2

1 Basketball
2 Football
3 Rugby
...
90 TT
91 Volleyball
...

What I want to do is add another column to File 1 that holds the corresponding index of each game from File 2.

File 1 has 2 million entries. The new column should give the index (basically the line number or order) of the game in File 2. How can I do this efficiently?

Right now I am doing this line by line: I read a line from File 1, grep File 2 for the corresponding game to get its line number, and write the result to a file.

At this rate it will take ages. How can I speed this up?

Tags: bash shell
2 answers
虎瘦雄心在
#2 · 2019-02-27 03:09

Untested

awk 'NR==FNR{a[$2]=$1;next}{print $0,a[$2]}' file2 file1
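To make the one-liner easier to follow, here is a small sketch that runs the same two-file lookup on the sample rows from the question. The output file name `file1_indexed` and the `NA` placeholder for games missing from file2 are assumptions added for illustration, not part of the original answer.

```shell
# Sample inputs copied from the question (the "..." lines are left out).
cat > file1 <<'EOF'
A Rugby 2
A Football 2
B Volleyball 1
C TT 2
EOF

cat > file2 <<'EOF'
1 Basketball
2 Football
3 Rugby
90 TT
91 Volleyball
EOF

# NR==FNR is true only while awk reads the first file (file2):
# store each index under its game name, then skip to the next line.
# For file1, append the stored index, or NA (an assumed placeholder)
# when the game does not appear in file2.
awk '
    NR == FNR { idx[$2] = $1; next }
    { print $0, (($2 in idx) ? idx[$2] : "NA") }
' file2 file1 > file1_indexed

cat file1_indexed
```

file2 is read once into an in-memory array, so each of the 2 million lines in file1 costs a single hash lookup instead of a separate grep over file2.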
我欲成王,谁敢阻挡
#3 · 2019-02-27 03:28

File 2 must not contain duplicate games, for example two index lines for Football; otherwise the later line silently overwrites the earlier one in the lookup array.

awk 'FNR==NR{a[$2]=$1;next}$0=$0 FS a[$2]' file2 file1
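One way to check the no-duplicates requirement before running the join is sketched below. The file `file2_dup` and its contents are made up for illustration; on real data the same awk command would be pointed at file2.

```shell
# Hypothetical variant of file2 in which Football appears twice.
cat > file2_dup <<'EOF'
1 Basketball
2 Football
3 Rugby
4 Football
EOF

# seen[$2]++ returns the count before incrementing, so the comparison
# is true exactly on the second occurrence of a game name.
dups=$(awk 'seen[$2]++ == 1 { print $2 }' file2_dup)
echo "duplicated games: $dups"
```

If the command prints no game names, every game in column 2 is unique and the lookup is unambiguous.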