Efficient way to map ids

Posted 2019-02-27 02:55

I have two text files,

File 1 with data like

User game count

A Rugby 2
A Football 2
B Volleyball 1
C TT 2
...

File 2

1 Basketball
2 Football
3 Rugby
...
90 TT
91 Volleyball
...

What I want to do is add another column to File 1 containing the index of the corresponding game from File 2.

File 1 has 2 million entries, and the new column should give each game's index (basically its line number or position) in File 2. How can I do this efficiently?

Right now I am doing this line by line: I read a line from File 1, grep File 2 for the corresponding game to get its line number, and write the result to a file.

This will take me ages. How can I speed this up?

Tags: bash shell

2 answers

虎瘦雄心在

#2 · 2019-02-27 03:09

Untested

awk 'NR==FNR{a[$2]=$1;next}{print $0,a[$2]}' file2 file1
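
For readers unfamiliar with the two-file idiom, here is a minimal commented sketch of the same lookup on a few sample rows taken from the question. The temporary directory, the file paths and the array name idx are assumptions for illustration only. It also assumes game names are single words, that the header row has been left out, and that file2 is not empty (otherwise NR==FNR would also match file1).

```shell
# Build small sample inputs from the rows shown in the question.
dir=$(mktemp -d)
printf '%s\n' 'A Rugby 2' 'A Football 2' 'B Volleyball 1' 'C TT 2' > "$dir/file1"
printf '%s\n' '1 Basketball' '2 Football' '3 Rugby' '90 TT' '91 Volleyball' > "$dir/file2"

# Pass 1 (file2): NR == FNR is true only while reading the first file,
#                 so store each index keyed by game name and skip to the next line.
# Pass 2 (file1): print the line followed by the index stored for column 2.
awk 'NR == FNR { idx[$2] = $1; next }
     { print $0, idx[$2] }' "$dir/file2" "$dir/file1" > "$dir/out"

cat "$dir/out"
```

File 2 is read once into memory and File 1 is streamed once, so there is no per-line grep. If the index should be the physical line number in File 2 rather than its first column, storing FNR instead of $1 in the first pass is one option.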


我欲成王，谁敢阻挡

#3 · 2019-02-27 03:28

File 2 must not contain duplicate game names (for example, two Football entries); if it does, the index from the last matching line overwrites the earlier ones.

awk 'FNR==NR{a[$2]=$1;next}$0=$0 FS a[$2]' file2 file1
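
One way to check the no-duplicates condition before running the join is sketched below. The sample file, its path and the extra "4 Football" row are made up for illustration, and game names are again assumed to be single words in column 2.

```shell
# Hypothetical File 2 with a deliberately duplicated game name.
dir=$(mktemp -d)
printf '%s\n' '1 Basketball' '2 Football' '3 Rugby' '4 Football' > "$dir/file2"

# seen[$2]++ returns the count before incrementing, so the comparison is true
# exactly on the second occurrence of a name; each duplicate is reported once.
dups=$(awk 'seen[$2]++ == 1 { print $2 }' "$dir/file2")

# Empty output means every game name in File 2 is unique.
echo "$dups"
```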

