Data manipulation in Linux [duplicate]

Posted 2019-08-27 14:54

Question:

This question already has an answer here:

  • Using awk, remove lines with duplicate pair of columns in different indexes (2 answers)

I am trying to filter or remove some lines in a text file based on some criteria (I tried with awk, but without success). The file contains columns separated by a comma (,). An example of such a file is:

source,destination
192.168.1.2,8.8.8.8
8.8.8.8,192.168.1.2

I would like to remove or filter out those lines that carry the same information, just with the source and destination swapped.

So if the file contains the same pair with the source and destination reversed:

192.168.1.2,8.8.8.8
8.8.8.8,192.168.1.2

then only show one of the lines, not both.
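For the sample above, the desired output would therefore be something like:

source,destination
192.168.1.2,8.8.8.8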

Answer 1:

You can try this, but be careful if the file is huge, as it keeps all the seen key pairs in memory.

awk -F, '!($1 FS $2 in dup){dup[$1 FS $2]=dup[$2 FS $1]; print}' <file>
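This works because the assignment dup[$1 FS $2]=dup[$2 FS $1] also creates the reversed key as a side effect (merely referencing an awk array element is enough to create it), so by the time the swapped line comes along, its key is already in the array and the line is skipped.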

Same idea:

awk -F, '!(($1 FS $2 in dup)||($2 FS $1 in dup)){dup[$1 FS $2]; print}' <file>
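Below is a slightly expanded sketch of the second one-liner, with comments and an assumed file name traffic.csv:

awk -F, '
!(($1 FS $2 in dup) || ($2 FS $1 in dup)) {   # skip a pair already seen in either order
    dup[$1 FS $2]                             # referencing the index is enough to create it
    print                                     # first occurrence of the pair: keep the line
}' traffic.csv

For the sample input from the question, both commands print:

source,destination
192.168.1.2,8.8.8.8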


Tags: linux csv awk sed