Print lines that contain a value in a specific col

I want to extract only those lines whose value in Column 2 is shared by at least 2 unique values in Column 1.

Given this input (in this case 3 tab-separated columns):

waterline-n    below-sheath-v    14.8097 
dock-n    below-sheath-v     14.5095 
waterline-n    below-steel-n    11.0330 
picnic-n    below-steel-n    12.2277 
wavefront-n    at-part-of-variance-n    18.4888 
wavefront-n    between-part-of-variance-n    17.0656
audience-b    between-part-of-variance-n    17.6346 
game-n    between-part-of-variance-n    14.9652 
whereabouts-n    become-rediscovery-n    11.3556 
whereabouts-n    get-tee-n    10.9091

The desired output is:

waterline-n    below-sheath-v    14.8097 
dock-n    below-sheath-v     14.5095 
waterline-n    below-steel-n    11.0330
picnic-n    below-steel-n    12.2277 
wavefront-n    between-part-of-variance-n    17.0656 
audience-b    between-part-of-variance-n    17.6346 
game-n    between-part-of-variance-n    14.9652

Is it possible to do this using grep?

Tags: terminal grep

2 answers

迷人小祖宗

#2 · 2019-09-10 17:30

Read the file twice with awk, counting the column 2 values in an array.
I think this would be hard to do with grep alone.

awk 'FNR==NR {a[$2]++;next} a[$2]>1' file file
waterline-n    below-sheath-v    14.8097
dock-n    below-sheath-v     14.5095
waterline-n    below-steel-n    11.0330
picnic-n    below-steel-n    12.2277
wavefront-n    between-part-of-variance-n    17.0656
audience-b    between-part-of-variance-n    17.6346
game-n    between-part-of-variance-n    14.9652

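In the first pass (FNR==NR is true only while the first copy of the file is being read), awk uses each column 2 value as an array key and increments its count for every line that contains it.
In the second pass it looks up each line's column 2 value in the array and prints the line if the count is greater than one.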
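If the input arrives on a pipe and cannot be read twice, the same counting idea can be done in one pass by buffering the lines. This is only a sketch under a few assumptions: the function name dup_col2 is made up for illustration, the fields are tab-separated, and the whole input fits in memory.

```shell
# Hypothetical helper: print lines whose column 2 value occurs more than once.
# Reads stdin once, so it also works at the end of a pipeline.
dup_col2() {
  awk -F'\t' '
    # Remember every line and its column 2 value, and count each value.
    { line[NR] = $0; key[NR] = $2; count[$2]++ }
    # After the last line, replay the buffered lines in their original order.
    END { for (i = 1; i <= NR; i++) if (count[key[i]] > 1) print line[i] }
  '
}

# Small made-up sample: "x" appears twice in column 2, "y" only once.
printf 'a\tx\t1\nb\ty\t2\nc\tx\t3\n' | dup_col2
```

The trade-off is memory: the two-pass version stores only the counts, while this one stores every line until the end of the input.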


你好瞎i

#3 · 2019-09-10 17:38

You can get the desired output with grep and uniq. Note that the values in the second column must not also occur in the other columns, because grep matches the patterns anywhere on the line. Also note that identical fields need to be on consecutive lines unless you sort the output of cut, since uniq -d only detects adjacent duplicates:

grep -f <(cut -f2 infile | uniq -d) infile

Output:

waterline-n below-sheath-v  14.8097
dock-n  below-sheath-v  14.5095
waterline-n below-steel-n   11.0330
picnic-n    below-steel-n   12.2277
wavefront-n between-part-of-variance-n  17.0656
audience-b  between-part-of-variance-n  17.6346
game-n  between-part-of-variance-n  14.9652
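When the repeated column 2 values are not on consecutive lines, sorting the output of cut before uniq -d takes care of it. The sketch below is one way to write that, not the answer's original command: the function name dup_col2_grep is made up, -F is added so the values are treated as literal strings rather than regular expressions, and it assumes a grep that accepts "-f -" to read patterns from standard input (GNU grep does). It still matches anywhere on the line, so the caveat about other columns applies.

```shell
# Hypothetical helper: print lines of file $1 that contain a duplicated column 2 value.
dup_col2_grep() {
  # cut: take column 2; sort: make duplicates adjacent; uniq -d: keep repeated values;
  # grep -F -f -: use those values as fixed-string patterns read from stdin.
  cut -f2 "$1" | sort | uniq -d | grep -F -f - "$1"
}

# Made-up sample in which the two "x" rows are not adjacent.
tmp=$(mktemp)
printf 'a\tx\t1\nb\ty\t2\nc\tx\t3\n' > "$tmp"
dup_col2_grep "$tmp"
rm -f "$tmp"
```

Because grep reads the file itself, the matching lines come out in their original order even though the pattern list was sorted.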
