filtering out values from a column iteratively in

Posted 2019-08-27 03:32

戒情不戒烟

Question:

Shell script newbie here. I have a set of CSV files in a folder. I want to select 1000 distinct user IDs from each file iteratively, so that the IDs picked from each file exclude any IDs already picked from the previous files. I have selected 1000 distinct user IDs from the first file and stored them in a temp file with the command below:

sort -u -t, -k1,8 file1.csv|head -1000 > temp.txt

Here 8 is the user ID column. Now I want the next 1000 user IDs from file2, excluding the user IDs from file1 (stored in temp.txt). Is there an elegant way to achieve this?

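Answer 1:

-k1,8 defines a single sort key that spans columns 1 through 8, not column 8 alone. Don't you want just the user ID column, -k8,8? (A bare -k8 runs from column 8 to the end of the line.) Based on your question, try: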

cut -d, -f8 file1.csv | sort -u | head -1000 > temp.txt
cut -d, -f8 file2.csv | grep -vxFf temp.txt | sort -u | head -1000 > temp2.txt

BTW, you can pass a wildcard to sort: sort -u -t, -k8,8 file*.csv | head ...
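To extend this to every file in the folder, one option is to keep a running list of the IDs picked so far and filter each new file against it. The sketch below is only an illustration under some assumptions: pick_ids and the .ids output suffix are hypothetical names, mktemp is assumed to be available, the files are assumed to be plain comma-separated with no quoted commas, and grep is assumed to treat an empty pattern file as matching nothing (as GNU grep does).

```shell
#!/bin/sh
# pick_ids N COL FILE...
# For each FILE in order, write up to N distinct values of column COL that
# were not picked from any earlier FILE into FILE.ids (hypothetical naming).
pick_ids() {
    n=$1
    col=$2
    shift 2
    seen=$(mktemp)    # IDs picked so far, one per line
    for f in "$@"; do
        # -x matches whole lines and -F treats IDs as literal strings,
        # so ID 1 does not accidentally exclude ID 10.
        cut -d, -f"$col" "$f" |
            sort -u |
            grep -vxFf "$seen" |
            head -n "$n" > "$f.ids"
        cat "$f.ids" >> "$seen"
    done
    rm -f "$seen"
}
```

For the files in the question this could be called as pick_ids 1000 8 file*.csv. Because sort -u runs before head, the IDs taken from each file are the first N in sort order, not the first N that appear in the file.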

Tags: shell unix scripting
