Question:

I have a large text file called "main" that contains a list of email addresses, and I have already sent mail to some of them. I also have a list of the 'sent' addresses. Now I want to remove the 'sent' addresses from "main".

In other words, I want to remove both copies of each matching row from the text file while removing duplicates. Example:

I have:

email@email.com
test@test.com
email@email.com

I want:

test@test.com

Is there an easy way to achieve this? Please suggest a tool or method, keeping in mind that the text file is larger than 10 MB.

Answer 1:

In a terminal:

sort test | uniq -c | awk '$1 == 1 { print $2 }'
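
As a rough illustration of what the command above does, here is a small sketch run against the sample data from the question. The temporary directory and the variable names are assumptions made for the demo. It also tries `uniq -u`, which prints only unrepeated lines and should give the same list without the awk step. Note that this approach assumes the sent addresses have been appended to the same file, so that each of them appears more than once.

```shell
# Hypothetical sample data; the file name "test" follows the command above.
workdir=$(mktemp -d)
printf '%s\n' 'email@email.com' 'test@test.com' 'email@email.com' > "$workdir/test"

# uniq -c prefixes each distinct line with its count;
# awk keeps only the lines whose count is exactly 1.
result=$(sort "$workdir/test" | uniq -c | awk '$1 == 1 { print $2 }')
echo "$result"

# uniq -u prints only the lines that are not repeated.
result_u=$(sort "$workdir/test" | uniq -u)
echo "$result_u"
```

Both pipelines drop every address that occurs more than once, so only test@test.com is left.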

Answer 2:

I use Cygwin a lot for such tasks, as the Unix command line is incredibly powerful.

Here's how to achieve what you want:

cat main.txt | sort -u | grep -Fvxf sent.txt

sort -u removes duplicates (it sorts the contents of main.txt first), and grep then takes care of removing the unwanted addresses.

Here's what the grep options mean:

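-F treats the patterns as plain text (fixed strings), not regular expressions
-v inverts the results, so only non-matching lines are printed
-x forces the whole line to match the pattern
-f reads the patterns from the specified file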
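
To see how these options interact, here is a minimal sketch with made-up sample files. The address mytest@test.com, the temporary directory and the variable names are assumptions added for the demo; LC_ALL=C is only there to keep the sort order predictable.

```shell
# Hypothetical sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'email@email.com' 'test@test.com' 'email@email.com' 'mytest@test.com' > "$workdir/main.txt"
printf '%s\n' 'test@test.com' > "$workdir/sent.txt"

# Deduplicate main.txt, then drop every line that exactly equals a line in sent.txt.
remaining=$(LC_ALL=C sort -u "$workdir/main.txt" | grep -Fvxf "$workdir/sent.txt")
echo "$remaining"

# Without -x a pattern also matches as a substring, so mytest@test.com
# would be dropped too, because it contains test@test.com.
loose=$(LC_ALL=C sort -u "$workdir/main.txt" | grep -Fvf "$workdir/sent.txt")
echo "$loose"
```

With -x, email@email.com and mytest@test.com both survive; without it, only email@email.com does. That is why -x matters for address lists where one address can be a substring of another.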

Oh, and if your files use Windows line endings (CR LF), you will need to do this instead:

cat main.txt | dos2unix | sort -u | grep -Fvxf <(cat sent.txt | dos2unix)
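
If dos2unix is not installed, stripping the carriage returns with tr is one possible substitute. The sketch below is an assumption-laden demo, not part of the original answer: the sample files, the .unix file names and the variable names are all made up. It uses intermediate files rather than the <( ) process substitution, which needs bash.

```shell
# Hypothetical sample: main.txt has CR LF line endings, sent.txt has plain LF.
workdir=$(mktemp -d)
printf 'email@email.com\r\ntest@test.com\r\nemail@email.com\r\n' > "$workdir/main.txt"
printf 'email@email.com\n' > "$workdir/sent.txt"

# With the CRs left in, no line of main.txt equals a pattern exactly,
# so nothing is filtered out (grep -c counts the lines that -v lets through).
unfixed_count=$(sort -u "$workdir/main.txt" | grep -Fvxc -f "$workdir/sent.txt")
echo "$unfixed_count"

# Strip the carriage returns from both files, then run the usual pipeline.
tr -d '\r' < "$workdir/main.txt" > "$workdir/main.unix"
tr -d '\r' < "$workdir/sent.txt" > "$workdir/sent.unix"
fixed=$(sort -u "$workdir/main.unix" | grep -Fvxf "$workdir/sent.unix")
echo "$fixed"
```

Here both deduplicated lines slip through before the cleanup, and only test@test.com remains after it.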

Just like with the Windows command line, you can simply add:

> output.txt

at the end of the command line to redirect the output to a text file.
