Bash, Linux: Need to remove lines from one file based on matches in another

Posted 2020-02-13 05:55

There are many examples of how to remove lines from one file when that same line exists in another file. I have read through them, and they all remove a line only if the full line matches, e.g.: grep -vxF -f file1 file2

What I have is slightly different. I have a list of URLs from my websites and my clients' websites. I want to remove lines from that file when the domain matches a domain in another file.

So the first file might look like:

http://www.site1.com/some/path
http://www.site2.com/some/path
http://www.site3.com/some/path
http://www.site4.com/some/path

The second file could be:

site2.com
www.site4.com

I would like the output to be:

http://www.site1.com/some/path
http://www.site3.com/some/path

Tags: linux bash grep
3 Answers
不美不萌又怎样
#2 · 2020-02-13 06:41

You have too many grep flags. Specifically, -x will keep you from getting your desired results.

Assuming that file1 has the patterns, and file2 has the URLs, just use:

grep -v -f file1 file2

Using -x means grep matches only against the entire line, i.e. a URL line would be removed only if it were exactly site2.com, which never happens with full URLs.

From the grep man page:

-x, --line-regexp

Select only those matches that exactly match the whole line.
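To see the difference with the sample files from the question, a quick sketch (file1 holds the domains, file2 the URLs, as the answer assumes):

```shell
printf '%s\n' site2.com www.site4.com > file1
printf '%s\n' \
  http://www.site1.com/some/path \
  http://www.site2.com/some/path \
  http://www.site3.com/some/path \
  http://www.site4.com/some/path > file2

grep -vxF -f file1 file2   # -x: whole-line match, so no URL matches; all 4 lines print
grep -v -f file1 file2     # substring match: the site2 and site4 lines are removed
```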

不美不萌又怎样
#3 · 2020-02-13 06:49

There may be some corner cases this doesn't handle, but you can simply use the -v and -f options of grep:

grep -f file2.txt -v file1.txt
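One such corner case: without -F, the dot in each domain is a regex wildcard that matches any character, so a pattern like site2.com can remove unrelated URLs. A small sketch (patterns and urls are made-up filenames):

```shell
printf '%s\n' site2.com > patterns
printf '%s\n' http://www.site2Xcom/path http://www.site1.com/path > urls

grep -v  -f patterns urls    # removes both lines: '.' also matches the X
grep -vF -f patterns urls    # -F treats site2.com literally; both lines survive
```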
等我变得足够好
#4 · 2020-02-13 06:51

The following should work (untested):

#!/usr/bin/perl

use strict;
use warnings;

# Second argument: file of domains to filter out.
# Note: "open ... or die" (not "|| die") — || binds to the filename
# string, so the die branch could never fire.
open my $fh, '<', $ARGV[1] or die $!;

my $filter = join '|', <$fh>;   # trailing newlines are ignored under /x below

close $fh;

# First argument: file of URLs; print only those whose domain doesn't match.
open $fh, '<', $ARGV[0] or die $!;

print grep !m{^http://[^/]*($filter)/}x, <$fh>;

close $fh;