Compare two files line by line and generate the di

I want to compare file1 with file2 and generate a file3 which contains the lines in file1 which are not present in file2.

Tags: shell unix

9 answers

萌系小妹纸

#2 · 2019-01-07 02:16

Try

sdiff file1 file2

It usually works much better for me. You may want to sort the files first if the order of lines is not important (e.g. some text config files).

For example,

sdiff -w 185 file1.cfg file2.cfg
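As a rough sketch of the "sort first" suggestion: the sample contents and scratch directory below are made up for illustration, and -s (suppress common lines) is an option of GNU sdiff, so this assumes GNU diffutils is installed.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'b=2\na=1\nc=3\n' > "$tmp/file1.cfg"
printf 'c=3\na=1\n' > "$tmp/file2.cfg"

# Sort copies so that line order no longer affects the comparison.
sort "$tmp/file1.cfg" > "$tmp/file1.sorted"
sort "$tmp/file2.cfg" > "$tmp/file2.sorted"

# -s hides lines common to both files, -w sets the output width.
# sdiff exits with status 1 when the inputs differ, so the status is ignored.
sdiff_out=$(sdiff -s -w 80 "$tmp/file1.sorted" "$tmp/file2.sorted" || true)
printf '%s\n' "$sdiff_out"
```

Only the b=2 line should remain, followed by the < marker that sdiff uses for lines found only in the left-hand file.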


走好不送

#3 · 2019-01-07 02:16

If you need to solve this with coreutils the accepted answer is good:

comm -23 <(sort file1) <(sort file2) > file3
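The <(...) syntax is process substitution, which plain POSIX sh lacks. A sketch of the same comm approach using temporary files instead (the sample contents and scratch directory are invented for illustration):

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$tmp/file1"
printf 'fig\nkiwi\n' > "$tmp/file2"

# comm expects sorted input, so sort into temporary files first.
sort "$tmp/file1" > "$tmp/file1.sorted"
sort "$tmp/file2" > "$tmp/file2.sorted"

# -2 hides lines found only in file2, -3 hides lines common to both,
# leaving the lines that appear only in file1.
comm -23 "$tmp/file1.sorted" "$tmp/file2.sorted" > "$tmp/file3"
cat "$tmp/file3"
```

Note that file3 comes out in sorted order, not in the original order of file1.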

You can also use sd (stream diff), which doesn't require sorting nor process substitution and supports infinite streams, like so:

cat file1 | sd 'cat file2' > file3

Probably not that much of a benefit on this example, but still consider it; in some cases you won't be able to use comm nor grep -F nor diff.
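The grep -F approach is mentioned above but not shown. One common way to write it, as a sketch with made-up sample files, is:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'pear\napple\nfig\npear\n' > "$tmp/file1"
printf 'fig\nkiwi\n' > "$tmp/file2"

# -F: treat patterns as fixed strings, -x: match whole lines only,
# -v: print the lines that do NOT match, -f: read the patterns from file2.
# grep exits with status 1 when it prints nothing, so the status is ignored.
grep -Fxv -f "$tmp/file2" "$tmp/file1" > "$tmp/file3" || true
cat "$tmp/file3"
```

Unlike comm, this keeps the original order and any duplicate lines of file1, but grep has to hold every line of file2 as a pattern, so it may be slow for very large files.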

Here's a blogpost I wrote about diffing streams on the terminal, which introduces sd.


Ridiculous、

#4 · 2019-01-07 02:20

Consider this:
file a.txt:

abcd
efgh

file b.txt:

abcd

You can find the difference with:

diff -a --suppress-common-lines -y a.txt b.txt

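The output will be the line that exists only in a.txt, followed by the "<" marker that the side-by-side format puts in its middle column:

efgh    <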

You can redirect the output to a file (c.txt) using:

diff -a --suppress-common-lines -y a.txt b.txt > c.txt

This will answer your question:

"...which contains the lines in file1 which are not present in file2."


ら.Afraid

#5 · 2019-01-07 02:26

The Unix utility diff is meant for exactly this purpose.

$ diff -u file1 file2 > file3

See the manual and the Internet for options, different output formats, etc.


放荡不羁爱自由

#6 · 2019-01-07 02:30

Many answers already, but none of them perfect IMHO. Thanatos' answer leaves some extra characters per line and Sorpigal's answer requires the files to be sorted or pre-sorted, which may not be adequate in all circumstances.

I think the best way of getting the lines that are different and nothing else (no extra chars, no re-ordering) is a combination of diff, grep, and awk (or similar).

If the lines do not contain any "<", a short one-liner can be:

diff urls.txt* | grep "<" | sed 's/< //g'

but that will remove every instance of "< " (less than, space) from the lines, which is not always OK (e.g. source code). The safest option is to use awk:

diff urls.txt* | grep "^<" | awk '{for (i=2; i<NF; i++) printf "%s ", $i; print $NF}'

This one-liner diffs both files, keeps only the lines that diff prefixes with "<" (the lines from the first file), and then removes that leading "<" marker. This works even if the lines themselves contain "<".
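If GNU diff is available, its line-format options can print the first file's differing lines with no marker at all, so nothing needs to be stripped afterwards. This is an alternative sketch rather than the pipeline above; the file names and contents are hypothetical, and the options are GNU-specific.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'a < b\nsame\nx  y\n' > "$tmp/urls.txt"
printf 'same\n' > "$tmp/urls.txt.old"

# %L expands to the line itself, including its newline.
# Lines only in the first file are printed unchanged; lines only in the
# second file and lines common to both are printed as nothing.
# diff exits with status 1 when the files differ, so the status is ignored.
diff --old-line-format='%L' --new-line-format='' --unchanged-line-format='' \
    "$tmp/urls.txt" "$tmp/urls.txt.old" > "$tmp/only_in_first.txt" || true
cat "$tmp/only_in_first.txt"
```

Embedded "<" characters and runs of spaces come through untouched. As with any diff-based approach, the comparison is positional, so a line that merely moved within the file may still be reported.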


我欲成王，谁敢阻挡

#7 · 2019-01-07 02:36

Use the diff utility and extract only the lines starting with < in the output.
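A minimal sketch of that idea, assuming diff's default output format and using made-up sample files:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'abcd\nefgh\n' > "$tmp/file1"
printf 'abcd\n' > "$tmp/file2"

# In diff's default output, lines found only in the first file start with "< ".
# sed -n with the p flag prints only the lines where the substitution
# succeeded, with that prefix already removed.
diff "$tmp/file1" "$tmp/file2" | sed -n 's/^< //p' > "$tmp/file3"
cat "$tmp/file3"
```

Anchoring the pattern with ^ means a "<" inside a line's own text is left alone.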

