Using 'diff' (or anything else) to get cha

Posted 2019-01-16 11:09

I'd like to use 'diff' to get both a line-level difference and a character-level difference. For example, consider:

File 1

abcde
abc
abcccd

File 2

abcde
ab
abccc

Using diff -u I get:

@@ -1,3 +1,3 @@
 abcde
-abc
-abcccd
\ No newline at end of file
+ab
+abccc
\ No newline at end of file

However, it only shows me that there were changes in these lines. What I'd like to see is something like:

@@ -1,3 +1,3 @@
 abcde
-ab<ins>c</ins>
-abccc<ins>d</ins>
\ No newline at end of file
+ab
+abccc
\ No newline at end of file

You get my drift.

Now, I know I can use other engines to mark/check the difference on a specific line. But I'd rather use one tool that does all of it.

13 answers
Summer. ? 凉城
#2 · 2019-01-16 11:30

Python has a convenient library named difflib which might help answer your question.

Below are two oneliners using difflib for different python versions.

python3 -c 'import difflib, sys; \
  print("".join( \
    difflib.ndiff( \
      open(sys.argv[1]).readlines(),open(sys.argv[2]).readlines())))'
python2 -c 'import difflib, sys; \
  print "".join( \
    difflib.ndiff( \
      open(sys.argv[1]).readlines(), open(sys.argv[2]).readlines()))'

These might come in handy as a shell alias, which is easy to carry around in your .${SHELL_NAME}rc.

$ alias char_diff="python2 -c 'import difflib, sys; print \"\".join(difflib.ndiff(open(sys.argv[1]).readlines(), open(sys.argv[2]).readlines()))'"
$ char_diff old_file new_file

And a more readable version to put in a standalone file.

#!/usr/bin/env python2
from __future__ import with_statement

import difflib
import sys

with open(sys.argv[1]) as old_f, open(sys.argv[2]) as new_f:
    old_lines, new_lines = old_f.readlines(), new_f.readlines()
diff = difflib.ndiff(old_lines, new_lines)
print ''.join(diff)
Melony?
#3 · 2019-01-16 11:31

Python's difflib can do this.

The documentation includes an example command-line program for you.

The exact format is not as you specified, but it would be straightforward to either parse the ndiff-style output or to modify the example program to generate your notation.
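
As a rough sketch of the second option, the snippet below uses difflib.SequenceMatcher to wrap the characters that disappear from the old line in <ins> tags, similar to the notation in the question. The function name mark_removed and the tag defaults are assumptions made for illustration; they are not part of difflib or of the documentation's example program.

```python
import difflib


def mark_removed(old, new, open_tag="<ins>", close_tag="</ins>"):
    """Return `old` with the characters that are absent from `new` wrapped in tags.

    Hypothetical helper: the name and the tag defaults are illustrative only.
    """
    matcher = difflib.SequenceMatcher(None, old, new)
    pieces = []
    for op, i1, i2, _j1, _j2 in matcher.get_opcodes():
        chunk = old[i1:i2]
        if op == "equal":
            pieces.append(chunk)
        elif chunk:
            # "delete" and "replace" both consume characters of `old`;
            # "insert" has an empty chunk here and is skipped.
            pieces.append(open_tag + chunk + close_tag)
    return "".join(pieces)


if __name__ == "__main__":
    # The two changed lines from the question.
    print("-" + mark_removed("abc", "ab"))
    print("-" + mark_removed("abcccd", "abccc"))
```

Applying the same function with the arguments swapped would mark the characters that were added instead of removed.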

#4 · 2019-01-16 11:32

You can use the cmp command in Solaris:

cmp

It compares two files and, if they differ, reports the first byte and line number where they differ.
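
Where cmp is not at hand, the behaviour described above can be approximated in a few lines of Python. This is only a sketch: first_difference is a hypothetical helper, the 1-based numbering is an assumption about how the result should be reported, and the handling of the case where one input is a prefix of the other is a simplification rather than a copy of what cmp prints.

```python
def first_difference(a, b):
    """Return (byte_number, line_number) of the first difference, or None.

    `a` and `b` are bytes objects; both numbers are 1-based.
    Hypothetical helper for illustration only.
    """
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            # Count the newlines seen before the differing byte.
            return i + 1, a.count(b"\n", 0, i) + 1
    if len(a) != len(b):
        # One input is a prefix of the other: report the position just past it.
        return limit + 1, a.count(b"\n", 0, limit) + 1
    return None


if __name__ == "__main__":
    # In-memory copies of the two files from the question.
    old = b"abcde\nabc\nabcccd"
    new = b"abcde\nab\nabccc"
    print(first_difference(old, new))
```

To compare real files, read them in binary mode (open(path, "rb").read()) and pass the two results to the function.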

Evening l夕情丶
#5 · 2019-01-16 11:32

I think the simplest solution is usually a good one. In my case, the code below helps me a lot; I hope it helps someone else too.

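#!/usr/bin/env python
from __future__ import print_function  # lets the script run on Python 2 and 3

import sys

def readfile(file_name):
    # Read the whole file into one string.
    with open(file_name) as f:
        return f.read()

def diff(s1, s2):
    # Return the index of the first differing character, or -1 if the
    # strings are identical. A length mismatch counts as a difference
    # at the end of the shorter string.
    for pos, (ch1, ch2) in enumerate(zip(s1, s2)):
        if ch1 != ch2:
            return pos
    if len(s1) != len(s2):
        return min(len(s1), len(s2))
    return -1

f1 = readfile(sys.argv[1])
f2 = readfile(sys.argv[2])
pos = diff(f1, f2)
end = pos + 200

if pos >= 0:
    print("Different at:", pos)
    print(">", f1[pos:end])
    print("<", f2[pos:end])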

You can compare two files with the following syntax at your favorite terminal:

$ ./diff.py fileNumber1 fileNumber2
我命由我不由天
#6 · 2019-01-16 11:34

If you keep your files in Git, you can diff between versions with the diff-highlight script, which will show different lines, with differences highlighted.

Unfortunately it only works when the number of lines removed matches the number of lines added - there is stub code for when lines don't match, so presumably this could be fixed in the future.
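
The pairing restriction is easier to see in a sketch. The Python below is not the diff-highlight script itself; it assumes a simplified version of the idea, pairing the i-th removed line with the i-th added line only when the counts match, and marking whatever lies between their common prefix and common suffix. The names highlight_pair and highlight_hunk and the bracket markers are made up for illustration.

```python
import os


def highlight_pair(old, new, start="[", end="]"):
    """Mark the span between the common prefix and common suffix of two lines."""
    prefix = len(os.path.commonprefix([old, new]))
    # Look for the common suffix in the remainders only, so that the
    # prefix and the suffix can never overlap.
    old_rest, new_rest = old[prefix:], new[prefix:]
    suffix = len(os.path.commonprefix([old_rest[::-1], new_rest[::-1]]))

    def mark(line):
        stop = len(line) - suffix
        middle = line[prefix:stop]
        if not middle:
            return line  # nothing differs on this side
        return line[:prefix] + start + middle + end + line[stop:]

    return mark(old), mark(new)


def highlight_hunk(removed, added):
    """Highlight a hunk only when removed and added line counts match."""
    if len(removed) != len(added):
        # Assumed fallback: leave the lines untouched when they cannot be paired.
        return list(removed), list(added)
    pairs = [highlight_pair(o, n) for o, n in zip(removed, added)]
    return [p[0] for p in pairs], [p[1] for p in pairs]


if __name__ == "__main__":
    print(highlight_hunk(["abc", "abcccd"], ["ab", "abccc"]))
```

With unequal counts there is no obvious way to decide which removed line corresponds to which added line, which is why this sketch simply gives up in that case.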

劫难
#7 · 2019-01-16 11:37

Python's difflib is ace if you want to do this programmatically. For interactive use, I use vim's diff mode (easy enough to use: just run vimdiff a b). I also occasionally use Beyond Compare, which does pretty much everything you could hope for from a diff tool.

I haven't seen any command-line tool that does this usefully, but as Will notes, the difflib example code might help.
