I am reading the documentation for Python's difflib. According to the docs, each line of a Differ delta begins with a two-letter code:
Code Meaning
'- ' line unique to sequence 1
'+ ' line unique to sequence 2
'  ' line common to both sequences
'? ' line not present in either input sequence
But what about the "Change" operation? How do I get a "c " instruction similar to the results in Perl's sdiff?
Have a look at this script:
sdiff.py @ hungrysnake.net
http://hungrysnake.net/doc/software__sdiff_py.html
Perl's sdiff (Algorithm::Diff) does not take the "matching rate" between lines into account, but Python's sdiff.py does. =)
I have two text files.
First I got a side-by-side diff from the sdiff command or Perl's sdiff (Algorithm::Diff).
That sdiff does not take the "matching rate" into account. =(
Then I got one from sdiff.py.
sdiff.py does take the "matching rate" into account. =)
I want the result from sdiff.py. Don't you?
difflib has no direct 'c ' code for changed lines like the one in the Perl sdiff you mention, but you can easily build one. In difflib's delta, a changed line is also tagged '- ', but unlike a line that was actually deleted, it is accompanied by a line tagged '? ' within the next few lines of the delta. That '? ' line means the '- ' line was changed, not deleted. It also acts as a guide showing where in the line the changes are.
So, if a line in the delta is tagged '- ', there are four different cases depending on the next few lines of the delta:
Case 1: The line is modified by inserting some characters (the delta reads '- ', '+ ', '? ')
Case 2: The line is modified by deleting some characters (the delta reads '- ', '? ', '+ ')
Case 3: The line is modified by deleting and inserting and/or replacing some characters (the delta reads '- ', '? ', '+ ', '? ')
Case 4: The line is deleted (a '- ' line with no '? ' line attached)
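To see the four cases for yourself, here is a small sketch that prints only the two-character codes of each delta. The helper name delta_tags and the sample lines are made up for illustration; only difflib.Differ and its compare() method come from the standard library.

```python
import difflib


def delta_tags(a, b):
    """Return only the two-character codes of a Differ delta (hypothetical helper)."""
    return [line[:2] for line in difflib.Differ().compare(a, b)]


# Case 1: characters inserted; the hint follows the '+ ' line.
case1 = delta_tags(["hello world\n"], ["hello big world\n"])

# Case 2: characters deleted; the hint follows the '- ' line.
case2 = delta_tags(["hello big world\n"], ["hello world\n"])

# Case 3: characters replaced; both lines get a hint.
case3 = delta_tags(["the quick brown fox\n"], ["the quack brown fix\n"])

# Case 4: a line is deleted outright; no '? ' hint appears at all.
case4 = delta_tags(["alpha\n", "beta\n"], ["alpha\n"])

for name, tags in [("case 1", case1), ("case 2", case2),
                   ("case 3", case3), ("case 4", case4)]:
    print(name, tags)

# The full delta for case 3 shows how the '? ' lines point at the edits.
print("".join(difflib.Differ().compare(["the quick brown fox\n"],
                                       ["the quack brown fix\n"])))
```

Note that the position of the '? ' line differs between the cases, so checking only the line right after '- ' is not enough to catch a pure insertion.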
As you can see, the lines tagged with '? ' show exactly where, and what type of, modification is made.
Note that difflib treats a line as deleted rather than changed if the value of ratio() between the two lines being compared is less than 0.75. That is a value I found through some tests.
So, to infer that a line is changed, you can do this. This will return the diffs with changed lines tagged with code 'c ', and unchanged lines tagged as 'u ', just like in Perl's sdiff:
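A minimal sketch of that idea might look like the following. The function name sdiff_codes and the (code, old, new) row layout are my own choices for illustration and are not part of difflib. It assumes a '- ' line counts as changed only when a '? ' hint is attached to it or to the '+ ' line that follows it.

```python
import difflib


def sdiff_codes(a, b):
    """Return (code, old, new) rows with codes 'u ', 'c ', '- ' and '+ '.

    Hypothetical helper: it post-processes the delta from difflib.Differ.
    """
    delta = list(difflib.Differ().compare(a, b))
    n = len(delta)
    rows = []
    i = 0
    while i < n:
        code, text = delta[i][:2], delta[i][2:]
        if code == "  ":
            rows.append(("u ", text, text))
        elif code == "+ ":
            rows.append(("+ ", None, text))
        elif code == "- ":
            # Look ahead: optional '? ' hint, then '+ ', then optional '? ' hint.
            j = i + 1
            hinted = False
            if j < n and delta[j].startswith("? "):
                hinted = True
                j += 1
            if j < n and delta[j].startswith("+ "):
                new_text = delta[j][2:]
                if j + 1 < n and delta[j + 1].startswith("? "):
                    hinted = True
                    j += 1
                if hinted:
                    # At least one hint line: treat the pair as a change.
                    rows.append(("c ", text, new_text))
                    i = j + 1
                    continue
            # No hint attached: a plain deletion.
            rows.append(("- ", text, None))
        # Any other line (a stray '? ' hint) is skipped.
        i += 1
    return rows


old = ["same\n", "gone\n", "keep\n", "hello world\n", "end\n"]
new = ["same\n", "keep\n", "hello big world\n", "end\n", "new\n"]
for code, left, right in sdiff_codes(old, new):
    print(repr(code), repr(left), repr(right))
```

Keeping both the old and the new text in a 'c ' row makes it easy to print the result side by side, the way sdiff does.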
Hope it helps.
P.S.: It is an old question, so I am not sure how well my efforts will be rewarded. :-( I just could not help answering this question, as I have been working a little with difflib lately.
I don't really know what Perl's "Change" operation is. If it is similar to PHP's diff output, I solved my problem with this code:
Thanks @Sнаđошƒаӽ for your code.