I have the following files:
file1.txt
###################################################
Dump stat Title information for 'ssummary' view
###################################################
Tab=> 'Instance' Title=> {text {Total instances: 7831}}
Tab=> 'Device' Title=> {text {Total spice devices: 256}}
Tab=> 'Memory' Title=> {text {Total memory allocated: 962192 kB}}
Tab=> 'Cpu' Title=> {text {Total cumulative CPU time: 9030 ms}}
file2.txt
###################################################
Dump stat Title information for 'ssummary' view
###################################################
Tab=> 'Instance' Title=> {text {Total instances: 7831}}
Tab=> 'Device' Title=> {text {Total spice devices: 256}}
Tab=> 'Memory' Title=> {text {Total memory allocated: 9621932 kB}}
Tab=> 'Cpu' Title=> {text {Total cumulative CPU time: 90303 ms}}
And I'm running the following command:
diff -I 'Memory' file1.txt file2.txt
which outputs:
6,7c6,7
< Tab=> 'Memory' Title=> {text {Total memory allocated: 962192 kB}}
< Tab=> 'Cpu' Title=> {text {Total cumulative CPU time: 9030 ms}}
---
> Tab=> 'Memory' Title=> {text {Total memory allocated: 9621932 kB}}
> Tab=> 'Cpu' Title=> {text {Total cumulative CPU time: 90303 ms}}
However my expected output is:
< Tab=> 'Cpu' Title=> {text {Total cumulative CPU time: 9030 ms}}
---
> Tab=> 'Cpu' Title=> {text {Total cumulative CPU time: 90303 ms}}
Note that if I change 'Memory' to 'Tab' or 'Title' in the command, the problem appears solved, but that is probably only because every line is then ignored, since they all contain Tab and Title.
Well you learn something new every day. I was equally confused and frustrated by this behaviour, which seems to be roughly [diff the input files, then filter out the RE] rather than [filter the RE out of the input files, then diff].
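To see the "diff first, filter afterwards" behaviour for yourself, here is a small experiment. It is only a sketch: the file contents are shortened stand-ins for the question's data, the temp file names are made up, and it assumes a diff that supports -I (GNU diffutils does).

```shell
#!/bin/sh
# Sketch: -I only suppresses a group of changes when every changed line in
# that group matches the regexp. File names and contents are illustrative.
dir=$(mktemp -d)

# Case 1: the Memory and Cpu changes are adjacent, so diff reports them as
# a single group of changed lines.
printf '%s\n' 'Instance: 7831' 'Memory: 962192 kB'  'Cpu: 9030 ms'  > "$dir/a1.txt"
printf '%s\n' 'Instance: 7831' 'Memory: 9621932 kB' 'Cpu: 90303 ms' > "$dir/a2.txt"
# diff exits with status 1 when the files differ, hence the "|| true".
adjacent=$(diff -I 'Memory' "$dir/a1.txt" "$dir/a2.txt" || true)

# Case 2: an unchanged line sits between the two changes, so diff reports
# two separate groups and can drop the Memory one on its own.
printf '%s\n' 'Memory: 962192 kB'  'Instance: 7831' 'Cpu: 9030 ms'  > "$dir/b1.txt"
printf '%s\n' 'Memory: 9621932 kB' 'Instance: 7831' 'Cpu: 90303 ms' > "$dir/b2.txt"
separated=$(diff -I 'Memory' "$dir/b1.txt" "$dir/b2.txt" || true)

printf 'adjacent changes:\n%s\n\nseparated changes:\n%s\n' "$adjacent" "$separated"
rm -rf "$dir"
```

With GNU diff, the first comparison should still show the Memory lines (they share a group with the Cpu change), while the second should show only the Cpu lines.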
I would have thought the second approach more natural and more useful. For instance this seems to be the way --ignore-case and --strip-trailing-cr work, adjusting the input files before diffing. Additionally, actually achieving what the questioner wanted involves filtering both inputs to temp files, diffing them, then removing them. It becomes even more tedious if you want to do a recursive diff as I did.
I acknowledge that diff behaves the way it's documented rather than how I want it to behave, but respectfully suggest that this option (and similar for -b, -w too) could usefully be added to diff.
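For what it's worth, the temp-file workaround described above could look roughly like this. filtered_diff is a hypothetical helper name of my own, not something diff provides, and the sketch handles two plain files only (no recursion).

```shell
#!/bin/sh
# filtered_diff PATTERN FILE1 FILE2
# Hypothetical helper: remove lines matching PATTERN from both inputs,
# then diff what is left. Not part of diffutils.
filtered_diff() {
    pattern=$1
    left=$2
    right=$3
    tmpdir=$(mktemp -d) || return 2

    # grep -v keeps the lines that do NOT match. It exits 1 when no line is
    # kept, which is fine here; any other non-zero status is a real error.
    grep -v -e "$pattern" "$left" > "$tmpdir/left" \
        || [ $? -eq 1 ] || { rm -rf "$tmpdir"; return 2; }
    grep -v -e "$pattern" "$right" > "$tmpdir/right" \
        || [ $? -eq 1 ] || { rm -rf "$tmpdir"; return 2; }

    # Keep diff's own exit status (0 = same, 1 = different) for the caller.
    status=0
    diff "$tmpdir/left" "$tmpdir/right" || status=$?

    rm -rf "$tmpdir"
    return "$status"
}
```

Calling it as filtered_diff 'Memory' file1.txt file2.txt should report only the Cpu change. Bear in mind that the line numbers in the output refer to the filtered copies, not to the original files.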
This is expected behaviour as per the diffutils manual:
You may try to get a smaller set of changes by specifying -d, but in your example it won't work.
From man diff, if I recall correctly, -I just ignores changes whose lines match the regexp given to it. Which means that if f1 is:
and f2 is:
would correctly parse:
giving nothing
BUT
if f2 now becomes
the regexp no longer matches (cable and table are not matched by the regexp...), so you would get the two lines in the output...
So, just try changing the command to:
that should do the trick (sorry for the stupid examples..)
This behaviour looks a bit weird indeed. I noticed something by tweaking your input files (I just moved the "Memory" line to the top on both files) :
file1.txt
file2.txt
A plain diff will give you :
Notice that there are two sets of differences now... with that arrangement, the diff -I 'Memory' file1.txt file2.txt command will work and output this :
Meaning, the -I flag seems to work only when every line in a set of differences matches the expression. I don't know if this is a bug or expected behaviour... but it's certainly inconsistent.
EDIT : actually, as per the GNU diff documentation, it IS the expected behavior. The man page is not so clear. OpenBSD diff has a -I flag too, but their man page explains it better.
This behaviour is normal given the way diff works (as of April 2013).
diff is line oriented, it means that a line is either considered totally different or totally equivalent. When a line is ignored, it is entered into the list of different lines before comparison, and when the change script is computed, changes made only of ignored lines are considered themselves as ignored. When ignored lines are adjacent to changed lines, it makes up a single non-ignored change.
The problem lies in the inability of diff to understand that consecutive lines are not related: you are not diffing a sequence of text (what diff is aimed at), but rather a list of independent lines which are keyed (Tab=> <key>). These problems seem pretty similar when both files are generated in the same order, but still not the same.
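If the lines really are independent records keyed by the Tab name, one option is to compare them by key instead of by position. Below is a rough sketch using awk. keyed_diff is a hypothetical helper of my own, and it assumes every record starts with Tab=> followed by the key in single quotes, as in the question's files.

```shell
#!/bin/sh
# keyed_diff IGNORED_KEY FILE1 FILE2
# Hypothetical helper: compare "Tab=> 'key' ..." records by key rather than
# by position, skipping the record whose key equals IGNORED_KEY.
keyed_diff() {
    awk -F"'" -v skip="$1" '
        !/^Tab=>/    { next }                      # headers and other lines are not records
        $2 == skip   { next }                      # leave out the ignored key
        FNR == NR    { old[$2] = $0; next }        # first file: remember each record by key
                     { seen[$2] = 1 }              # second file from here on
        !($2 in old) { print "> " $0; next }       # key exists only in the second file
        old[$2] != $0 { print "< " old[$2]; print "---"; print "> " $0 }
        END {
            # keys that exist only in the first file (order not guaranteed)
            for (k in old) if (!(k in seen)) print "< " old[k]
        }
    ' "$2" "$3"
}
```

Run as keyed_diff Memory file1.txt file2.txt on the question's files, this should print just the Cpu pair separated by ---, wherever the Memory line sits. It is not a general diff replacement: it assumes unique keys and a non-empty first file.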