I am trying to use comm to compute the difference between two sorted files, but the result doesn't make sense. What's wrong? I want to show the strings that exist in test2 but not in test1, and then the strings that exist in test1 but not in test2.
>test1
a
b
d
g
>test2
e
g
k
p
>comm test1 test2
a
b
d
e
g
g
k
p
Add a line that is common to both files, say 'z' at the end. You'll see that a third column appears, indicating that the value is common to both files.
By default, data in column 1 is unique to file1, data in column 2 is unique to file2, and data in column 3 appears in both files.
Finally, the comm arguments -1, -2 and -3 suppress the output column with that number; for example, -1 suppresses column 1.
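As a rough illustration of the three columns and of -1, here is a minimal sketch. The file names file1 and file2 and their contents are made up for the demonstration, and LC_ALL=C is set only so that the sort order comm expects is predictable.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
dir=$(mktemp -d)
printf '%s\n' a b d z > "$dir/file1"   # hypothetical sample data
printf '%s\n' e k p z > "$dir/file2"   # shares only the line "z" with file1

# Default output: three tab-indented columns
# (only in file1, only in file2, in both files).
all_columns=$(LC_ALL=C comm "$dir/file1" "$dir/file2")
printf '%s\n' "$all_columns"

# -1 suppresses the first column (lines found only in file1).
without_col1=$(LC_ALL=C comm -1 "$dir/file1" "$dir/file2")
printf '%s\n' "$without_col1"

rm -r "$dir"
```

In the first listing, "z" should appear indented by two tabs, which marks it as belonging to the third column.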
I hope this helps.
To show the lines that exist in test2 but not in test1, pass -1 and -3 to comm so that only the second column is printed. (-1 hides the column with lines that exist only in the first file; -2 hides the column with lines that exist only in the second file; -3 hides the column with lines that exist in both files.) And, vice versa, pass -2 and -3 to show the lines that exist in test1 but not in test2.

Note that g on a line by itself is considered distinct from g with a space after it, which is why you get two separate g lines, one in each file's column, instead of a single g in the third column.
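The following sketch reproduces both points with the file contents from the question. Which file carries the trailing space is not stated, so putting it in test1 is an assumption; the sed clean-up step is likewise just one possible way to normalise the input.

```shell
dir=$(mktemp -d)
printf '%s\n' a b d 'g ' > "$dir/test1"   # assumption: test1 holds "g " with a trailing space
printf '%s\n' e g k p   > "$dir/test2"

# Lines only in test2: suppress columns 1 and 3.
only_in_test2=$(LC_ALL=C comm -13 "$dir/test1" "$dir/test2")
# Lines only in test1: suppress columns 2 and 3.
only_in_test1=$(LC_ALL=C comm -23 "$dir/test1" "$dir/test2")
printf '%s\n' "$only_in_test2" "$only_in_test1"

# Strip trailing whitespace so that "g " and "g" compare equal.
sed 's/[[:space:]]*$//' "$dir/test1" > "$dir/test1.clean"
only_in_test2_clean=$(LC_ALL=C comm -13 "$dir/test1.clean" "$dir/test2")
printf '%s\n' "$only_in_test2_clean"

rm -r "$dir"
```

Before the clean-up, g is reported as unique to each file; afterwards it drops out of both lists. In general the cleaned file may need to be sorted again before it is passed to comm, since removing characters can change the order.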