I've used the comm command to compare two files, but I'm unable to redirect its output to a third file:
comm file1 file2 > file3
comm: file 1 is not in sorted order
comm: file 2 is not in sorted order
How do I do this? The files are sorted already.
(comm file1 file2 works and prints it out)
sample input:
file1:
21
24
31
36
40
87
105
134
...
file2:
10
21
31
36
40
40
87
103
...
comm file1 file2: works
comm file1 file2 > file3
comm: file 1 is not in sorted order
comm: file 2 is not in sorted order
Try:
I don't get the same results as you, but perhaps your version of comm is complaining that the files are not sorted lexically. Using the input you provided (the ... makes it interesting, I know it's not a part of your actual files.)
I was surprised that ... wasn't in the third column, so I tried:
That's better, but 105 > 24, right?
I think those were the results you are looking for. The two 40s are also interesting. If you want to eliminate these:

Your sample data is NOT sorted lexicographically (like in a dictionary), which is what commands like comm and sort (without the -n option) expect, where for example 100 should be before 20.

Are you sure that you aren't simply not noticing the error message when you don't redirect the output, since the error would be intermixed with the output lines on the terminal?
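
To make the lexical-versus-numeric point above concrete, here is a minimal sketch using the file1 sample from the question. The scratch directory and the file1.lex / file1.num names are assumptions made up for illustration, and LC_ALL=C is set only so that the ordering does not depend on the local locale.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# The numerically ordered sample from the question.
printf '%s\n' 21 24 31 36 40 87 105 134 > file1

# Plain sort compares lines as text, so "105" comes before "21".
LC_ALL=C sort file1 > file1.lex     # lexical order, which comm expects
sort -n file1 > file1.num           # numeric order, which the question has

echo "lexical order:"
cat file1.lex

# sort -c only checks the order; it exits non-zero if the input is unsorted.
if LC_ALL=C sort -c file1 2>/dev/null; then
    echo "file1 is lexically sorted"
else
    echo "file1 is NOT lexically sorted"
fi
```

With this input, the check should report that file1 is not lexically sorted, because 105 and 134 sort before 21 when compared as text.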

You have to sort the files first with the sort program.

I ran into a similar issue, where comm was complaining even though I had run sort. The problem was that I was running Cygwin, and sort pointed to some MSDOS version (I guess). By using the specific path (C:\Cygwin\bin\sort in my case), it worked.

You've sorted numerically; comm works on lexically sorted files. For instance, in file2, the line 103 is dramatically out of order with the lines 21..87. Your files must be 'plain sort sorted'.
If you've got bash (4.x), you can use process substitution:
This runs the two commands and ensures that the comm process gets to read their standard output as if they were files.
Failing that:
This uses parallelism to get the files sorted at the same time. The sub-shell (in ( ... )) ensures that you don't end up waiting for other background processes to finish.
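
As a rough sketch of the two approaches described above (not necessarily the exact commands this answer had in mind), the following uses the sample data from the question. The scratch directory and the file1.sorted, file2.sorted and file3.fallback names are assumptions for illustration. Process substitution is a bash feature, so that step is run through bash explicitly.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Use plain byte-wise ordering for both sort and comm.
export LC_ALL=C

# Sample data from the question (numerically ordered, not lexically).
printf '%s\n' 21 24 31 36 40 87 105 134 > file1
printf '%s\n' 10 21 31 36 40 40 87 103 > file2

# Process substitution: comm reads each sort's output as if it were a file.
bash -c 'comm <(sort file1) <(sort file2)' > file3

# Fallback without process substitution: sort both files in parallel
# inside a sub-shell, wait for both sorts, then compare the sorted copies.
(
    sort file1 > file1.sorted &
    sort file2 > file2.sorted &
    wait
    comm file1.sorted file2.sorted > file3.fallback
)

# Column 3 holds the lines common to both files; -12 hides columns 1 and 2.
comm -12 file1.sorted file2.sorted
```

Both routes should leave identical three-column output in file3 and file3.fallback. The final command prints only the common lines, which is often what is wanted from a comparison like this.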