I'm trying to sort a big table stored in a file. The format of the file is (ID, intValue).
The data is sorted by ID, but what I need is to sort the data by intValue, in descending order.
For example, from this table:
ID | IntValue
1 | 3
2 | 24
3 | 44
4 | 2
to this table:
ID | IntValue
3 | 44
2 | 24
1 | 3
4 | 2
How can I use the Linux sort command to do the operation? Or do you recommend another way?
How about sort with a numeric key on the second column, where test.txt is the input file?
Adding option -r gives sorting in descending order, while leaving out -r sorts in ascending order.
In case you have duplicates, use the uniq command on the sorted output.
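A minimal sketch of the commands this describes, assuming the file uses the question's ID | IntValue layout with | as the separator and no header line. The sample contents of test.txt and the output file names are made up for illustration:

```shell
# Sample data in the question's "ID | IntValue" layout (header omitted).
# test.txt is the file name from the answer; its contents are assumed here.
printf '%s\n' '1 | 3' '2 | 24' '3 | 44' '4 | 2' > test.txt

# Descending by the second column: -t sets the field separator, -k2,2 picks
# the second field as the key, -n compares numerically, -r reverses the order.
sort -t'|' -k2,2 -n -r test.txt > desc.txt

# Ascending: the same command without -r.
sort -t'|' -k2,2 -n test.txt > asc.txt

# Dropping duplicate lines: uniq removes adjacent identical lines, so it
# goes after sort.
sort -t'|' -k2,2 -n -r test.txt | uniq > desc-uniq.txt
```

With the sample data, desc.txt should list the rows in the order the question asks for (44, 24, 3, 2).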
As others have already pointed out, see man sort for the -k and -t command line options on how to sort by a specific field in each line.
Now, sort also has a facility to help sort huge files which potentially don't fit into RAM: the -m command line option, which merges already sorted files into one. (See merge sort for the concept.) The overall process is fairly straightforward:
Split the big file into small chunks, for example with the split tool and its -l option. E.g.:
split -l 1000000 huge-file small-chunk
Sort the smaller files. E.g.:
for X in small-chunk*; do sort -t'|' -k2 -nr < "$X" > "sorted-$X"; done
Merge the sorted smaller files. E.g.
sort -t'|' -k2 -nr -m sorted-small-chunk* > sorted-huge-file
Clean-up:
rm small-chunk* sorted-small-chunk*
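The only thing you have to take special care of is the column header: split and sort treat it as an ordinary line, so it has to be removed before splitting and put back at the top of the merged result.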
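The whole procedure, including the header handling, can be sketched end to end as follows. This is only an illustration under assumptions: the tiny huge-file created here stands in for the real one, the chunk size of 2 replaces the 1000000 used above so that the example actually produces several chunks, and the temporary file name header is hypothetical. The key is written as -k2,2 to restrict it to the second field, a small tightening of the -k2 used above.

```shell
# Hypothetical input: a small stand-in for the huge file, with a header line.
printf '%s\n' 'ID | IntValue' '1 | 3' '2 | 24' '3 | 44' '4 | 2' > huge-file

# Keep the header aside and split only the data lines. "-" makes split read
# standard input; chunks are named small-chunkaa, small-chunkab, ...
head -n 1 huge-file > header
tail -n +2 huge-file | split -l 2 - small-chunk

# Sort each chunk by the second field, numerically, in descending order.
for X in small-chunk*; do
    sort -t'|' -k2,2 -n -r < "$X" > "sorted-$X"
done

# Write the header first, then merge the already sorted chunks (-m) using
# the same key and ordering options as in the per-chunk sort.
{ cat header; sort -t'|' -k2,2 -n -r -m sorted-small-chunk*; } > sorted-huge-file

# Clean up the intermediate files.
rm header small-chunk* sorted-small-chunk*
```

The merge step has to use the same -t, -k, -n and -r options as the per-chunk sort, otherwise sort -m would merge under a different ordering than the one the chunks were sorted in.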