I have data in the following format:
foo<tab>1.00<space>1.33<space>2.00<tab>3
I tried to sort the file by the last field in decreasing order. I tried the following commands, but the output wasn't sorted as I expected.
$ sort -k3nr file.txt # apparently this sorts using spaces as the delimiter
$ sort -t"\t" -k3nr file.txt
sort: multi-character tab `\\t'
$ sort -t "`/bin/echo '\t'`" -k3,3nr file.txt
sort: multi-character tab `\\t'
What's the right way to do it?
Here is the sample data.
In general keeping data like this is not a great thing to do if you can avoid it, because people are always confusing tabs and spaces.
Solving your problem is very straightforward in a scripting language like Perl, Python or Ruby. Here's some example code:
If you want to make it easier for yourself by having only tabs, replace the spaces with tabs by piping the file through something like
awk '{ print $1"\t"$2"\t"$3"\t"$4"\t"$5 }'
This will change the spaces to tabs.
By default the field delimiter is the non-blank to blank transition, so tab should work just fine.
However, the columns are indexed base 1, not base 0, so you probably want to sort file.txt by column 4 numerically in reverse order. (Though the data in the question actually has 5 fields when split on blanks, so the last field would be index 5.)
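As a rough sketch of the default-separator behaviour, assuming GNU sort and awk and using made-up rows in the question's layout (the variable names are illustrative only), counting the fields confirms that the last column is field 5 when no -t is given:

```bash
# Hypothetical rows: name<TAB>three space-separated numbers<TAB>count
rows=$(printf 'foo\t1.00 1.33 2.00\t3\nbar\t0.50 0.75 1.00\t10\nbaz\t2.00 2.50 3.00\t7')

# Without -t, both sort and awk split on runs of blanks (spaces or tabs),
# so each of these rows has 5 fields, not 3.
nfields=$(printf '%s\n' "$rows" | awk '{ print NF; exit }')
echo "fields per row: $nfields"

# Sort on field 5 only (-k5,5), numerically (n), in reverse (r).
by_last=$(printf '%s\n' "$rows" | LC_ALL=C sort -k5,5nr)
printf '%s\n' "$by_last"
```

Writing the key as -k5,5 rather than -k5 keeps the sort key from running on to the end of the line, which matters as soon as more columns follow.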
I was having this problem with sort in cygwin in a bash shell when using 'general-numeric-sort'. If I specified -t$'\t' -kFg, where F is the field number, it didn't work, but when I specified both -t$'\t' and -kF,Fg (e.g. -k7,7g for the 7th field) it did work. -kF,Fg without the -t$'\t' did not work.
Using bash, this will do the trick:
Notice the dollar sign in front of the single-quoted string. You can read about it in the ANSI-C Quoting section of the bash man page.
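A minimal sketch of the ANSI-C quoting approach, assuming bash and GNU sort, with made-up rows in the question's layout (the variable names are illustrative only). Here the tab is the only separator, so the last column is field 3:

```bash
#!/usr/bin/env bash
# Hypothetical rows: name<TAB>three space-separated numbers<TAB>count
sample=$(printf 'foo\t1.00 1.33 2.00\t3\nbar\t0.50 0.75 1.00\t10\nbaz\t2.00 2.50 3.00\t7')

# bash expands $'\t' to one literal tab character before sort sees it,
# so -t receives a single-character delimiter instead of the two characters \ and t.
# -k3,3nr limits the key to field 3 and sorts it numerically, largest first.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -t$'\t' -k3,3nr)
printf '%s\n' "$sorted"
```

Plain "\t" in double quotes is passed to sort unchanged as a backslash followed by t, which is why the earlier attempts failed with the multi-character tab error.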