I have always successfully sorted my arrays like this (when I did not want the standard lexicographic ordering):
var arr = […] // some numbers or so
arr.sort(function(a, b) {
return a > b;
});
Now, someone told me this was wrong, and that I would need to return a-b
instead. Is that true, and if yes why? I have tested my comparison function, and it works! Also, why would my solution be so common when it is wrong?
The
sort
function expects a function that expects two argumentsa
andb
, and returns:In order to sort numbers in ascending order
return a - b
will produce the correct return values; for example:On the other hand
return a > b
produces the following return values:In the above example, the sort function is told that 1 and 2 are same (and placing 1 before 2 or 2 before 1 does not matter). This will produce incorrect result, for example (in Chrome 49):
TL;DR
No, you have not. And didn't notice it. A quick counterexample:
Because your comparison function does return
false
(or0
, equivalently) even whenb
is larger thana
. But0
implies that the two elements are considered equal - and the sorting algorithm believes that.In-Depth Explanation
Comparison functions in JavaScript
The
Array::sort
method can take an optional, custom comparison function as its argument. That function takes two arguments (commonly referred to asa
andb
) which it should compare, and is supposed to return a number> 0
whena
is considered larger thanb
and should be sorted after it== 0
whena
is considered equal tob
and it doesn't matter which comes first< 0
whena
is considered smaller thanb
and should be sorted before itIf it does not return a number, the result will be cast to a number (which is handy for booleans). The returned number does not need to be exactly
-1
or0
or1
(though it typically is).Consistent ordering
To be consistent, the compare function would need to fulfill the equation
If that requirement is broken, the sort will behave undefined.
Citing the ES5.1 spec on
sort
(same thing in the ES6 spec):A sorting algorithm needs to compare items of the array with each other. To do a good and efficient job, it must not need to compare each item to every other, but needs to be able to reason about their ordering. For that to work well, there are a few rules that a custom comparison function needs to abide. A trivial one is that an item
a
is equal to itself (compare(a, a) == 0
) - that's the first item in the list above (reflexivity). Yes, this is a bit mathematical, but pays of well.The most important one is transitivity. It says that when the algorithm has compared two values
a
andb
, and alsob
withc
, and has found out by applying the comparison function that e.g.a = b
andb < c
, then it can expect thata < c
holds as well. This seems only logical, and is required for a well-defined, consistent ordering.But your comparison function does fail this. Lets look at this example:
Ooops. And that is why a sorting algorithm can fail (in the spec, this is be "implementation-dependent behaviour" - i.e. unpredictable results) when it is invoked with a comparison function that is not consistent.
Because in many other languages, there are sorting algorithms that don't expect a three-way comparison but merely a boolean smaller-than operator. C++
std::sort
is a good example of that. It will simply be applied twice with swapped arguments if an equality needs to be determined. Admittedly, this can be more efficient and is less error-prone, but needs more calls to the comparison function if the operator cannot be inlined.CounterExamples
Only by sheer luck, if you tried some random example. Or because your test suite is flawed - incorrect and/or incomplete.
Here is the small script I used to find the above minimal counterexample:
What comparison function is correct?
Use no comparison function at all, when you want a lexicographic sort. Items in the array will be stringified if necessary.
A generic comparison function that works like the relational operators can be implemented as
With a few tricks, this can be minified to the equivalent
function(a,b){return +(a>b)||-(a<b)}
.For numbers, you can simply return their difference, which abides all the laws above:
If you want to sort in reverse, just take the appropriate one and swap
a
withb
.If you want to sort composite types (objects etc), replace each
a
and eachb
with an access of the properties in question, or a method call or whatever you want to sort by.