Mostly curious.
I've noticed (at least in py 2.6 and 2.7) that a float has all the familiar rich comparison functions: __lt__(), __gt__(), __eq__(), etc.
>>> (5.0).__gt__(4.5)
True
but an int does not
>>> (5).__gt__(4)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: 'int' object has no attribute '__gt__'
Which is odd to me, because the operator itself works fine
>>> 5 > 4
True
Even strings support the comparison functions
>>> "hat".__gt__("ace")
True
but all the int has is __cmp__()
Seems strange to me, and so I was wondering why this came to be.
Just tested and it works as expected in Python 3, so I am assuming there are some legacy reasons. Still would like to hear a proper explanation though ;)
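For anyone who wants to reproduce the observation, here is a minimal sketch that lists which comparison hooks a type exposes. The helper name comparison_methods is made up for illustration, and the result depends on the interpreter version, so no output is shown.

```python
# Names of the comparison hooks discussed in this thread.
COMPARISON_HOOKS = ["__lt__", "__le__", "__eq__", "__ne__", "__gt__", "__ge__", "__cmp__"]


def comparison_methods(tp):
    """Hypothetical helper: return the comparison hooks that type `tp` exposes."""
    return [name for name in COMPARISON_HOOKS if hasattr(tp, name)]


# Print the hooks for the three types mentioned in the question.
for tp in (int, float, str):
    print(tp.__name__, comparison_methods(tp))
```

Running this on both a 2.x and a 3.x interpreter should make the difference for int visible at a glance.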
If we look at PEP 207 (Rich Comparisons), there is this interesting sentence right at the end:
The inlining already present which deals with integer comparisons would still apply, resulting in no performance cost for the most common cases.
So it seems that in 2.x there is an optimisation for integer comparison. If we take a look at the interpreter source code (the COMPARE_OP case in Python/ceval.c) we can find this:
case COMPARE_OP:
    w = POP();
    v = TOP();
    if (PyInt_CheckExact(w) && PyInt_CheckExact(v)) {
        /* INLINE: cmp(int, int) */
        register long a, b;
        register int res;
        a = PyInt_AS_LONG(v);
        b = PyInt_AS_LONG(w);
        switch (oparg) {
        case PyCmp_LT: res = a <  b; break;
        case PyCmp_LE: res = a <= b; break;
        case PyCmp_EQ: res = a == b; break;
        case PyCmp_NE: res = a != b; break;
        case PyCmp_GT: res = a >  b; break;
        case PyCmp_GE: res = a >= b; break;
        case PyCmp_IS: res = v == w; break;
        case PyCmp_IS_NOT: res = v != w; break;
        default: goto slow_compare;
        }
        x = res ? Py_True : Py_False;
        Py_INCREF(x);
    }
    else {
      slow_compare:
        x = cmp_outcome(oparg, v, w);
    }
So it seems that in 2.x there was an existing performance optimisation - by allowing the C code to compare integers directly - which would not have been preserved if the rich comparison operators had been implemented.
Now in Python 3 __cmp__ is no longer supported, so the rich comparison operators must be there. As far as I can tell, this does not cause a performance hit. For example, compare:
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05)
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit("2 < 1")
0.06980299949645996
to:
Python 3.2.3 (v3.2.3:3d0686d90f55, Apr 10 2012, 11:25:50)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit("2 < 1")
0.06682920455932617
So it seems that similar optimisations are there in 3.x. My guess is that the judgement call was that adding the rich comparison methods to int in the 2.x branch would have been too great a change when backwards compatibility was a consideration.
In 2.x if you want something like the rich comparison methods you can get at them via the operator module:
>>> import operator
>>> operator.gt(2,1)
True
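A slightly longer sketch of the same idea: operator.gt(a, b) is the function form of a > b, so it goes through the normal comparison machinery and should work for int whether or not int exposes a __gt__ attribute. The sample values and the threshold name are just for illustration.

```python
import operator

# The same pairs that were compared earlier in this thread.
pairs = [(5, 4), (5.0, 4.5), ("hat", "ace")]

# operator.gt(a, b) is equivalent to the expression a > b.
results = [operator.gt(a, b) for a, b in pairs]
print(results)

# Being a plain function, it can be used wherever a callable is needed,
# for example to keep only the values above a threshold.
threshold = 2
above = [n for n in [1, 2, 3, 4] if operator.gt(n, threshold)]
print(above)
```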
__cmp__() is the old-fashioned way of doing comparisons, and is deprecated in favor of the rich operators (__lt__, __le__, etc.), which were only introduced in Python 2.1. Likely the transition was not complete as of 2.7.x, whereas in Python 3.x __cmp__ is completely removed.
Haskell has the most elegant implementation I've seen: to be an Ord (ordered) data type, you just need to define how <= (or compare) works, on top of == from Eq, and the typeclass itself supplies default implementations for <, > and >= in terms of those (which you're more than welcome to define yourself if you want). You can write such a class yourself in Python, not sure why that's not the default; probably performance reasons.
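For the "write such a class yourself in Python" part, the standard library (2.7 and 3.2 onwards) has functools.total_ordering, a class decorator that fills in the missing ordering methods from __eq__ plus one of __lt__, __le__, __gt__ or __ge__. A minimal sketch follows; the Version class and its fields are invented for the example.

```python
import functools


@functools.total_ordering
class Version(object):
    """Hypothetical example type ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    # Only == and < are written by hand ...
    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


# ... and total_ordering derives <=, > and >= from them.
print(Version(1, 2) <= Version(1, 3))
print(Version(2, 0) > Version(1, 9))
print(Version(1, 0) >= Version(1, 0))
```

The derived methods are a little slower than hand-written ones, which may be part of why this is opt-in rather than the default.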
As hircus said, the __cmp__ style comparisons are deprecated in favor of the rich operators (__lt__, …) in Python 3. Originally, comparisons were implemented using __cmp__, but there are some types/situations where a simple __cmp__ operator isn't enough (e.g. instances of a Color class could support == and !=, but not < or >), so the rich comparison operators were added, leaving __cmp__ in place for backwards compatibility. Following the Python philosophy of "There should be one-- and preferably only one --obvious way to do it," [1] the legacy support was removed in Python 3, when backwards compatibility could be sacrificed.
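To make the Color case concrete, here is a rough sketch of a type that supports == and != but deliberately has no ordering. The class and its r, g, b fields are made up for the example, and the TypeError on < assumes Python 3 (Python 2 would fall back to its default ordering instead of raising).

```python
class Color(object):
    """Hypothetical example: equality makes sense, ordering does not."""

    def __init__(self, r, g, b):
        self.rgb = (r, g, b)

    def __eq__(self, other):
        if not isinstance(other, Color):
            return NotImplemented
        return self.rgb == other.rgb

    def __ne__(self, other):
        # Written out explicitly so the class also behaves on Python 2.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # No __lt__/__gt__ on purpose: "is red less than blue?" has no answer.


red, blue = Color(255, 0, 0), Color(0, 0, 255)
print(red == Color(255, 0, 0))
print(red != blue)

try:
    red < blue
    ordering_rejected = False
except TypeError:
    # Python 3 refuses to order types that define no ordering methods.
    ordering_rejected = True
print(ordering_rejected)
```

A single __cmp__ method could not express this, since it has to return a negative, zero or positive number for every pair.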
In Python 2, int still uses __cmp__ so as not to break backwards compatibility. float, however, needs to use the newer rich comparison operators, because not all floating point numbers are less than, greater than, or equal to other floating point numbers: (float('nan') < 0.0, float('nan') == 0.0, float('nan') > 0.0) evaluates to (False, False, False), so what should float('nan').__cmp__(0.0) return?
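The NaN problem can be sketched by trying to rebuild a cmp()-style answer from the rich comparisons. The helper three_way is hypothetical; it returns None when none of <, > or == holds, which is exactly the case a -1/0/1 return value cannot represent.

```python
def three_way(a, b):
    """Hypothetical helper: derive a cmp()-style result from rich comparisons."""
    if a < b:
        return -1
    if a > b:
        return 1
    if a == b:
        return 0
    # None of the three relations holds, so there is no valid -1/0/1 answer.
    return None


nan = float("nan")
ordered = three_way(1.0, 2.0)
unordered = three_way(nan, 0.0)
print(ordered, unordered)
```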
1: Try typing "import this" into a Python shell.