I am wondering why repr(int) is faster than str(int). With the following code snippet:
ROUNDS = 10000

def concat_strings_str():
    return ''.join(map(str, range(ROUNDS)))

def concat_strings_repr():
    return ''.join(map(repr, range(ROUNDS)))

%timeit concat_strings_str()
%timeit concat_strings_repr()
I get these timings (Python 3.5.2, but very similar results with 2.7.12):
1.9 ms ± 17.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.38 ms ± 9.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If I'm on the right path, the same function long_to_decimal_string is getting called under the hood.
Did I get something wrong or what else is going on that I am missing?
Update:
This probably has nothing to do with int's __repr__ or __str__ methods but with the differences between repr() and str(), as int.__str__ and int.__repr__ are in fact comparably fast:
def concat_strings_str():
    return ''.join([one.__str__() for one in range(ROUNDS)])

def concat_strings_repr():
    return ''.join([one.__repr__() for one in range(ROUNDS)])

%timeit concat_strings_str()
%timeit concat_strings_repr()
results in:
2.02 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.05 ms ± 7.07 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
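
For anyone who wants to reproduce both measurements outside IPython, here is a minimal sketch using the standard timeit module in place of %timeit. The names concat_strings_dunder_str, concat_strings_dunder_repr and best_ms are made up for this sketch (the snippets above reuse the same two function names), and it reports the best batch rather than the mean ± std. dev. that %timeit prints, so the numbers will not match the ones quoted above exactly.

```python
import timeit

ROUNDS = 10000


def concat_strings_str():
    # Goes through the str type: str(obj)
    return ''.join(map(str, range(ROUNDS)))


def concat_strings_repr():
    # Goes through the builtin function: repr(obj)
    return ''.join(map(repr, range(ROUNDS)))


def concat_strings_dunder_str():
    # Hypothetical name: calls int.__str__ directly, bypassing str()
    return ''.join([one.__str__() for one in range(ROUNDS)])


def concat_strings_dunder_repr():
    # Hypothetical name: calls int.__repr__ directly, bypassing repr()
    return ''.join([one.__repr__() for one in range(ROUNDS)])


def best_ms(func, number=20, repeat=3):
    """Hypothetical helper: best per-call time in milliseconds.

    Runs `repeat` batches of `number` calls each and keeps the fastest batch.
    """
    batches = timeit.repeat(func, number=number, repeat=repeat)
    return min(batches) / number * 1000.0


for func in (concat_strings_str, concat_strings_repr,
             concat_strings_dunder_str, concat_strings_dunder_repr):
    print('{:<28} {:.3f} ms'.format(func.__name__, best_ms(func)))
```

Absolute timings, and possibly the size of the gap, will depend on the interpreter version and the machine.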
There are several possibilities, because the CPython functions responsible for the str and repr return values are slightly different.

But I guess the primary reason is that str is a type (a class), and the str.__new__ method has to call __str__, while repr can go directly to __repr__.

Using str(obj) must first go through type.__call__, then str.__new__ (create a new string), then PyObject_Str (make a string out of the object), which invokes int.__str__ and, finally, uses the function you linked.

repr(obj), which corresponds to builtin_repr, directly calls PyObject_Repr (get the object repr), which then calls int.__repr__, which uses the same function as int.__str__.

Additionally, the path they take through call_function (the function that handles the CALL_FUNCTION opcode that's generated for calls) is slightly different.

From the master branch on GitHub (CPython 3.7):
str goes through _PyObject_FastCallKeywords (which is the one that calls type.__call__). Apart from performing more checks, this also needs to create a tuple to hold the positional arguments (see _PyStack_AsTuple).

repr goes through _PyCFunction_FastCallKeywords, which calls _PyMethodDef_RawFastCallKeywords. repr is also lucky because, since it only accepts a single argument (the switch leads it to the METH_O case in _PyMethodDef_RawFastCallKeywords), there's no need to create a tuple, just indexing of the args.

As your update states, this isn't about int.__repr__ vs int.__str__, they are the same function after all; it's all about how repr and str reach them. str just needs to work a bit harder.

I just compared the str and repr implementations in the 3.5 branch. See here.

There seem to be more checks in str: