While profiling my Python application, I discovered that len() seems to be a very expensive operation when used on sets. See the code below:
import cProfile

def lenA(s):
    for i in range(1000000):
        len(s)

def lenB(s):
    for i in range(1000000):
        s.__len__()

def main():
    s = set()
    lenA(s)
    lenB(s)

if __name__ == "__main__":
    cProfile.run("main()", "stats")
According to the profiler's stats below, lenA() seems to be 14 times slower than lenB():
 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    1.986    1.986    3.830    3.830 .../lentest.py:5(lenA)
1000000    1.845    0.000    1.845    0.000 {built-in method len}
      1    0.273    0.273    0.273    0.273 .../lentest.py:9(lenB)
Am I missing something? Currently I use __len__() instead of len(), but the code looks dirty :(
This was going to be a comment, but after larsman's comment about his conflicting results and the results I got myself, I think it is worth adding my data to the thread.
Trying more or less the same setup, I got the opposite of what the OP got, in line with what larsman reported:
The test:
This is ActivePython 2.6.7, 64-bit, on Windows 7.
This is an interesting observation about the profiler, which has nothing to do with the actual performance of the len function. In the profiler stats, there are two lines concerning lenA (the row for lenA itself and the row for {built-in method len}), while there is only one line concerning lenB.
The profiler has timed every single call from lenA to len, but timed lenB as a whole. Timing a call always adds some overhead; in the case of lenA you see this overhead multiplied a million times.
Obviously, len has some overhead, since it does a function call and translates AttributeError to TypeError. Also, set.__len__ is such a simple operation that it's bound to be very fast in comparison to just about anything, but I still don't find anything like the 14x difference when using timeit.
You should always just call len, not __len__. If the call to len is the bottleneck in your program, you should rethink its design, e.g. cache sizes somewhere or calculate them without calling len.
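For anyone who wants to repeat the timeit comparison mentioned above, here is a minimal sketch. It assumes Python 3.6 or newer (the thread itself dates from the Python 2 era), and the helper name time_len_variants and its default call count are my own illustrative choices, not something from the thread. Timings depend heavily on the machine and interpreter version, so no expected numbers are shown.

```python
import timeit


def time_len_variants(s, number=1_000_000):
    """Time len(s) and s.__len__() with no profiler attached.

    Returns a dict mapping each statement to the total seconds taken
    for `number` executions. The helper name and the default `number`
    are hypothetical, chosen only for this illustration.
    """
    # timeit evaluates the statement strings in this namespace, so the
    # container under test is visible to them as `s`.
    namespace = {"s": s}
    return {
        "len(s)": timeit.timeit("len(s)", globals=namespace, number=number),
        "s.__len__()": timeit.timeit(
            "s.__len__()", globals=namespace, number=number
        ),
    }


if __name__ == "__main__":
    for statement, seconds in time_len_variants(set()).items():
        print(f"{statement:12s} {seconds:.3f} s")
```

Because timeit does not instrument each individual call the way the profiler does, any difference it reports reflects the two call paths themselves rather than per-call measurement overhead.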