This question already has an answer here: Why is Python's 'len' function faster than the __len__ method? (2 answers)
While profiling my Python application, I discovered that len() seems to be very expensive when used on sets. See the code below:
import cProfile

def lenA(s):
    for i in range(1000000):
        len(s)

def lenB(s):
    for i in range(1000000):
        s.__len__()

def main():
    s = set()
    lenA(s)
    lenB(s)

if __name__ == "__main__":
    cProfile.run("main()", "stats")
According to the profiler's stats below, lenA() seems to be 14 times slower than lenB():
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.986 1.986 3.830 3.830 .../lentest.py:5(lenA)
1000000 1.845 0.000 1.845 0.000 {built-in method len}
1 0.273 0.273 0.273 0.273 .../lentest.py:9(lenB)
Am I missing something? Currently I use __len__() instead of len(), but the code looks dirty :(
Obviously, len has some overhead, since it does a function call and translates AttributeError to TypeError. Also, set.__len__ is such a simple operation that it's bound to be very fast in comparison to just about anything, but I still don't find anything like the 14x difference when using timeit:
In [1]: s = set()
In [2]: %timeit s.__len__()
1000000 loops, best of 3: 197 ns per loop
In [3]: %timeit len(s)
10000000 loops, best of 3: 130 ns per loop
You should always just call len, not __len__. If the call to len is the bottleneck in your program, you should rethink its design, e.g. cache sizes somewhere or calculate them without calling len.
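To illustrate the caching suggestion, here is a minimal sketch (the function names and the loop body are hypothetical, not from the question): when the container does not change inside a hot loop, calling len once and reusing the result avoids a million redundant calls.

```python
def slow(s):
    # calls len() on every iteration of the hot loop
    return sum(len(s) for _ in range(1000))

def fast(s):
    # caches the size once before the loop and reuses it
    n = len(s)
    return sum(n for _ in range(1000))

s = set(range(10))
print(slow(s), fast(s))  # identical results, but fast() calls len() only once
```

Both functions compute the same value; the only difference is how many times len is invoked.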
This is an interesting observation about the profiler, which has nothing to do with the actual performance of the len function. You see, in the profiler stats, there are two lines concerning lenA:
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.986 1.986 3.830 3.830 .../lentest.py:5(lenA)
1000000 1.845 0.000 1.845 0.000 {built-in method len}
...while there is only one line concerning lenB:
1 0.273 0.273 0.273 0.273 .../lentest.py:9(lenB)
The profiler has timed each individual call from lenA to len, but timed lenB as a whole. Timing a call always adds some overhead; in the case of lenA, you see this overhead multiplied a million times.
This was going to be a comment, but after larsman's remark about his controversial results and the numbers I got myself, I think it is worth adding my data to the thread.
Trying more or less the same setup, I got the opposite of what the OP got, and in the same direction larsman commented:
12.1964105975 <- __len__
6.22144670823 <- len()
C:\Python26\programas>
The test:
def lenA(s):
    for i in range(100):
        len(s)

def lenB(s):
    for i in range(100):
        s.__len__()

s = set()

if __name__ == "__main__":
    from timeit import timeit
    print timeit("lenB(s)", setup="from __main__ import lenB, s")
    print timeit("lenA(s)", setup="from __main__ import lenA, s")
This is ActivePython 2.6.7 64-bit on Windows 7.