Why list comprehension can be faster than map() in

2019-04-11 12:45发布

I am looking in to the performance issues of the loop like structures in Python and found the following statements:

Besides the syntactic benefit of list comprehensions, they are often as fast or faster than equivalent use of map. (Performance Tips)

List comprehensions run a bit faster than equivalent for-loops (unless you're just going to throw away the result). (Python Speed)

I am wondering what difference under the hood gives list comprehension this advantage. Thanks.

1条回答
爷、活的狠高调
2楼-- · 2019-04-11 12:52

Test one: throwing away the result.

Here's our dummy function:

def examplefunc(x):
    pass

And here are our challengers:

def listcomp_throwaway():
    [examplefunc(i) for i in range(100)]

def forloop_throwaway():
    for i in range(100):
        examplefunc(i)

I won't do an analysis of its raw speed, only why, per the OP's question. Lets take a look at the diffs of the machine code.

--- List comprehension
+++ For loop
@@ -1,15 +1,16 @@
- 55           0 BUILD_LIST               0
+ 59           0 SETUP_LOOP              30 (to 33)
               3 LOAD_GLOBAL              0 (range)
               6 LOAD_CONST               1 (100)
               9 CALL_FUNCTION            1
              12 GET_ITER            
-        >>   13 FOR_ITER                18 (to 34)
+        >>   13 FOR_ITER                16 (to 32)
              16 STORE_FAST               0 (i)
-             19 LOAD_GLOBAL              1 (examplefunc)
+
+ 60          19 LOAD_GLOBAL              1 (examplefunc)
              22 LOAD_FAST                0 (i)
              25 CALL_FUNCTION            1
-             28 LIST_APPEND              2
-             31 JUMP_ABSOLUTE           13
-        >>   34 POP_TOP             
-             35 LOAD_CONST               0 (None)
-             38 RETURN_VALUE        
+             28 POP_TOP             
+             29 JUMP_ABSOLUTE           13
+        >>   32 POP_BLOCK           
+        >>   33 LOAD_CONST               0 (None)
+             36 RETURN_VALUE     

The race is on. Listcomp's first move is to build an empty list, while for loop's is to setup a loop. Both of them then proceed to load global range(), the constant 100, and call the range function for a generator. Then they both get the current iterator and get the next item, and store it into the variable i. Then they load examplefunc and i and call examplefunc. Listcomp appends it to the list and starts the loop over again. For loop does the same in three instructions instead of two. Then they both load None and return it.

So who seems better in this analysis? Here, list comprehension does some redundant operations such as building the list and appending to it, if you don't care about the result. For loop is pretty efficient too.

If you time them, using a for loop is about one-third faster than a list comprehension. (In this test, examplefunc divided its argument by five and threw it away instead of doing nothing at all.)

Test two: Keeping the result like normal.

No dummy function this test. So here are our challengers:

def listcomp_normal():
    l = [i*5 for i in range(100)]


def forloop_normal():
    l = []
    for i in range(100):
        l.append(i*5)

The diff isn't any use to us today. It's just the two machine codes in two blocks.

List comp's machine code:

 55           0 BUILD_LIST               0
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (100)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)
             19 LOAD_FAST                0 (i)
             22 LOAD_CONST               2 (5)
             25 BINARY_MULTIPLY     
             26 LIST_APPEND              2
             29 JUMP_ABSOLUTE           13
        >>   32 STORE_FAST               1 (l)
             35 LOAD_CONST               0 (None)
             38 RETURN_VALUE        

For loop's machine code:

 59           0 BUILD_LIST               0
              3 STORE_FAST               0 (l)

 60           6 SETUP_LOOP              37 (to 46)
              9 LOAD_GLOBAL              0 (range)
             12 LOAD_CONST               1 (100)
             15 CALL_FUNCTION            1
             18 GET_ITER            
        >>   19 FOR_ITER                23 (to 45)
             22 STORE_FAST               1 (i)

 61          25 LOAD_FAST                0 (l)
             28 LOAD_ATTR                1 (append)
             31 LOAD_FAST                1 (i)
             34 LOAD_CONST               2 (5)
             37 BINARY_MULTIPLY     
             38 CALL_FUNCTION            1
             41 POP_TOP             
             42 JUMP_ABSOLUTE           19
        >>   45 POP_BLOCK           
        >>   46 LOAD_CONST               0 (None)
             49 RETURN_VALUE        

As you can probably already tell, the list comprehension has fewer instructions than for loop does.

List comprehension's checklist:

  1. Build an anonymous empty list.
  2. Load range.
  3. Load 100.
  4. Call range.
  5. Get the iterator.
  6. Get the next item on that iterator.
  7. Store that item onto i.
  8. Load i.
  9. Load the integer five.
  10. Multiply times five.
  11. Append the list.
  12. Repeat steps 6-10 until range is empty.
  13. Point l to the anonymous empty list.

For loop's checklist:

  1. Build an anonymous empty list.
  2. Point l to the anonymous empty list.
  3. Setup a loop.
  4. Load range.
  5. Load 100.
  6. Call range.
  7. Get the iterator.
  8. Get the next item on that iterator.
  9. Store that item onto i.
  10. Load the list l.
  11. Load the attribute append on that list.
  12. Load i.
  13. Load the integer five.
  14. Multiply times five.
  15. Call append.
  16. Go to the top.
  17. Go to absolute.

(Not including these steps: Load None, return it.)

The list comprehension doesn't have to do these things:

  • Load append of the list every time, since it's pre-bound as a local variable.
  • Load i twice per loop
  • Spend two instructions going to the top
  • Directly append to the list instead of calling a wrapper that appens the list

In conclusion, listcomp is a lot faster if you are going to use the values, but if you don't it's pretty slow.

Real speeds

Test one: for loop is faster by about one-third*

Test two: list comprehension is faster by about two-thirds*

*About -> second decimal place acurrate

查看更多
登录 后发表回答