A reasonably common operation is to filter one list
based on another list
. People quickly find that this:
[x for x in list_1 if x in list_2]
is slow for large inputs - it's O(n*m). Yuck. How do we speed this up? Use a set
to make filtering lookups O(1):
s = set(list_2)
[x for x in list_1 if x in s]
This gives nice overall O(n) behavior. I however often see even veteran coders fall into The Trap™:
[x for x in list_1 if x in set(list_2)]
Ack! This is again O(n*m) since python builds set(list_2)
every time, not just once.
I thought that was the end of the story - python can't optimize it away to only build the set
once. Just be aware of the pitfall. Gotta live with it. Hmm.
#python 3.3.2+
list_2 = list(range(20)) #small for demonstration purposes
s = set(list_2)
list_1 = list(range(100000))
def f():
return [x for x in list_1 if x in s]
def g():
return [x for x in list_1 if x in set(list_2)]
def h():
return [x for x in list_1 if x in {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19}]
%timeit f()
100 loops, best of 3: 7.31 ms per loop
%timeit g()
10 loops, best of 3: 77.4 ms per loop
%timeit h()
100 loops, best of 3: 6.66 ms per loop
Huh, python (3.3) can optimize away a set literal. It's even faster than f()
in this case, presumably because it gets to replace a LOAD_GLOBAL
with a LOAD_FAST
.
#python 2.7.5+
%timeit h()
10 loops, best of 3: 72.5 ms per loop
Python 2 notably doesn't do this optimization. I've tried investigating further what python3 is doing but unfortunately dis.dis
cannot probe the innards of comprehension expressions. Basically everything interesting turns into MAKE_FUNCTION
.
So now I'm wondering - why can python 3.x optimize away the set literal to only build once, but not set(list_2)
?
From What’s New In Python 3.2:
No one's mentioned this issue yet: how do you know
set([1,2,3])
and{1, 2, 3}
are the same thing?You can't shadow a literal; you can shadow
set
. So before you can consider hoisting, you need to know not just thatlist1
isn't being affected, you need to be sure thatset
is what you think it is. Sometimes you can do that, either under restrictive conditions at compile time or more conveniently at runtime, but it's definitely nontrivial.It's kind of funny: often when the suggestion of doing optimizations like this comes up, one pushback is that as nice as they are, it makes it harder to reason about what Python performance is going to be like, even algorithmically. Your question provides some evidence for this objection.
In order to optimize
set(list_2)
, the interpreter needs to prove thatlist_2
(and all of its elements) does not change between iterations. This is a hard problem in the general case, and it would not surprise me if the interpreter does not even attempt to tackle it.On the other hand a set literal cannot change its value between iterations, so the optimization is known to be safe.
The basic reason is that a literal really can't change, whereas if it's an expression like
set(list_2)
, it's possible that evaluating the target expression or the iterable of the comprehension could change the value ofset(list_2)
. For instance, if you haveIt is possible that
f
modifieslist_2
.Even for a simple
[x for x in blah ...]
expression, it's theoretically possible that the__iter__
method ofblah
could modifylist_2
.I would imagine there is some scope for optimizations, but the current behavior keeps things simpler. If you start adding optimizations for things like "it is only evaluated once if the target expression is a single bare name and the iterable is a builtin list or dict..." you make it much more complicated to figure out what will happen in any given situation.
Too long for a comment
This won't speak to the optimization details or v2 vs. v3 differences. But when I encounter this in some situations, I find making a context manager out of the data object is useful:
Using this I see:
and in some cases, it provides a nice stop-gap between creating the object before the comprehension vs. creating it within the comprehension, and allows custom tear-down code if you want it.
A more generic version can be made using
contextlib.contextmanager
. Here's a quick-and-dirty version of what I mean.Then one can do:
or just as easily