Consider the following list comprehension:
[ (x,f(x)) for x in iterable if f(x) ]
This filters the iterable based on a condition f and returns the pairs x,f(x). The problem with this approach is that f(x) is calculated twice.
It would be great if we could write something like
[ (x,fx) for x in iterable if fx where fx = f(x) ]
or
[ (x,fx) for x in iterable if fx with f(x) as fx ]
But in Python we have to use nested comprehensions to avoid the duplicate call to f(x), which makes the comprehension less clear:
[ (x,fx) for x,fx in ( (y,f(y)) for y in iterable ) if fx ]
Is there any other way to make it more pythonic and readable?
Update
Coming soon in Python 3.8 (PEP 572, assignment expressions):
# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
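Applied to the comprehension from the question, the assignment expression might look like the sketch below. The function `f` and the input `range(6)` are placeholders chosen for illustration, and the code needs Python 3.8 or newer.

```python
def f(x):
    # Placeholder function (hypothetical): falsy for multiples of 3.
    return x % 3

iterable = range(6)

# fx is bound inside the filter clause and reused in the output expression,
# so f runs only once per element.
pairs = [(x, fx) for x in iterable if (fx := f(x))]
```

The parentheses around `fx := f(x)` are required here; without them the expression is a syntax error.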
You seek to have let-statement semantics in Python list comprehensions, whose scope is available to both the ___ for..in (map) and the if ___ (filter) part of the comprehension, and whose scope depends on the ..for ___ in... part.
Your solution, modified:
Your (as you admit, unreadable) solution of
[ (x,fx) for x,fx in ( (y,f(y)) for y in iterable ) if fx ]
is the most straightforward way to write the optimization.
Main idea: lift x into the tuple (x,f(x)).
Some would argue the most "pythonic" way to do things would be the original
[(x,f(x)) for x in iterable if f(x)]
and accept the inefficiencies.
You can, however, factor out the
((y,f(y)) for y in iterable)
into a function, if you plan to do this a lot. This is bad because if you ever wish to have access to more variables than x,fx (e.g. x,fx,ffx), then you will need to rewrite all your list comprehensions. Therefore this isn't a great solution unless you know for sure you only need x,fx and plan to reuse this pattern.
Generator expression:
Main idea: use a more complicated alternative to generator expressions: a generator function, where Python will let you write multiple lines.
You could just use a generator function, which Python plays nicely with.
This is how I would personally do it.
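A minimal sketch of such a multi-line generator follows. The name `xfx` and the body of `f` are assumptions made for illustration.

```python
def f(x):
    # Placeholder function (hypothetical): falsy for multiples of 3.
    return x % 3

def xfx(iterable):
    """Yield (x, f(x)) for every x whose f(x) is truthy, calling f once per x."""
    for x in iterable:
        fx = f(x)  # computed once, then used for both the test and the output
        if fx:
            yield (x, fx)

pairs = list(xfx(range(6)))
```

Because `xfx` is lazy, it can also be passed straight to `dict`, `sum` or a `for` loop without building a list first.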
Memoization/caching:
Main idea: You could also use (abuse?) side-effects and make f have a global memoization cache, so you don't repeat operations.
This can have a bit of overhead, and requires a policy of how large the cache should be and when it should be garbage-collected. Thus this should only be used if you'd have other uses for memoizing f, or if f is very expensive. But it would let you write...
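(A minimal sketch of that, assuming the standard library's `functools.lru_cache` as the memoizing decorator; `f` and `iterable` are placeholders, not taken from the original post.)

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache; choose a size limit for real use
def f(x):
    # Placeholder for an expensive function (hypothetical).
    return x % 3

iterable = range(6)

# f(x) is written twice, but the second call for each x is served from the cache.
result = [(x, f(x)) for x in iterable if f(x)]
```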
...like you originally wanted without the performance hit of doing the expensive operations in f twice, even if you technically call it twice. You can add a @memoized decorator to f: example (without maximum cache size). This will work as long as x is hashable (e.g. a number, a tuple, a frozenset, etc.).
Dummy values:
Main idea: capture fx=f(x) in a closure and modify the behavior of the list comprehension.
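One way this idea could be sketched, assuming a placeholder `f` and using a lambda default argument to hold the value of f(x); this is an illustration, not necessarily the exact code the answer had in mind:

```python
def f(x):
    # Placeholder function (hypothetical): None for odd x, a number otherwise.
    return x * 10 if x % 2 == 0 else None

def filterTrue(iterable):
    # Drop falsy items, which removes the None dummy values produced below.
    return filter(None, iterable)

iterable = range(5)

# The default argument fx=f(x) is evaluated once, when the lambda is created.
# Calling the lambda immediately yields either the pair or a None dummy.
pairs = list(filterTrue(
    (lambda fx=f(x): (x, fx) if fx else None)() for x in iterable
))
```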
where filterTrue(iterable) is filter(None, iterable). You would have to modify this if the elements of your list (2-tuples here) were actually capable of being None.
Map and Zip?
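One possible reading of this suggestion is sketched below, under the assumption that `iterable` can be traversed twice (a list or range, not a one-shot iterator). The function `f` is a placeholder.

```python
def f(x):
    # Placeholder function (hypothetical): falsy for multiples of 3.
    return x % 3

iterable = [0, 1, 2, 3, 4, 5]

# map applies f once per element; zip pairs each x with its own f(x).
pairs = [(x, fx) for x, fx in zip(iterable, map(f, iterable)) if fx]
```

With a one-shot iterator this would misbehave, because `zip` and `map` would consume the same underlying stream.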
Nothing says you must use comprehensions. In fact most style guides I've seen request that you limit them to simple constructs, anyway.
You could use a generator expression, instead.
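As a rough sketch of that suggestion, the inner step can be given its own name as a generator expression, which keeps each line simple. The names `with_fx` and `f` are assumptions for illustration.

```python
def f(x):
    # Placeholder function (hypothetical): falsy for multiples of 3.
    return x % 3

iterable = range(6)

# Lazy generator expression: no call to f happens until it is consumed.
with_fx = ((x, f(x)) for x in iterable)

# Filter on the already-computed value, so f runs once per element.
pairs = [(x, fx) for x, fx in with_fx if fx]
```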
There is no where clause, but you can "emulate" it using a for clause.
Execution:
As you can see, the function is executed 5 times, not 10 or 9.
This for construction imitates a where clause.
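The emulation can be sketched as follows. The input `range(5)` is an assumption chosen to match the count of 5 calls mentioned above, and `f` is a placeholder that records each invocation.

```python
calls = []

def f(x):
    # Placeholder function (hypothetical) that records every invocation.
    calls.append(x)
    return 2 * x

# "for fx in [f(x)]" loops over a one-element list, which binds fx exactly
# once per x and makes it visible to both the filter and the output.
pairs = [(x, fx) for x in range(5) for fx in [f(x)] if fx]

print("f was called", len(calls), "times")
```

The extra one-element list costs a little per iteration, so this mainly pays off when `f` is more expensive than that overhead.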