In python, when changing a purely recursive function into a recursive generator (not a plain generator) the performance seems to be degrading.
For example, here is a performance comparison between two functions which find all combinations of a list:
from datetime import datetime as dt
def rec_subsets(ms, i=0, s=[]):
if i == len(ms):
# do something with s
return
rec_subsets(ms, i+1, s)
rec_subsets(ms, i+1, s + [ms[i]])
def gen_subsets(ms, i=0, s=[]):
if i == len(ms):
yield s
return
for a in gen_subsets(ms, i+1, s): yield a
for a in gen_subsets(ms, i+1, s + [ms[i]]): yield a
t1 = dt.now()
rec_subsets(range(20))
t2 = dt.now()
print t2 - t1
t1 = dt.now()
for _ in gen_subsets(range(20)): pass
t2 = dt.now()
print t2 - t1
with the following output:
0:00:01.027000 # rec_subsets
0:00:02.860000 # gen_subsets
One would naturally expect gen_subsets to be approximately as fast as rec_subsets but this is not the case, it is much slower.
Is this normal or am I missing something?
rec_subsets()
is still faster (forrange(20)
) even ifresult.append(s)
is added inplace of# do something with s
and the results of bothgen_subsets()
andrec_subsets()
are consumed.It might be explained by the following quote from PEP 380 (
yield from
syntax support):You could generate a powerset using
itertools.combinations()
:It is faster for
range(20)
on my machine:To reproduce the results, run
time-subsets.py
.