I am currently working on a personal learning project in which I read in an XML database. I find myself writing functions that gather data, and I'm not sure what would be a fast way to return it.
Which is generally faster: `yield`s, or several `append()`s within the function and then `return`ing the ensuing `list`?
I would be happy to know in what situations `yield`s would be faster than `append()`s, or vice-versa.
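To make the comparison concrete, the two styles being asked about might look like this (hypothetical element-gathering functions; the actual XML-reading code is not shown here):

```python
def gather_with_append(data):
    # Build the whole list in memory, then return it.
    result = []
    for item in data:
        result.append(item * 2)
    return result

def gather_with_yield(data):
    # Produce one item at a time; the caller decides whether to listify.
    for item in data:
        yield item * 2

print(gather_with_append(range(3)))       # -> [0, 2, 4]
print(list(gather_with_yield(range(3))))  # -> [0, 2, 4]
```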
I recently asked myself a similar question, exploring ways of generating all permutations of a list (or tuple), either by appending to a list or via a generator, and found (for permutations of length 9, which take about a second or so to generate):

- The approach that appends to a list was the slowest, clearly behind `itertools.permutations`.
- Using a generator (i.e. `yield`) reduces this by approx. 20 %.
- The fastest option was `itertools.permutations` itself.

Take with a grain of salt! Timing and profiling was very useful:
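A sketch of the kind of timing harness described (an assumed reconstruction, not the original code; length 4 is used instead of 9 so it runs quickly):

```python
import itertools
import timeit

def perms_append(seq):
    # naive recursive approach: collect every permutation into a list
    if len(seq) <= 1:
        return [seq]
    out = []
    for i, x in enumerate(seq):
        for rest in perms_append(seq[:i] + seq[i + 1:]):
            out.append((x,) + rest)
    return out

def perms_yield(seq):
    # same algorithm, but produced lazily via yield
    if len(seq) <= 1:
        yield seq
        return
    for i, x in enumerate(seq):
        for rest in perms_yield(seq[:i] + seq[i + 1:]):
            yield (x,) + rest

data = tuple(range(4))
print("append:   ", timeit.timeit(lambda: perms_append(data), number=200))
print("yield:    ", timeit.timeit(lambda: list(perms_yield(data)), number=200))
print("itertools:", timeit.timeit(lambda: list(itertools.permutations(data)), number=200))
```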
First you must decide whether you need a generator at all. There is also the improved comprehension syntax, like the list comprehension `[elem for elem in something]`. Generators are recommended if you only use each value once, for some operation as you iterate. But if you need the list for many changes, and have to work with many elements at the same time, it must be a list. (In maybe 70 % of the cases where an average programmer uses a list, a generator would be better: it uses less memory; many people just don't see the alternative to a list. Unfortunately, these days many people don't care about good optimization and just make it work.)

If you use a generator expression to improve a list return, do the same with `yield`, guys. Either way, Python gives us multiple, more optimized methods for all of these operations.
`yield` is faster than building and `return`ing a list, and I'll prove it. Just check this, guys:
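A sketch of three such functions, with a simple timing loop (the names `appender`, `list_gen`, and `yielder` follow the wording below; the bodies are an assumed reconstruction):

```python
import timeit

data = list(range(1000))

def appender(data):
    # builds a second list by appending in an explicit for loop
    result = []
    append = result.append  # bind the method once, outside the loop
    for elem in data:
        append(elem)
    return result

def list_gen(data):
    # list comprehension: still builds a full copy, but skips the
    # per-iteration name binding and method lookup of the loop above
    return [elem for elem in data]

def yielder(data):
    # yields one element at a time; no second list is ever built
    for elem in data:
        yield elem

for fn in (appender, list_gen, yielder):
    print(fn.__name__, timeit.timeit(lambda: list(fn(data)), number=1000))
```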
Of course appending will be slower than the other approaches, because we create and then extend a list on every pass through the loop. The plain `for` loop itself is far from free: at every step it loads the next element and binds it to our loop variable, so that the object's value is referenced in memory. So we jump to each element, create a reference, and extend the list in the loop (binding the `append` method to a local name beforehand is a big speed optimization), and when we finally `return`, we are holding 2000 elements across the two lists.
`list_gen` is lighter memory-wise: we still return a second list, so, as above, we end up with two lists, the original data and its copy, 2000 elements in total. But here we avoid the step of binding each element to a loop variable, because the list comprehension skips that explicit reference and just writes the elements into the new list.
`yielder` uses the least memory of all, because we only ever hold the single value currently yielded from the data; we avoid one reference per element. The consumer uses just one element at a time for its operations, not the whole list, and the generator produces the next value on the next loop iteration instead of storing all 1000 elements just to hand them over in one go.
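That difference is easy to see with `sys.getsizeof` (a rough sketch; exact sizes vary by Python version):

```python
import sys

data = list(range(1000))

full_copy = [elem for elem in data]  # materializes all 1000 elements
lazy = (elem for elem in data)       # generator object, constant size

print(sys.getsizeof(full_copy))  # grows with the number of elements
print(sys.getsizeof(lazy))       # small, independent of len(data)

# consuming one value at a time:
print(next(lazy))  # -> 0
print(next(lazy))  # -> 1
```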
Sorry for digging up an old topic a little; I just stumbled across it from a Google search, and other beginner Python programmers can come across this nonsense the same way.
`yield` has the huge advantage of being lazy, and speed is usually not the best reason to use it. But if it works in your context, then there is no reason not to use it. At least in a very simple test, `yield` is faster than append.

There is an even faster alternative to TH4Ck's yielding(). It is list comprehension.
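A minimal benchmark in that spirit, assuming simple number-collecting functions (my own sketch, not the original measurement code):

```python
import timeit

SIZE = 10_000

def with_append():
    out = []
    for i in range(SIZE):
        out.append(i)
    return out

def with_yield():
    for i in range(SIZE):
        yield i

def with_listcomp():
    return [i for i in range(SIZE)]

# list() around each call so all three produce a comparable result
for fn in (with_append, with_yield, with_listcomp):
    print(fn.__name__, timeit.timeit(lambda: list(fn()), number=200))
```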
Of course it is rather silly to micro-benchmark these operations without knowing the structure of your code. Each of them is useful in a different situation. For example, a list comprehension is useful if you want to apply a simple operation that can be expressed as a single expression. `yield` has a significant advantage in letting you isolate the traversal code into a generator method. Which one is appropriate depends a lot on the usage.
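As a hypothetical illustration of that point (the function and data names here are made up):

```python
def squares(numbers):
    # a single simple expression: a list comprehension reads best
    return [n * n for n in numbers]

def walk_pairs(tree):
    # hypothetical nested dict; yield isolates the traversal so callers
    # can filter or stop early without knowing about the recursion
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from walk_pairs(value)
        else:
            yield key, value

print(squares([1, 2, 3]))                         # -> [1, 4, 9]
print(list(walk_pairs({"a": 1, "b": {"c": 2}})))  # -> [('a', 1), ('c', 2)]
```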