In Python, which data structure is more efficient/speedy? Assuming that order is not important to me and I would be checking for duplicates anyway, is a Python set slower than a Python list?
相关问题
- how to define constructor for Python's new Nam
- streaming md5sum of contents of a large remote tar
- How to get the background from multiple images by
- Evil ctypes hack in python
- Correctly parse PDF paragraphs with Python
I would recommend a Set implementation where the use case is limit to referencing or search for existence and Tuple implementation where the use case requires you to perform iteration. A list is a low-level implementation and requires significant memory overhead.
List performance:
Set performance:
You may want to consider Tuples as they're similar to lists but can’t be modified. They take up slightly less memory and are faster to access. They aren’t as flexible but are more efficient than lists. Their normal use is to serve as dictionary keys.
Sets are also sequence structures but with two differences from lists and tuples. Although sets do have an order, that order is arbitrary and not under the programmer’s control. The second difference is that the elements in a set must be unique.
set
by definition. [python | wiki].It depends on what you are intending to do with it.
Sets are significantly faster when it comes to determining if an object is present in the set (as in
x in s
), but are slower than lists when it comes to iterating over their contents.You can use the timeit module to see which is faster for your situation.
Lists are slightly faster than sets when you just want to iterate over the values.
Sets, however, are significantly faster than lists if you want to check if an item is contained within it. They can only contain unique items though.
It turns out tuples perform in almost exactly the same way as lists, except for their immutability.
Iterating
Determine if an object is present
Set
wins due to near instant 'contains' checks: https://en.wikipedia.org/wiki/Hash_tableList implementation: usually an array, low level close to the metal, good for iteration and random access by element index.
Set implementation: https://en.wikipedia.org/wiki/Hash_table, it does not iterate on a list, but finds the element by computing a hash from the key, so it depends on the nature of the key elements and the hash function. Similar to what is used for dict. I suspect
list
could be faster if you have very few elements (< 5), the larger element count the better theset
will perform for a contains check. It is also fast for element addition and removal.NOTE: If the
list
is already sorted, searching thelist
could be quite fast, but for usual cases aset
is faster and simpler for contains checks.