What is the cost of len()
function for Python built-ins? (list/tuple/string/dictionary)
问题:
回答1:
It\'s O(1) (constant time, not depending of actual length of the element - very fast) on every type you\'ve mentioned, plus set
and others such as array.array
.
回答2:
Calling len() on those data types is O(1) in CPython, the most common implementation of the Python language. Here\'s a link to a table that provides the algorithmic complexity of many different functions in CPython:
TimeComplexity Python Wiki Page
回答3:
The below measurements provide evidence that len()
is O(1) for oft-used data structures.
A note regarding timeit
: When the -s
flag is used and two strings are passed to timeit
the first string is executed only once and is not timed.
List:
$ python -m timeit -s \"l = range(10);\" \"len(l)\"
10000000 loops, best of 3: 0.0677 usec per loop
$ python -m timeit -s \"l = range(1000000);\" \"len(l)\"
10000000 loops, best of 3: 0.0688 usec per loop
Tuple:
$ python -m timeit -s \"t = (1,)*10;\" \"len(t)\"
10000000 loops, best of 3: 0.0712 usec per loop
$ python -m timeit -s \"t = (1,)*1000000;\" \"len(t)\"
10000000 loops, best of 3: 0.0699 usec per loop
String:
$ python -m timeit -s \"s = \'1\'*10;\" \"len(s)\"
10000000 loops, best of 3: 0.0713 usec per loop
$ python -m timeit -s \"s = \'1\'*1000000;\" \"len(s)\"
10000000 loops, best of 3: 0.0686 usec per loop
Dictionary (dictionary-comprehension available in 2.7+):
$ python -mtimeit -s\"d = {i:j for i,j in enumerate(range(10))};\" \"len(d)\"
10000000 loops, best of 3: 0.0711 usec per loop
$ python -mtimeit -s\"d = {i:j for i,j in enumerate(range(1000000))};\" \"len(d)\"
10000000 loops, best of 3: 0.0727 usec per loop
Array:
$ python -mtimeit -s\"import array;a=array.array(\'i\',range(10));\" \"len(a)\"
10000000 loops, best of 3: 0.0682 usec per loop
$ python -mtimeit -s\"import array;a=array.array(\'i\',range(1000000));\" \"len(a)\"
10000000 loops, best of 3: 0.0753 usec per loop
Set (set-comprehension available in 2.7+):
$ python -mtimeit -s\"s = {i for i in range(10)};\" \"len(s)\"
10000000 loops, best of 3: 0.0754 usec per loop
$ python -mtimeit -s\"s = {i for i in range(1000000)};\" \"len(s)\"
10000000 loops, best of 3: 0.0713 usec per loop
Deque:
$ python -mtimeit -s\"from collections import deque;d=deque(range(10));\" \"len(d)\"
100000000 loops, best of 3: 0.0163 usec per loop
$ python -mtimeit -s\"from collections import deque;d=deque(range(1000000));\" \"len(d)\"
100000000 loops, best of 3: 0.0163 usec per loop
回答4:
All those objects keep track of their own length. The time to extract the length is small (O(1) in big-O notation) and mostly consists of [rough description, written in Python terms, not C terms]: look up \"len\" in a dictionary and dispatch it to the built_in len function which will look up the object\'s __len__
method and call that ... all it has to do is return self.length
回答5:
len is an O(1) because in your RAM, lists are stored as tables (series of contiguous addresses). To know when the table stops the computer needs two things : length and start point. That is why len() is a O(1), the computer stores the value, so it just needs to look it up.