Counting the number of True Booleans in a Python L

2019-01-17 19:04发布

站内文章 / Python

49 0

在下西门庆

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I have a list of Booleans:

[True, True, False, False, False, True]

and I am looking for a way to count the number of True in the list (so in the example above, I want the return to be 3.) I have found examples of looking for the number of occurrences of specific elements, but is there a more efficient way to do it since I'm working with Booleans? I'm thinking of something analogous to all or any.

回答1:

True is equal to 1.

>>> sum([True, True, False, False, False, True])
3

回答2:

list has a count method:

>>> [True,True,False].count(True)
2

This is actually more efficient than sum, as well as being more explicit about the intent, so there's no reason to use sum:

In [1]: import random

In [2]: x = [random.choice([True, False]) for i in range(100)]

In [3]: %timeit x.count(True)
970 ns ± 41.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit sum(x)
1.72 µs ± 161 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

回答3:

If you are only concerned with the constant True, a simple sum is fine. However, keep in mind that in Python other values evaluate as True as well. A more robust solution would be to use the bool builtin:

>>> l = [1, 2, True, False]
>>> sum(bool(x) for x in l)
3

UPDATE: Here's another similarly robust solution that has the advantage of being more transparent:

>>> sum(1 for x in l if x)
3

P.S. Python trivia: True could be true without being 1. Warning: do not try this at work!

>>> True = 2
>>> if True: print('true')
... 
true
>>> l = [True, True, False, True]
>>> sum(l)
6
>>> sum(bool(x) for x in l)
3
>>> sum(1 for x in l if x)
3

Much more evil:

True = False

回答4:

You can use sum():

>>> sum([True, True, False, False, False, True])
3

回答5:

Just for completeness' sake (sum is usually preferable), I wanted to mention that we can also use filter to get the truthy values. In the usual case, filter accepts a function as the first argument, but if you pass it None, it will filter for all "truthy" values. This feature is somewhat surprising, but is well documented and works in both Python 2 and 3.

The difference between the versions, is that in Python 2 filter returns a list, so we can use len:

>>> bool_list = [True, True, False, False, False, True]
>>> filter(None, bool_list)
[True, True, True]
>>> len(filter(None, bool_list))
3

But in Python 3, filter returns an iterator, so we can't use len, and if we want to avoid using sum (for any reason) we need to resort to converting the iterator to a list (which makes this much less pretty):

>>> bool_list = [True, True, False, False, False, True]
>>> filter(None, bool_list)
<builtins.filter at 0x7f64feba5710>
>>> list(filter(None, bool_list))
[True, True, True]
>>> len(list(filter(None, bool_list)))
3

回答6:

It is safer to run through bool first. This is easily done:

>>> sum(map(bool,[True, True, False, False, False, True]))
3

Then you will catch everything that Python considers True or False into the appropriate bucket:

>>> allTrue=[True, not False, True+1,'0', ' ', 1, [0], {0:0}, set([0])]
>>> list(map(bool,allTrue))
[True, True, True, True, True, True, True, True, True]

If you prefer, you can use a comprehension:

>>> allFalse=['',[],{},False,0,set(),(), not True, True-1]
>>> [bool(i) for i in allFalse]
[False, False, False, False, False, False, False, False, False]

回答7:

I prefer len([b for b in boollist if b is True]) (or the generator-expression equivalent), as it's quite self-explanatory. Less 'magical' than the answer proposed by Ignacio Vazquez-Abrams.

Alternatively, you can do this, which still assumes that bool is convertable to int, but makes no assumptions about the value of True: ntrue = sum(boollist) / int(True)

回答8:

After reading all the answers and comments on this question, I thought to do a small experiment.

I generated 50,000 random booleans and called sum and count on them.

Here are my results:

>>> a = [bool(random.getrandbits(1)) for x in range(50000)]
>>> len(a)
50000
>>> a.count(False)
24884
>>> a.count(True)
25116
>>> def count_it(a):
...   curr = time.time()
...   counting = a.count(True)
...   print("Count it = " + str(time.time() - curr))
...   return counting
... 
>>> def sum_it(a):
...   curr = time.time()
...   counting = sum(a)
...   print("Sum it = " + str(time.time() - curr))
...   return counting
... 
>>> count_it(a)
Count it = 0.00121307373046875
25015
>>> sum_it(a)
Sum it = 0.004102230072021484
25015

Just to be sure, I repeated it several more times:

>>> count_it(a)
Count it = 0.0013530254364013672
25015
>>> count_it(a)
Count it = 0.0014507770538330078
25015
>>> count_it(a)
Count it = 0.0013344287872314453
25015
>>> sum_it(a)
Sum it = 0.003480195999145508
25015
>>> sum_it(a)
Sum it = 0.0035257339477539062
25015
>>> sum_it(a)
Sum it = 0.003350496292114258
25015
>>> sum_it(a)
Sum it = 0.003744363784790039
25015

And as you can see, count is 3 times faster than sum. So I would suggest to use count as I did in count_it.

Python version: 3.6.7
CPU cores: 4
RAM size: 16 GB
OS: Ubuntu 18.04.1 LTS