How to use sets in Python to find list membership?

2019-06-03 08:37发布

Given:

A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'], ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

I want to know if ['Yes', X, 'No'] exist within A, where X is anything I don't care.

I attempted:

valid = False
for n in A:
    if n[0] == 'Yes' and n[2] == 'No':
        valid = True

I know set() is useful in this type of situations. But how can this be done? Is this possible? Or is it better for me to stick with my original code?

标签: python set
4条回答
看我几分像从前
2楼-- · 2019-06-03 08:42

if you want check for existance you can just ['Yes', 'No'] in A:

In [1]: A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]

In [2]: ['Yes', 'No'] in A
Out[2]: True

for the next case try:

In [3]: A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'], ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

In [4]: any(i[0]=='Yes' and i[2] == 'No' for i in A)
Out[4]: True

or you can possibly define a little func:

In [5]: def want_to_know(l,item):
   ...:     for i in l:
   ...:         if i[0] == item[0] and i[2] == item[2]:
   ...:             return True
   ...:     return False

In [6]: want_to_know(A,['Yes', 'xxx', 'No'])
Out[6]: True

any(i[0]=='Yes' and i[2] == 'No' for i in A*10000) actually seems to be the 10 times faster than than the conversion itself.

In [8]: %timeit any({(x[0],x[-1]) == ('Yes','No') for x in A*10000})
100 loops, best of 3: 14 ms per loop

In [9]: % timeit {tuple([x[0],x[-1]]) for x in A*10000}
10 loops, best of 3: 33.4 ms per loop

In [10]: %timeit any(i[0]=='Yes' and i[2] == 'No' for i in A*10000)
1000 loops, best of 3: 334 us per loop
查看更多
Ridiculous、
3楼-- · 2019-06-03 08:42

Following is how to do it using Set().

>>> A = Set([('Yes', 'No'), ('Yes', 'Idontknow'), ('No', 'Yes'), ('No', 'Idontknow')])
>>> ('Yes','No') in A
True
>>> 

The elements of Set should be hashable.. so I have used tuples as Set elements and not lists.

查看更多
Viruses.
4楼-- · 2019-06-03 08:56

Set doesn't support list, you can convert it into tuple,

A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]
valid = ('Yes', 'No') in {tuple(item) for item in A}

and as @IgnacioVazquez-Abrams mentioned, the conversion from list to tuple is O(n), so if you are aware of performance, you need to choose other methods.

查看更多
Anthone
5楼-- · 2019-06-03 08:58

Convert your list to set first, because it will improve the look up time from O(n) to O(1):

In [27]: A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]

In [28]: s=set(tuple(map(tuple,A)))

In [29]: s
Out[29]: set([('Yes', 'No'), ('No', 'Idontknow'), ('Yes', 'Idontknow'), ('No', 'Yes')])

In [30]: ('Yes', 'No') in s
Out[30]: True

timeit comparisions:

%timeit ['Yes', 'No'] in A
1000000 loops, best of 3: 504 ns per loop  

%timeit ('Yes', 'No') in s
1000000 loops, best of 3: 442 ns per loop       #winner

%timeit ['No', 'Idontknow'] in A
1000000 loops, best of 3: 861 ns per loop

%timeit ('No', 'Idontknow') in s
1000000 loops, best of 3: 461 ns per loop       #winner

Edit:

If you're only interested in first and last element:

In [69]: A = [['Yes', 'No'], ['Yes', 'Idontknow','hmmm'], ['No', 'Yes'], ['No', 'Idontknow']]

In [70]: s={tuple([x[0],x[-1]]) for x in A} # -1 or 2, change as per your requirement
                                         #or set(tuple([x[0],x[-1]]) for x in A)


In [71]: s
Out[71]: set([('Yes', 'No'), ('Yes', 'hmmm'), ('No', 'Idontknow'), ('No', 'Yes')])

In [73]: ('Yes', 'hmmm') in s
Out[73]: True

timeit comparison with any() :

In [77]: %timeit ('Yes', 'hmmm') in s
1000000 loops, best of 3: 428 ns per loop      #winner

In [78]: %timeit any(x[0]=="Yes" and x[-1]=="hmmm" for x in A)
100000 loops, best of 3: 2.87 us per loop
查看更多
登录 后发表回答