How to filter a DataFrame column of lists for thos

Posted 2020-07-09 01:56

Question:

If I want to filter a column of strings for those that contain a certain term I can do so like this:

import pandas as pd
df = pd.DataFrame({'col':['ab','ac','abc']})
df[df['col'].str.contains('b')]

returns:

   col
0   ab
2  abc

How can I filter a column of lists for those that contain a certain item? For example, from

df = pd.DataFrame({'col':[['a','b'],['a','c'],['a','b','c']]})

how can I get all lists containing 'b'?

         col
0     [a, b]
2  [a, b, c]

Answer 1:

You can use apply with a membership test on each list, like this.

In [13]: df[df['col'].apply(lambda x: 'b' in x)]
Out[13]: 
         col
0     [a, b]
2  [a, b, c]
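
For reference, here is the same idea as a self-contained script. This is only a sketch: the helper name contains_item is made up for illustration, and the isinstance check is an added assumption so that rows holding None or NaN instead of a list come out as False rather than raising an error.

```python
import pandas as pd

df = pd.DataFrame({'col': [['a', 'b'], ['a', 'c'], ['a', 'b', 'c']]})


def contains_item(value, item):
    """Hypothetical helper: True if value is a list-like that contains item."""
    # Guard against missing values (None / NaN), which are not containers.
    return isinstance(value, (list, tuple, set)) and item in value


# Boolean mask with one entry per row: True where the list contains 'b'.
mask = df['col'].apply(contains_item, item='b')

# Index the frame with the mask to keep only the matching rows.
result = df[mask]
print(result)
```

If every row is guaranteed to hold a list, the plain lambda above is enough and the helper is not needed.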

That said, storing lists in a DataFrame is generally a bit awkward. You might find a different representation (one column per list element, a MultiIndex, etc.) easier to work with.
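
As one possible alternative representation, the lists can be unpacked into a long format with one row per element. The sketch below assumes pandas 0.25 or later, where Series.explode is available; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'col': [['a', 'b'], ['a', 'c'], ['a', 'b', 'c']]})

# Long format: one row per list element, with the original index repeated.
long = df['col'].explode()

# For each original row, check whether any of its elements equals 'b'.
mask = long.eq('b').groupby(level=0).any()

# The mask is indexed like df, so it can filter the original frame directly.
filtered = df[mask]
print(filtered)
```

In the long format, ordinary vectorized comparisons such as eq or isin work on the elements directly, at the cost of a repeated index.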