pandas + dataframe - select by partial string

Posted 2019-01-01 07:42

I have a DataFrame with 4 columns of which 2 contain string values. I was wondering if there was a way to select rows based on a partial string match against a particular column?

In other words, a function or lambda function that would do something like

re.search(pattern, cell_in_question) 

returning a boolean. I am familiar with the syntax of df[df['A'] == "hello world"] but can't seem to find a way to do the same with a partial string match say 'hello'.

Would someone be able to point me in the right direction?

Tags: python pandas
8 answers
临风纵饮
#2 · 2019-01-01 08:35

I am using pandas 0.14.1 on macOS in an IPython notebook. I tried the proposed line above:

df[df['A'].str.contains("Hello|Britain")]

and got an error:

"cannot index with vector containing NA / NaN values"

but it worked perfectly when an "==True" condition was added, like this:

df[df['A'].str.contains("Hello|Britain")==True]
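That error usually means column 'A' has missing values: str.contains returns NaN for those rows, and a mask with NaN in it cannot be used for boolean indexing. Comparing with ==True turns the NaN entries into False. A minimal sketch of the same idea using the na=False argument of str.contains, on a small made-up DataFrame (the data and column 'B' are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column 'A' contains one missing value.
df = pd.DataFrame({
    "A": ["Hello world", "Great Britain", np.nan, "goodbye"],
    "B": [1, 2, 3, 4],
})

# Workaround from the answer: NaN == True evaluates to False.
mask_eq = df["A"].str.contains("Hello|Britain") == True

# Alternative: ask str.contains to treat missing values as "no match".
mask_na = df["A"].str.contains("Hello|Britain", na=False)

selected = df[mask_na]
print(selected)
```

Both masks should select the same rows; na=False just states the intent more directly than ==True.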
残风、尘缘若梦
#3 · 2019-01-01 08:40

Quick note: if you want to do selection based on a partial string contained in the index, try the following:

df['stridx']=df.index
df[df['stridx'].str.contains("Hello|Britain")]
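In pandas versions where the index itself exposes the .str accessor, the helper column may not be needed. A rough sketch, assuming a string index and made-up data (the 'value' column and the index labels are hypothetical):

```python
import pandas as pd

# Hypothetical sample data with a string index.
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=["Hello there", "Great Britain", "France"],
)

# Build the boolean mask directly from the index, without adding a column.
mask = df.index.str.contains("Hello|Britain")

selected = df[mask]
print(selected)
```

This leaves the DataFrame unchanged, whereas the 'stridx' approach adds an extra column to it.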