Search Pandas Column for Substring in other Column

Posted 2019-05-03 11:31

I have an example .csv, imported as df, as follows:

    Ethnicity, Description
  0 French, Irish Dance Company
  1 Italian, Moroccan/Algerian
  2 Danish, Company in Netherlands
  3 Dutch, French
  4 English, EnglishFrench
  5 Irish, Irish-American
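
For anyone who wants to reproduce this, here is a minimal sketch that builds the same frame directly instead of reading the .csv. The values are copied from the table above; the name df is assumed to match the code used further down.

```python
import pandas as pd

# Sample data copied from the table above (built inline rather than read from the .csv).
df = pd.DataFrame({
    "Ethnicity": ["French", "Italian", "Danish", "Dutch", "English", "Irish"],
    "Description": [
        "Irish Dance Company",
        "Moroccan/Algerian",
        "Company in Netherlands",
        "French",
        "EnglishFrench",
        "Irish-American",
    ],
})
print(df)
```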

I'd like to check df['Description'] for any of the strings in df['Ethnicity']. This should return rows 0, 3, 4, and 5, because those descriptions contain a string from the Ethnicity column.

So far I've tried:

df[df['Ethnicity'].str.contains('French')]['Description']

This works for one specific string, but I'd like to check all of them without searching for each ethnicity value by hand. I've also tried converting the columns to lists and iterating through them, but I can't find a way to return the matching rows, as the data is no longer a DataFrame.

Thank you in advance!

2 Answers
小情绪 Triste *
#2 · 2019-05-03 12:05

The ever-popular double apply:

df[df.Description.apply(lambda x: df.Ethnicity.apply(lambda y: y in x)).any(axis=1)]

  Ethnicity          Description
0    French  Irish Dance Company
3     Dutch               French
4   English        EnglishFrench
5     Irish       Irish-American

Timing

jezrael's answer is far superior
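
If you want to check the timing yourself, something like the sketch below should do. The helper names double_apply and regex_contains are made up for illustration, and the numbers will depend on your machine and on the size of the frame, so no results are quoted here.

```python
import timeit

import pandas as pd

df = pd.DataFrame({
    "Ethnicity": ["French", "Italian", "Danish", "Dutch", "English", "Irish"],
    "Description": ["Irish Dance Company", "Moroccan/Algerian", "Company in Netherlands",
                    "French", "EnglishFrench", "Irish-American"],
})

def double_apply(frame):
    # For each description, test every ethnicity with plain substring matching.
    hits = frame.Description.apply(
        lambda desc: frame.Ethnicity.apply(lambda eth: eth in desc)
    )
    return frame[hits.any(axis=1)]

def regex_contains(frame):
    # Join all ethnicities into one alternation pattern and match once per row.
    pattern = "|".join(frame.Ethnicity)
    return frame[frame.Description.str.contains(pattern)]

# Both approaches should select the same rows before we compare speed.
assert double_apply(df).equals(regex_contains(df))

for func in (double_apply, regex_contains):
    seconds = timeit.timeit(lambda: func(df), number=100)
    print(f"{func.__name__}: {seconds:.4f} s for 100 runs")
```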

狗以群分
#3 · 2019-05-03 12:09

You can use str.contains with the values of the Ethnicity column converted with tolist and then joined by |, which means "or" in a regex:

print ('|'.join(df.Ethnicity.tolist()))
French|Italian|Danish|Dutch|English|Irish

mask = df.Description.str.contains('|'.join(df.Ethnicity.tolist()))
print (mask)
0     True
1    False
2    False
3     True
4     True
5     True
Name: Description, dtype: bool

#boolean-indexing
print (df[mask])
  Ethnicity          Description
0    French  Irish Dance Company
3     Dutch               French
4   English        EnglishFrench
5     Irish       Irish-American

It looks like you can omit tolist():

print (df[df.Description.str.contains('|'.join(df.Ethnicity))])
  Ethnicity          Description
0    French  Irish Dance Company
3     Dutch               French
4   English        EnglishFrench
5     Irish       Irish-American
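
One caveat: the joined string is treated as a regular expression, so values containing characters such as ( or + would change the pattern. A possible safeguard is to pass each value through re.escape before joining. The sketch below uses a made-up frame (df2, with an invented "Basque (Spain)" value) purely to illustrate this; it is not part of the original data.

```python
import re

import pandas as pd

# Hypothetical data: "Basque (Spain)" contains regex metacharacters.
df2 = pd.DataFrame({
    "Ethnicity": ["French", "Irish", "Basque (Spain)"],
    "Description": ["Irish Dance Company", "Moroccan/Algerian", "Basque (Spain) folk group"],
})

# Escape every value so it is matched literally, then join with the regex "or".
pattern = "|".join(re.escape(value) for value in df2.Ethnicity)

mask = df2.Description.str.contains(pattern)
print(df2[mask])
```

If the columns can contain missing values, passing na=False to str.contains keeps the mask strictly boolean.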