Python: Deleting rows from dataframe for which val

Posted 2019-08-03 03:14

I have a CSV file (or DataFrame) like the one below:

Text    Location    State
A   Florida, USA    Florida
B   NY              New York
C       
D   abc 

And a dictionary of key-value pairs:

stat_map = {
        'FL': 'Florida',
        'NY': 'New York',
        'AR': 'Arkansas',
}

How can I delete the 3rd and 4th rows (the rows with Text C and D) so that my DataFrame contains only the rows whose State appears among the dictionary's values? Every row whose State is blank, or holds a value that is not one of the dictionary's values, should be deleted. The final output should look like this:

Text    Location    State
A   Florida, USA    Florida
B   NY              New York

Please help.

1 answer
够拽才男人
#2 · 2019-08-03 03:25

Use str.extract and replace, then remove the unmatched rows with dropna:

stat_map = {
        'FL': 'Florida',
        'NY': 'New York',
        'AR': 'Arkansas',
}

# build a list of all keys and values of the dict
L = list(stat_map.keys()) + list(stat_map.values())
print (L)
['NY', 'FL', 'AR', 'New York', 'Florida', 'Arkansas']


# wrap the chained call in parentheses so it can span two lines
df['State1'] = (df['Location'].str.extract('(' + '|'.join(L) + ')', expand=False)
                              .replace(stat_map))
df = df.dropna(subset=['State1'])
print (df)
  Text      Location     State    State1
0    A  Florida, USA   Florida   Florida
1    B            NY  New York  New York
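
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the blank cells in the question load as missing values (None/NaN), and the names by_state, by_location and pattern are only illustrative. It shows two variants: filtering directly on the existing State column with isin, which is what the question literally asks for, and the extract approach above applied to Location. The re.escape call and the \b word boundaries are additions of mine, not part of the answer above.

```python
import re

import pandas as pd

# Sample data rebuilt from the question; blank cells are assumed to be missing values.
df = pd.DataFrame({
    "Text": ["A", "B", "C", "D"],
    "Location": ["Florida, USA", "NY", None, "abc"],
    "State": ["Florida", "New York", None, None],
})

stat_map = {"FL": "Florida", "NY": "New York", "AR": "Arkansas"}

# Variant 1: keep rows whose State is one of the dictionary's values.
# Missing values never match, so rows C and D are dropped.
by_state = df[df["State"].isin(list(stat_map.values()))]

# Variant 2: derive the state from Location, as in the answer above.
# re.escape and \b are extra safeguards (an assumption, not in the answer):
# they stop a code such as "AR" from matching inside a longer word.
names = list(stat_map.keys()) + list(stat_map.values())
pattern = r"\b(" + "|".join(re.escape(n) for n in names) + r")\b"
state1 = df["Location"].str.extract(pattern, expand=False).replace(stat_map)
by_location = df.assign(State1=state1).dropna(subset=["State1"])

print(by_state)     # rows A and B remain
print(by_location)  # rows A and B remain, with a State1 column
```

If the State column is already reliable, the isin variant is probably the simpler choice; the extract variant is mainly useful when the state has to be recovered from free-text Location values.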