Python - Pandas '.isin' on a list

Posted 2020-07-17 04:52

Question:

I'm using Python 2.7 on Mac OSX Lion and Pandas 0.11.0 with the IPython shell.

I have a quick question about the data selection method .isin.

I would like to use .isin with a list of items, like so:

data = df[df[header[0]].isin(list)]

I get the following error when I do this: KeyError: u'no item named '

I generate the initial list by calling a function I wrote earlier. I tried using eval on the list, which seems to solve a problem that comes up when using raw_input and iterating over the items in it. I'm still working through some of the issues I've had since moving to IPython and Python 2.7 (I originally used Python 3.3).

I also tried going through the list item by item, starting with:

data = df[df[header[0]].isin(list[0])]

But that also returns: KeyError: u'no item named '

UPDATE: Here is the header:

 Unnamed: 0         9752  non-null values
 zipcode            9752  non-null values
 xcoord             9752  non-null values
 ycoord             9752  non-null values
 age_age5064        9752  non-null values
 age_age6574        9752  non-null values
 age_age75plus      9752  non-null values
 sex_female         9752  non-null values
 sex_male           9752  non-null values
 stage_early        9752  non-null values
 stage_late         9752  non-null values
 death_death        9752  non-null values
 death_not_death    9752  non-null values
 access             9752  non-null values
 dtypes: float64(2), int64(12)

Also, I have a function I use to get the header, which makes things easier for me, the output looks like this:

['',
 'zipcode',
 'xcoord',
 'ycoord',
 'age_age5064',
 'age_age6574',
 'age_age75plus',
 'sex_female',
 'sex_male',
 'stage_early',
 'stage_late',
 'death_death',
 'death_not_death',
 'access']

Now that I think about it, that may actually be what is causing the problem, although eval still would not fix it.
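A minimal sketch of why the empty first entry matters, assuming the first column is really named 'Unnamed: 0' (as the summary above shows) while the custom header function returns '' for it. The tiny DataFrame and its values are made up for illustration; only the column names come from the post.

```python
import pandas as pd

# Hypothetical miniature of the frame; the values are invented.
df = pd.DataFrame(
    {"Unnamed: 0": [0, 1, 2], "zipcode": [10001, 10002, 10003]},
    columns=["Unnamed: 0", "zipcode"],
)

# Assumed output of the custom header function: the first name is empty.
header = ["", "zipcode"]

# Looking up a column called '' fails because no such column exists.
try:
    df[header[0]]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# The name pandas actually stores for the first column.
first_real_name = df.columns[0]
print(lookup_failed, first_real_name)
```

If that assumption holds, the KeyError comes from the column lookup df[header[0]] rather than from .isin itself.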

UPDATE 2:

As you can see in the .isin call above, I was initially using header[0], which was not right. I tried again with header[1], which is the appropriate column, and with list[0] I got the following error:

TypeError: 'int' object is not iterable

I also tried the regular list again and got this error:

TypeError: int() argument must be a string or a number, not 'list'

Which, I guess, points more directly at the issue.
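For what it's worth, the first TypeError is consistent with passing a single integer such as list[0] to .isin, which expects a list-like argument. The sketch below illustrates this with made-up zipcodes and a hypothetical variable named wanted. Recent pandas versions word the error message differently from 0.11, so only the exception type is checked here. The second TypeError is not reproduced.

```python
import pandas as pd

# Invented values standing in for the zipcode column.
zipcodes = pd.Series([10001, 10002, 10003], name="zipcode")
wanted = [10001, 10003]  # hypothetical list of values to keep

# A bare integer is not list-like, so .isin rejects it.
try:
    zipcodes.isin(wanted[0])
    scalar_rejected = False
except TypeError:
    scalar_rejected = True

# Passing the whole list works.
mask = zipcodes.isin(wanted)

# To test a single value, wrap it in a list first.
single_mask = zipcodes.isin([wanted[0]])
print(scalar_rejected, mask.tolist(), single_mask.tolist())
```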

Answer 1:

Try using df.columns as your header instead:

df[df[df.columns[1]].isin(list)]
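Here is a small runnable sketch of that approach. The data is invented, and wanted_zipcodes is a hypothetical name used in place of list so that the Python builtin is not shadowed. It assumes, as in the question, that the second column is zipcode.

```python
import pandas as pd

# Invented rows; only the column names follow the question.
df = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2, 3],
        "zipcode": [10001, 10002, 10003, 10001],
        "access": [1, 0, 1, 0],
    },
    columns=["Unnamed: 0", "zipcode", "access"],
)

wanted_zipcodes = [10001, 10003]  # hypothetical values to select

# df.columns[1] is the real name of the second column ("zipcode"),
# so the lookup cannot fail on a mistyped or empty header entry.
data = df[df[df.columns[1]].isin(wanted_zipcodes)]
print(data)
```

Because df.columns holds the names pandas actually uses, indexing with it sidesteps any mismatch between a hand-built header list and the real column names.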