可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
This is my dataframe:
date ids
0 2011-04-23 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
1 2011-04-24 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
2 2011-04-25 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
3 2011-04-26 Nan
4 2011-04-27 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
5 2011-04-28 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
I want to replace Nan
with []. How to do that? Fillna([]) did not work. I even tried replace(np.nan, [])
but it gives error:
TypeError('Invalid "to_replace" type: \'float\'',)
回答1:
You can first use loc
to locate all rows that have a nan
in the ids
column, and then loop through these rows using at
to set their values to an empty list:
for row in df.loc[df.ids.isnull(), 'ids'].index:
df.at[row, 'ids'] = []
>>> df
date ids
0 2011-04-23 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
1 2011-04-24 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
2 2011-04-25 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
3 2011-04-26 []
4 2011-04-27 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
5 2011-04-28 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
回答2:
My approach is similar to @hellpanderrr's, but instead tests for list-ness rather than using isnan
:
df['ids'] = df['ids'].apply(lambda d: d if isinstance(d, list) else [])
I originally tried using pd.isnull
(or pd.notnull
) but, when given a list, that returns the null-ness of each element.
回答3:
After a lot of head-scratching I found this method that should be the most efficient (no looping, no apply), just assigning to a slice:
isnull = df.ids.isnull()
df.loc[isnull, 'ids'] = [ [[]] * isnull.sum() ]
The trick was to construct your list of []
of the right size (isnull.sum()
), and then enclose it in a list: the value you are assigning is a 2D array (1 column, isnull.sum()
rows) containing empty lists as elements.
回答4:
Without assignments:
1) Assuming we have only floats and integers in our dataframe
import math
df.apply(lambda x:x.apply(lambda x:[] if math.isnan(x) else x))
2) For any dataframe
import math
def isnan(x):
if isinstance(x, (int, long, float, complex)) and math.isnan(x):
return True
df.apply(lambda x:x.apply(lambda x:[] if isnan(x) else x))
回答5:
Create a function that checks your condition, if not, it returns an empty list/empty set etc.
Then apply that function to the variable, but also assigning the new calculated variable to the old one or to a new variable if you wish.
aa=pd.DataFrame({'d':[1,1,2,3,3,np.NaN],'r':[3,5,5,5,5,'e']})
def check_condition(x):
if x>0:
return x
else:
return list()
aa['d]=aa.d.apply(lambda x:check_condition(x))
回答6:
list
is not supported in fillna
method, but you can use dict
instead.
df.fillna({})