How to filter in NaN (pandas)?

Posted 2019-01-17 20:21

Question:

I have a pandas dataframe (df), and I want to do something like:

newdf = df[(df.var1 == 'a') & (df.var2 == NaN)]

I've tried replacing NaN with np.NaN, or 'NaN' or 'nan' etc, but nothing evaluates to True. There's no pd.NaN.

I can use df.fillna(np.nan) before evaluating the above expression but that feels hackish and I wonder if it will interfere with other pandas operations that rely on being able to identify pandas-format NaN's later.

I get the feeling there should be an easy answer to this question, but somehow it has eluded me. Any advice is appreciated. Thank you.

Answer 1:

This doesn't work because NaN isn't equal to anything, including NaN. Use pd.isnull(df.var2) instead.
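To make the point concrete, here is a minimal sketch. The sample data is made up for illustration; only the column names var1 and var2 come from the question. It shows the == comparison matching nothing, and pd.isnull producing the mask the question was after.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; var1/var2 mirror the question's column names.
df = pd.DataFrame({"var1": ["a", "a", "b"], "var2": [1.0, np.nan, np.nan]})

# NaN never compares equal to anything, itself included.
nan_equals_itself = np.nan == np.nan

# So an == comparison against NaN is False on every row.
eq_mask = df.var2 == np.nan

# pd.isnull flags the missing entries instead.
null_mask = pd.isnull(df.var2)

# The filter from the question, rewritten with the null mask.
newdf = df[(df.var1 == "a") & null_mask]
print(newdf)
```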



Answer 2:

Simplest of all solutions:

filtered_df = df[df['var2'].isnull()]

This keeps only the rows where the 'var2' column is NaN.
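A short sketch of the same method combined with the var1 condition from the question, plus the complementary notnull filter. The DataFrame contents are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"var1": ["a", "b", "a"], "var2": [np.nan, np.nan, 3.0]})

# Rows where var2 is missing.
filtered_df = df[df["var2"].isnull()]

# Combine with another condition using &; each condition is already a boolean Series.
both = df[(df["var1"] == "a") & df["var2"].isnull()]

# The complement: rows where var2 is present.
present = df[df["var2"].notnull()]
```

Series.isna and Series.notna are aliases for the same two methods in pandas 0.21 and later.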



Answer 3:

Pandas uses NumPy's NaN value. Use numpy.isnan to obtain a Boolean vector from a numeric pandas Series.
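A rough sketch of this approach, with one caveat worth knowing: numpy.isnan is defined for numeric data, so on an object-dtype column (strings mixed with NaN, say) it raises TypeError, and the pandas isnull method is the safer choice there. The Series below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Numeric Series: np.isnan returns a boolean Series usable as a mask.
numeric = pd.Series([1.0, np.nan, 3.0])
mask = np.isnan(numeric)

# Object-dtype Series (hypothetical mixed column): np.isnan has no loop for it.
mixed = pd.Series(["x", np.nan], dtype=object)
try:
    np.isnan(mixed)
    isnan_ok = True
except TypeError:
    isnan_ok = False

# isnull handles both numeric and object columns.
fallback = mixed.isnull()
```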