pandas.to_dict returns None mixed with nan

Posted 2020-08-17 18:15

Question:

I've stumbled upon a minor problem with pandas and its to_dict method. I have a table that I'm certain has the same set of columns in every row. Let's say it looks like this:

+----|----|----+
|COL1|COL2|COL3|
+----|----|----+
|VAL1|    |VAL3|
|    |VAL2|VAL3|
|VAL1|VAL2|    |
+----|----|----+

When I do df.to_dict(orient='records') I get:

[{
     "COL1":"VAL1"
     ,"COL2":nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":nan
}]

Notice the nan values in some columns and the None values in others. It is always the same columns: nan and None never seem to appear together in one column.

When I do json.loads(df.to_json(orient='records')) instead, I get only None and no nan values, which is the desired output.

Like this:

[{
     "COL1":"VAL1"
     ,"COL2":None
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":None
}]

I would appreciate an explanation of why this happens and whether it can be controlled in some way.

==EDIT==

According to the comments, it would be better to first replace those nan values with None, but those nan values do not appear to be np.nan:

>>> a = df.head().ix[0,60]
>>> a
nan
>>> type(a)
<class 'numpy.float64'>
>>> a is np.nan
False
>>> a == np.nan
False
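
A note on the checks above: both results are expected for any NaN, so they do not show that the value is something other than a regular NaN. NaN never compares equal to anything, including itself, and a numpy.float64 NaN is a different object from the np.nan constant. A minimal sketch follows, where `a` is a stand-in built by hand rather than the actual cell from the table:

```python
import numpy as np
import pandas as pd

# Stand-in for the value read from the DataFrame (an assumption for this sketch).
a = np.float64("nan")

print(a is np.nan)   # False: a separate object, not the np.nan constant itself
print(a == np.nan)   # False: NaN never compares equal to anything
print(a == a)        # False: not even to itself
print(np.isnan(a))   # True: the reliable numeric test
print(pd.isna(a))    # True: also treats None as missing
```

Use np.isnan or pd.isna rather than `is` or `==` to detect such values.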

Answer 1:

I think you can only replace the values beforehand; it is not possible to control this in to_dict itself:

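import numpy as np
import pandas as pd

# Sample records that mix np.nan and None as missing markers
L = [{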
     "COL1":"VAL1"
     ,"COL2":np.nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":np.nan
}]

# Replace every NaN with None before converting
df = pd.DataFrame(L).replace({np.nan: None})
print(df)
   COL1  COL2  COL3
0  VAL1  None  VAL3
1  None  VAL2  VAL3
2  VAL1  VAL2  None

print(df.to_dict(orient='records'))
[{'COL3': 'VAL3', 'COL2': None, 'COL1': 'VAL1'}, 
 {'COL3': 'VAL3', 'COL2': 'VAL2', 'COL1': None}, 
 {'COL3': None, 'COL2': 'VAL2', 'COL1': 'VAL1'}]
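
If replacing values in the DataFrame is not convenient, another option is to clean the records after to_dict. This is only a sketch: the helper name `records_with_none` and the two-column sample frame are made up for illustration, and it assumes every cell holds a scalar, because pd.isna returns an array for list-like values.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: COL1 uses None and COL2 uses NaN as its missing marker.
df = pd.DataFrame({
    "COL1": ["VAL1", None, "VAL1"],
    "COL2": [np.nan, "VAL2", "VAL2"],
})

def records_with_none(frame):
    """Return to_dict records with every missing scalar replaced by None."""
    return [
        {key: (None if pd.isna(value) else value) for key, value in record.items()}
        for record in frame.to_dict(orient="records")
    ]

records = records_with_none(df)
print(records)
```

pd.isna is true for None as well as for any NaN, so both markers end up as None regardless of which one a column held.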