I've stumbled upon a minor problem with pandas and its to_dict method. I have a table in which I'm certain every row has the same set of columns. Let's say it looks like this:
+----|----|----+
|COL1|COL2|COL3|
+----|----|----+
|VAL1| |VAL3|
| |VAL2|VAL3|
|VAL1|VAL2| |
+----|----|----+
When I do df.to_dict(orient='records') I get:
[{
"COL1":"VAL1"
,"COL2":nan
,"COL3":"VAL3"
}
,{
"COL1":None
,"COL2":"VAL2"
,"COL3":"VAL3"
}
,{
"COL1":"VAL1"
,"COL2":"VAL2"
,"COL3":nan
}]
Notice the nan's in some columns and the None's in others (always the same columns; nan and None never appear together in the same column).
And when I do json.loads(df.to_json(orient='records')) I get only None's and no nan's (which is the desired output), like this:
[{
"COL1":"VAL1"
,"COL2":None
,"COL3":"VAL3"
}
,{
"COL1":None
,"COL2":"VAL2"
,"COL3":"VAL3"
}
,{
"COL1":"VAL1"
,"COL2":"VAL2"
,"COL3":None
}]
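For reference, here is a minimal sketch that reproduces both outputs. It assumes the columns are object dtype and that the missing cells were stored as None in COL1 and as a float NaN in COL2 and COL3. The data is made up to match the table above; how the real frame was loaded may differ.

```python
import json

import numpy as np
import pandas as pd

# Hypothetical data mirroring the table above. dtype=object is an assumption:
# it keeps each missing marker exactly as it was stored.
df = pd.DataFrame(
    {
        "COL1": ["VAL1", None, "VAL1"],    # missing cell stored as None
        "COL2": [np.nan, "VAL2", "VAL2"],  # missing cell stored as float NaN
        "COL3": ["VAL3", "VAL3", np.nan],  # missing cell stored as float NaN
    },
    dtype=object,
)

# to_dict hands back the stored markers, so None and nan both show up.
records = df.to_dict(orient="records")

# to_json writes every missing value as null, which json.loads reads as None.
via_json = json.loads(df.to_json(orient="records"))

print(records)
print(via_json)
```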
I would appreciate an explanation of why this happens and whether it can be controlled in some way.
==EDIT==
According to the comments it would be better to first replace those nan's with None's, but those nan's are not np.nan:
>>> a = df.head().ix[0,60]
>>> a
nan
>>> type(a)
<class 'numpy.float64'>
>>> a is np.nan
False
>>> a == np.nan
False
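The two False results above are expected for any NaN: `is` compares object identity, and NaN never compares equal to anything, including itself. A minimal sketch of how such values could be detected and replaced instead, using pd.isna; the helper name nan_to_none and the sample records are made up for illustration:

```python
import numpy as np
import pandas as pd

a = np.float64("nan")

print(a is np.nan)  # False: a is a separate object from np.nan
print(a == np.nan)  # False: NaN never compares equal, even to another NaN
print(pd.isna(a))   # True: pd.isna recognises NaN as well as None


def nan_to_none(records):
    """Return a copy of the records with every missing scalar replaced by None.

    Hypothetical helper; it assumes each value is a scalar, not a list or array.
    """
    return [
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in records
    ]


# Made-up records shaped like the to_dict output shown earlier.
raw = [
    {"COL1": "VAL1", "COL2": np.float64("nan"), "COL3": "VAL3"},
    {"COL1": None, "COL2": "VAL2", "COL3": "VAL3"},
]
cleaned = nan_to_none(raw)
print(cleaned)
```

Applied to the output of df.to_dict(orient='records'), this should give the same None-only records as the to_json round trip, without serialising to a string first.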