I'm writing code to receive an arbitrary object (possibly nested) capable of being converted to JSON.
The default behavior for Python's builtin JSON encoder is to convert NaNs to NaN, e.g. json.dumps(np.NaN) results in NaN. How can I change this NaN value to 'null'?
I tried to subclass JSONEncoder and override the default() method as follows:
from json import JSONEncoder, dumps
import numpy as np

class NanConverter(JSONEncoder):
    def default(self, obj):
        try:
            _ = iter(obj)
        except TypeError:
            if isinstance(obj, float) and np.isnan(obj):
                return "null"
        return JSONEncoder.default(self, obj)
>>> d = {'a': 1, 'b': 2, 'c': 3, 'e': np.nan, 'f': [1, np.nan, 3]}
>>> dumps(d, cls=NanConverter)
'{"a": 1, "c": 3, "b": 2, "e": NaN, "f": [1, NaN, 3]}'
EXPECTED RESULT: '{"a": 1, "c": 3, "b": 2, "e": null, "f": [1, null, 3]}'
This seems to achieve my objective:
import simplejson
>>> simplejson.dumps(d, ignore_nan=True)
Out[3]: '{"a": 1, "c": 3, "b": 2, "e": null, "f": [1, null, 3]}'
Unfortunately, you probably need to use @Bramar's suggestion. You're not going to be able to use this directly. The documentation for Python's JSON encoder states:
If specified, default is a function that gets called for objects that can’t otherwise be serialized
Your NanConverter.default method isn't even being called, since Python's JSON encoder already knows how to serialize np.nan (it is an ordinary float). Add some print statements and you'll see that the method never runs.
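To make that concrete, here is a minimal sketch using only the standard library. RecordingEncoder is a hypothetical name invented for this illustration; it records every object handed to default(). Assuming np.nan behaves like float("nan") (it is a plain Python float), the hook is skipped for NaN and only fires for a type the encoder cannot handle by itself, such as complex.

```python
import json


class RecordingEncoder(json.JSONEncoder):
    """Hypothetical encoder that records every object passed to default()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def default(self, obj):
        # Only reached for objects the encoder cannot serialize itself.
        self.seen.append(obj)
        return str(obj)


encoder = RecordingEncoder()

# A float NaN is serialized natively, so default() is never consulted.
nan_output = encoder.encode({"e": float("nan")})
calls_after_nan = len(encoder.seen)

# A complex number is not natively serializable, so default() is called.
complex_output = encoder.encode({"s": complex(1, 2)})
calls_after_complex = len(encoder.seen)

print(nan_output, calls_after_nan)
print(complex_output, calls_after_complex)
```

This is why returning "null" from default() has no effect on NaN values: the float is written out before the hook is ever reached.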
As @Gerrat points out, your hook dumps(d, cls=NanConverter) unfortunately won't work.
@Alexander's simplejson.dumps(d, ignore_nan=True) works but introduces an additional dependency (simplejson).
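If neither extra dependency is acceptable, one possible standard-library approach is to replace NaN values before encoding rather than during it. The sketch below is not taken from any of the answers here; nan_to_none is a hypothetical helper name, and it assumes the input consists only of dicts, lists, tuples, and scalars.

```python
import json
import math


def nan_to_none(obj):
    """Hypothetical helper: return a copy of obj with float NaNs replaced by None."""
    if isinstance(obj, float) and math.isnan(obj):
        return None
    if isinstance(obj, dict):
        return {key: nan_to_none(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [nan_to_none(item) for item in obj]
    return obj


d = {'a': 1, 'b': 2, 'c': 3, 'e': float('nan'), 'f': [1, float('nan'), 3]}

# allow_nan=False makes json.dumps raise ValueError if any NaN slipped through.
result = json.dumps(nan_to_none(d), allow_nan=False)
print(result)
```

Since np.nan is a regular Python float, the same isinstance check should cover it; NumPy arrays and other containers would need their own branch.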
If we introduce another dependency (pandas):
Another obvious solution would be dumps(pd.DataFrame(d).fillna(None)), but Pandas issue 1972 notes that d.fillna(None) will have unpredictable behaviour:
Note that fillna(None) is equivalent to fillna(), which means the value parameter is unused. Instead, it uses the method parameter which is by default forward fill.
So instead, use DataFrame.where:
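df = pd.DataFrame(d)
# json.dumps cannot serialize a DataFrame directly, so convert it to plain records first
dumps(df.where(pd.notnull(df), None).to_dict('records'))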
simplejson does the right thing here, but there's one extra flag worth including:
Try using simplejson:
pip install simplejson
Then in the code:
import datetime
import simplejson
response = df.to_dict('records')
simplejson.dumps(response, ignore_nan=True, default=datetime.datetime.isoformat)
The ignore_nan flag handles all NaN --> null conversions correctly.
The default flag lets simplejson serialize your datetimes correctly.
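For reference, the default hook can be tried out with the standard library's json module, which accepts a default callable in the same way. The sketch below uses a hypothetical to_iso function instead of passing datetime.datetime.isoformat directly, on the assumption that you want a clear error for any other unsupported type.

```python
import datetime
import json


def to_iso(obj):
    """Hypothetical default hook: ISO-format dates, reject everything else."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


record = {"when": datetime.datetime(2020, 1, 2, 3, 4, 5), "n": 1}

# default is consulted only for the datetime; the int is handled natively.
encoded = json.dumps(record, default=to_iso)
print(encoded)
```

The same callable can be passed as default= to simplejson.dumps alongside ignore_nan=True.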