Python JSON encoder: convert NaNs to 'null'

Posted 2020-03-01 03:40

Question:

I'm writing code to receive an arbitrary object (possibly nested) capable of being converted to JSON.

By default, Python's built-in JSON encoder writes NaN values as the bare token NaN, which is not valid JSON; e.g. json.dumps(np.nan) returns 'NaN'. How can I make it emit null instead?

I tried to subclass JSONEncoder and override the default() method as follows:

from json import JSONEncoder, dumps
import numpy as np

class NanConverter(JSONEncoder):
    def default(self, obj):
        try:
            _ = iter(obj)
        except TypeError:
            if isinstance(obj, float) and np.isnan(obj):
                return "null"
        return JSONEncoder.default(self, obj)

>>> d = {'a': 1, 'b': 2, 'c': 3, 'e': np.nan, 'f': [1, np.nan, 3]}
>>> dumps(d, cls=NanConverter)
'{"a": 1, "c": 3, "b": 2, "e": NaN, "f": [1, NaN, 3]}'

EXPECTED RESULT: '{"a": 1, "c": 3, "b": 2, "e": null, "f": [1, null, 3]}'

Answer 1:

This seems to achieve my objective:

import simplejson


>>> simplejson.dumps(d, ignore_nan=True)
Out[3]: '{"a": 1, "c": 3, "b": 2, "e": null, "f": [1, null, 3]}'


Answer 2:

Unfortunately, you probably need to use @Bramar's suggestion. You're not going to be able to use this directly. The documentation for Python's JSON encoder states:

If specified, default is a function that gets called for objects that can’t otherwise be serialized

Your NanConverter.default method is never called, because np.nan is an ordinary Python float and the encoder already knows how to serialize floats (it writes NaN). Add a print statement to default() and you'll see it is never reached.
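
To make this concrete, here is a minimal sketch using only the standard library. RecordingEncoder and the calls list are hypothetical names introduced for illustration; the encoder simply records every object that reaches default(). A float NaN should never show up there, while a set (which json cannot serialize natively) should.

```python
import json

calls = []  # records every object handed to default()


class RecordingEncoder(json.JSONEncoder):
    """Hypothetical encoder that only logs what reaches default()."""

    def default(self, obj):
        calls.append(obj)
        if isinstance(obj, set):
            return sorted(obj)  # sets are not natively serializable
        return super().default(obj)


# A float NaN goes through the built-in float path, so default() is skipped.
nan_result = json.dumps({"e": float("nan")}, cls=RecordingEncoder)
calls_after_nan = list(calls)

# A set is not natively serializable, so default() is consulted.
set_result = json.dumps({"s": {2, 1}}, cls=RecordingEncoder)
calls_after_set = list(calls)

print(nan_result, calls_after_nan)
print(set_result, calls_after_set)
```

Since the hook is bypassed for floats, any NaN handling has to happen either before encoding or inside an encoder that supports it directly.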



Answer 3:

  1. As @Gerrat points out, your hook dumps(d, cls=NanConverter) unfortunately won't work.

  2. @Alexander's simplejson.dumps(d, ignore_nan=True) works but introduces an additional dependency (simplejson).

If we introduce another dependency (pandas):

  1. Another obvious solution would be dumps(pd.DataFrame(d).fillna(None)), but Pandas issue 1972 notes that d.fillna(None) will have unpredictable behaviour:

    Note that fillna(None) is equivalent to fillna(), which means the value parameter is unused. Instead, it uses the method parameter which is by default forward fill.

  2. So instead, use DataFrame.where:

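    df = pd.DataFrame(d)
    # cast to object first so None is not coerced back to NaN in float columns
    clean = df.astype(object).where(pd.notnull(df), None)
    dumps(clean.to_dict('records'))  # a DataFrame itself is not JSON serializable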
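
If neither extra dependency is acceptable, one alternative is to clean the data before encoding, using only the standard library. The sketch below is an assumption-laden illustration rather than a definitive implementation: nan_to_none is a hypothetical helper, it only walks dicts, lists and tuples, and it relies on np.nan being a plain Python float (so float("nan") stands in for it here).

```python
import json
import math


def nan_to_none(obj):
    """Return a copy of obj with every float NaN replaced by None (hypothetical helper)."""
    if isinstance(obj, float) and math.isnan(obj):
        return None
    if isinstance(obj, dict):
        return {key: nan_to_none(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [nan_to_none(item) for item in obj]
    return obj  # anything else is left for the encoder to handle


nan = float("nan")  # stands in for np.nan, which is also a Python float
d = {'a': 1, 'b': 2, 'c': 3, 'e': nan, 'f': [1, nan, 3]}

result = json.dumps(nan_to_none(d))
print(result)  # {"a": 1, "b": 2, "c": 3, "e": null, "f": [1, null, 3]}
```

Note that this leaves infinities untouched, so json.dumps would still write Infinity for them unless they are handled the same way.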
    


Answer 4:

simplejson does the right thing here, but there is one extra flag worth including. First install it:

pip install simplejson

Then in the code:

import datetime
import simplejson

response = df.to_dict('records')  # df is an existing pandas DataFrame
simplejson.dumps(response, ignore_nan=True, default=datetime.datetime.isoformat)

The ignore_nan flag converts every NaN to null.

The default flag tells simplejson how to serialize your datetimes (here, as ISO 8601 strings).
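
The default hook can also be written as a small function, which makes the fallback behaviour explicit. The sketch below uses the standard json module so that it runs without extra packages; it assumes simplejson treats the default argument the same way. iso_default is a hypothetical name: it returns ISO 8601 strings for dates and datetimes and raises TypeError for anything else, which is what the encoder expects from an object the hook cannot handle.

```python
import datetime
import json


def iso_default(obj):
    """Hypothetical default hook: ISO 8601 strings for dates and datetimes."""
    if isinstance(obj, (datetime.date, datetime.datetime)):
        return obj.isoformat()
    # Signal to the encoder that this object really is unsupported.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


record = {"when": datetime.datetime(2020, 3, 1, 3, 40), "value": None}
result = json.dumps(record, default=iso_default)
print(result)  # {"when": "2020-03-01T03:40:00", "value": null}
```

The same function could be passed as default= alongside ignore_nan=True in the simplejson call.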