Python parsing JSON with escaped double quotes

Posted 2019-07-12 08:55

Question:

Consider this valid JSON:

{"a": 1, "b": "{\"c\":2}"}

Python's json module throws when I try to parse it - it looks like the \" is throwing it off:

json.loads('{"a": 1, "b": "{\"c\":2}"}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 15 (char 14)

Is there any way to parse this in Python, either using the json module or some other module like ujson?

Answer 1:

Actually, escaping the double quotes makes no difference here, because Python turns \" into a plain " before json.loads ever sees the string. See my test:

>>> json.loads('{"a": 1, "b": "{\"c\":2}"}')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/lib/python3.4/json/__init__.py", line 318, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.4/json/decoder.py", line 359, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting ',' delimiter: line 1 column 18 (char 17)

>>> json.loads('{"a": 1, "b": "{"c":2}"}')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/lib/python3.4/json/__init__.py", line 318, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.4/json/decoder.py", line 359, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting ',' delimiter: line 1 column 18 (char 17)

>>> json.loads('{"a": 1, "b": {"c":2}}')
{'a': 1, 'b': {'c': 2}}

>>> json.loads('{"a": 1, "b": {\"c\":2}}')
{'a': 1, 'b': {'c': 2}}

Answer 2:

Inside a normal Python string literal (one without the r prefix), \" is an escape sequence for a plain double quote, so the backslash never reaches json.loads:

>>> '{"a": 1, "b": "{\"c\":2}"}'
'{"a": 1, "b": "{"c":2}"}'

As a result, your string is not valid JSON.
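A quick way to confirm this is to check whether the string that reaches json.loads contains a backslash at all. This is only a sketch, assuming Python 3; the names plain, raw and plain_ok are illustrative and not from the question.

```python
import json

# Normal literal: Python consumes the backslash, leaving a bare quote.
plain = '{"a": 1, "b": "{\"c\":2}"}'
# Raw literal: the backslash is kept and would reach the JSON parser.
raw = r'{"a": 1, "b": "{\"c\":2}"}'

print("\\" in plain)  # False
print("\\" in raw)    # True

# The backslash-free version is rejected by the parser.
try:
    json.loads(plain)
    plain_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    plain_ok = False

print(plain_ok)  # False
```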

You need to escape the backslashes as well, so that they are sent to loads. You can see this by encoding your desired dictionary with dumps:

>>> json.dumps({"a": 1, "b": "{\"c\": 2}"})
'{"a": 1, "b": "{\\"c\\": 2}"}'

>>> json.loads('{"a": 1, "b": "{\\"c\\": 2}"}')
{u'a': 1, u'b': u'{"c": 2}'}
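An alternative to doubling every backslash is a raw string literal, which passes the backslashes through unchanged. The sketch below assumes Python 3 (so there are no u'' prefixes in the output) and also decodes the inner JSON string a second time; the names text, outer and inner are illustrative.

```python
import json

# Raw string literal: backslashes are kept exactly as typed.
text = r'{"a": 1, "b": "{\"c\":2}"}'

outer = json.loads(text)
# outer["b"] is still a str holding JSON text, so decode it once more.
inner = json.loads(outer["b"])

print(outer)  # {'a': 1, 'b': '{"c":2}'}
print(inner)  # {'c': 2}
```

The escaping issue only affects literals typed into source code or the REPL. JSON text read from a file or a network response already contains the backslashes and can be passed to json.loads as-is.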