Converting str to dict in Python

Posted 2020-02-14 07:54

Question:

I got this from a process's output using subprocess.Popen():

{ about: 'RRDtool xport JSON output',
  meta: {
    start: 1401778440,
    step: 60,
    end: 1401778440,
    legend: [
      'rta_MIN',
      'rta_MAX',
      'rta_AVERAGE'
          ]
     },
  data: [
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null ],
    [ null, null, null  ]
  ]
}

It doesn't seem to be valid JSON to me. I have tried ast.literal_eval() and json.loads(), but with no luck. Can someone point me in the right direction? Thanks in advance.

Answer 1:

Indeed, older versions of rrdtool export ECMAScript object notation, not JSON. According to this Debian bug report, upgrading to 1.4.8 should give you proper JSON. Also see the project CHANGELOG:

JSON output of xport is now actually json compilant by its keys being properly quoted now.
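
To see why the unquoted keys matter, here is a minimal sketch using only the standard library. Both strings are hand-written, shortened stand-ins (the second is an assumption about what properly quoted output looks like, not captured rrdtool output), and try_parse is a hypothetical helper for this illustration:

```python
import ast
import json

# Shortened, hand-written stand-ins for the old and the quoted output styles.
old_style = "{ about: 'RRDtool xport JSON output', data: [ null ] }"
new_style = '{ "about": "RRDtool xport JSON output", "data": [ null ] }'


def try_parse(parser, text):
    """Return (True, result) on success, (False, exception class name) on failure."""
    try:
        return True, parser(text)
    except (ValueError, SyntaxError) as exc:
        return False, type(exc).__name__


# json.loads insists on double-quoted property names, so the old style fails.
print(try_parse(json.loads, old_style))
# ast.literal_eval sees bare names (about, data, null), which are not literals.
print(try_parse(ast.literal_eval, old_style))
# With quoted keys and double-quoted strings, json.loads succeeds.
print(try_parse(json.loads, new_style))
```

Both parsers reject the old style for the same underlying reason: the keys are bare identifiers rather than quoted strings.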

If you cannot upgrade, you have two options: either reformat the text so that the object key identifiers are quoted, or use a more lenient parser that accepts ECMAScript object notation.

The latter can be done with the external demjson library:

>>> import demjson
>>> demjson.decode('''\
... { about: 'RRDtool xport JSON output',
...   meta: {
...     start: 1401778440,
...     step: 60,
...     end: 1401778440,
...     legend: [
...       'rta_MIN',
...       'rta_MAX',
...       'rta_AVERAGE'
...           ]
...      },
...   data: [
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null ],
...     [ null, null, null  ]
...   ]
... }''')
{u'about': u'RRDtool xport JSON output', u'meta': {u'start': 1401778440, u'step': 60, u'end': 1401778440, u'legend': [u'rta_MIN', u'rta_MAX', u'rta_AVERAGE']}, u'data': [[None, None, None], [None, None, None], [None, None, None], [None, None, None], [None, None, None], [None, None, None]]}

The repair can be done with regular expressions, assuming that every identifier starts a new line or directly follows an opening { curly brace. The single quotes around the strings also have to be changed to double quotes, which only works if no value contains an embedded single quote:

import re
import json

yourtext = re.sub(r'(?:^|(?<={))\s*(\w+)(?=:)', r' "\1"', yourtext, flags=re.M)
yourtext = re.sub(r"'", r'"', yourtext)
data = json.loads(yourtext)
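
As a self-contained check, the same two substitutions can be wrapped in a function and run against a shortened copy of the output from the question. The name repair_and_load and the trimmed sample are assumptions made for this sketch, and it inherits the limitations described above:

```python
import json
import re


def repair_and_load(text):
    """Quote bare keys, swap single quotes for double quotes, then parse as JSON."""
    # Quote identifiers that start a line or directly follow an opening brace.
    text = re.sub(r'(?:^|(?<={))\s*(\w+)(?=:)', r' "\1"', text, flags=re.M)
    # Only safe when no value contains an embedded single quote.
    text = text.replace("'", '"')
    return json.loads(text)


# Shortened copy of the output shown in the question.
sample = """{ about: 'RRDtool xport JSON output',
  meta: {
    start: 1401778440,
    step: 60,
    legend: [
      'rta_MIN',
      'rta_MAX'
    ]
  },
  data: [
    [ null, null ]
  ]
}"""

result = repair_and_load(sample)
print(result["meta"]["legend"])
print(result["data"])
```

If a value ever contains an apostrophe, the quote swap produces invalid JSON and json.loads raises an error, so the lenient parser is the safer choice for such data.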


Answer 2:

It is indeed not valid JSON. It is, however, valid YAML, so the third-party PyYAML library might help you out:

>>> import yaml
>>> yaml.safe_load(text)  # plain yaml.load() needs an explicit Loader on PyYAML 6+
{
    'about': 'RRDtool xport JSON output',
    'meta': {
        'start': 1401778440,
        'step': 60,
        'end': 1401778440,
        'legend': [
            'rta_MIN',
            'rta_MAX',
            'rta_AVERAGE'
        ]
    },
    'data': [
        [None, None, None],
        [None, None, None],
        [None, None, None],
        [None, None, None],
        [None, None, None],
        [None, None, None]
    ]
}