Custom JSON encoder in Python 2.7 to insert plain

2019-01-28 10:19发布

问题:

I'm trying to upgrade a portion of Python 2.6 code to Python 2.7. This code uses the json module to produce some JavaScript (not JSON-compliant), which is then inserted into the rest of a script.

The general idea is to be able to insert code or refer to variables that are defined elsewhere: it's not intended to be used as JSON data, but JavaScript code.

Here is the custom encoder that works in Python 2.6:

import json

class RawJavaScriptText:
    def __init__(self, jstext):
        self._jstext = jstext
    def get_jstext(self):
        return self._jstext

class RawJsJSONEncoder(json.JSONEncoder):
    def _iterencode_default(self, o, markers=None):
        if isinstance(o, RawJavaScriptText):
            yield self.default(o)
        else:
            json.JSONEncoder._iterencode_default(self, o, markers)

    def default(self, o):
        if isinstance(o, RawJavaScriptText):
            return o.get_jstext()
        else:
            return json.JSONEncoder.default(self, o)

testvar = {
   'a': 1,
   'b': 'abc',
   # RawJavaScriptText will be inserted as such, no serialisation.
   'c': RawJavaScriptText('function() { return "Hello World"; }'),
   'd': RawJavaScriptText('some_variable_name')
}

print json.dumps(testvar, cls=RawJsJSONEncoder)

Using Python 2.6, we get the required result:

{ "a": 1, "c": function() { return "Hello World"; },
  "b": "abc", "d": some_variable_name }

Using Python 2.7, everything is turned into a string, thereby losing the validity of JavaScript code:

{ "a": 1, "c": "function() { return \"Hello World\"; }",
  "b": "abc", "d": "some_variable_name" }

(As a side note, this is only every used with a pre-defined set of raw JavaScript value, so as to prevent potential injections or misuse.)

Of course, the reason for this is that _iterencode_default method of JSONEncoder doesn't exist in the Python 2.7 version of the json module. Admittedly, it wasn't meant to be overridden in the first place.

Is there another way to achieve this goal in Python 2.7? Using the foundations of a JSON library to be able to generate JavaScript code this way is rather convenient.

EDIT: Here is complete working solution, using replace as suggested by James Henstridge. I'm using random UUIDs for the replacement tokens, which should prevent any conflicts. This way, this is a direct replacement working with both Python 2.6 and 2.7.

import json
import uuid

class RawJavaScriptText:
    def __init__(self, jstext):
        self._jstext = jstext
    def get_jstext(self):
        return self._jstext

class RawJsJSONEncoder(json.JSONEncoder):
    def __init__(self, *args, **kwargs):
        json.JSONEncoder.__init__(self, *args, **kwargs)
        self._replacement_map = {}

    def default(self, o):
        if isinstance(o, RawJavaScriptText):
            key = uuid.uuid4().hex
            self._replacement_map[key] = o.get_jstext()
            return key
        else:
            return json.JSONEncoder.default(self, o)

    def encode(self, o):
        result = json.JSONEncoder.encode(self, o)
        for k, v in self._replacement_map.iteritems():
             result = result.replace('"%s"' % (k,), v)
        return result

testvar = {
   'a': 1,
   'b': 'abc',
   'c': RawJavaScriptText('function() { return "Hello World"; }'),
   'd': [ RawJavaScriptText('some_variable_name') ],
   'e': {
       'x': RawJavaScriptText('some_variable_name'),
       'y': 'y'
   }
}

print json.dumps(testvar, cls=RawJsJSONEncoder)

Result (2.6 and 2.7):

{"a": 1, "c": function() { return "Hello World"; },
 "b": "abc",
 "e": {"y": "y", "x": some_variable_name},
 "d": [some_variable_name]}

回答1:

The undocumented private interface you were using appears to have gone away when the C extension used under the covers was expanded to cover more of the encoding process.

One alternative would be to insert place holder strings for your RawJavaScriptText values and post-process the output of dumps to convert those place holders to the form you require.

For example:

>>> data = {'foo': '@@x@@'}
>>> print json.dumps(data)
{"foo": "@@x@@"}
>>> print json.dumps(data).replace('"@@x@@"', 'some_variable_name')
{"foo": some_variable_name}

You'll want to be careful about this kind of technique if your JSON includes untrusted data: you don't want to be in a position where an outsider can add such place holders to the output unexpectedly.