I have a dictionary in Python that I would like to serialize to JSON and then convert to a proper C string literal, so that the literal contains a valid JSON string corresponding to my input dictionary. I'm using the result to autogenerate a line in a C source file. Here's an example:
>>> import json
>>> mydict = {'a':1, 'b': 'a string with "quotes" and \t and \\backslashes'}
>>> json.dumps(mydict)
'{"a": 1, "b": "a string with \\"quotes\\" and \\t and \\\\backslashes"}'
>>> print(json.dumps(mydict))
{"a": 1, "b": "a string with \"quotes\" and \t and \\backslashes"}
What I need to generate is the following C string:
"{\"a\": 1, \"b\": \"a string with \\\"quotes\\\" and \\t and \\\\backslashes\"}"
In other words, I need to escape the backslash and double-quote on the result of calling json.dumps(mydict). At least I think I do.... Will the following work? Or am I missing an obvious corner case?
>>> s = '"'+json.dumps(mydict).replace('\\','\\\\').replace('"','\\"')+'"'
>>> print(s)
"{\"a\": 1, \"b\": \"a string with \\\"quotes\\\" and \\t and \\\\backslashes\"}"
A C string starts with a quote and ends with a quote, has no embedded nulls, has all embedded quotes escaped with backslash, and all embedded backslash literals are doubled.
So take your string, double the backslashes and escape the quotes with a backslash. I think your code is exactly what you need:
s = '"' + json.dumps(mydict).replace('\\', r'\\').replace('"', r'\"') + '"'
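As a rough sketch of how this might slot into a code generator, the helper below wraps the same two replacements and emits one line of C. The names `c_string_literal`, `emit_c_line`, and `config_json` are made up for illustration, and it assumes `json.dumps()` is called with its default `ensure_ascii=True`, so the JSON text contains only printable ASCII.

```python
import json

def c_string_literal(text):
    """Quote text as a C string literal (assumes printable ASCII input)."""
    # Double the backslashes first, then escape the double quotes;
    # doing it in the other order would re-escape the new backslashes.
    escaped = text.replace('\\', '\\\\').replace('"', '\\"')
    return '"' + escaped + '"'

def emit_c_line(name, obj):
    """Return one line of C declaring `name` as the JSON form of obj."""
    return 'static const char %s[] = %s;' % (name, c_string_literal(json.dumps(obj)))

mydict = {'a': 1, 'b': 'a string with "quotes" and \t and \\backslashes'}
print(emit_c_line('config_json', mydict))
```

The order of the two `replace()` calls matters, which is the one detail worth keeping in mind if the expression is ever rearranged.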
Alternatively, you could go for this slightly less robust version:
def c_string(s):
    all_chars = (chr(x) for x in range(256))
    trans_table = dict((c, c) for c in all_chars)
    trans_table.update({'"': r'\"', '\\': r'\\'})
    return "".join(trans_table[c] for c in s)

def dwarf_string(d):
    import json
    return '"' + c_string(json.dumps(d)) + '"'
I'd love to use string.maketrans(), but a translation table can map a character to at most a single character.
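For what it's worth, that limit is specific to Python 2's `string.maketrans()`. In Python 3, `str.maketrans()` accepts a dict whose values may be strings of any length, so a sketch along these lines should work there (assuming Python 3; `C_ESCAPES` is just an illustrative name):

```python
import json

# Python 3 only: dict values passed to str.maketrans() may be
# multi-character strings, so one character can map to two.
C_ESCAPES = str.maketrans({'"': r'\"', '\\': r'\\'})

def c_string(s):
    """Escape double quotes and backslashes for use inside a C string literal."""
    return s.translate(C_ESCAPES)

def dwarf_string(d):
    """Serialize d to JSON and wrap it in double quotes as a C string literal."""
    return '"' + c_string(json.dumps(d)) + '"'

print(dwarf_string({'path': 'C:\\tmp', 'say': '"hi"'}))
```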
Your original suggestion and the answer from hughdbrown both look correct to me, but I've found a slightly shorter answer:
c_string = json.dumps( json.dumps(mydict) )
test script:
>>> import json
>>> mydict = {'a':1, 'b': 'a string with "quotes" and \t and \\backslashes'}
>>> c_string = json.dumps( json.dumps(mydict) )
>>> print( c_string )
"{\"a\": 1, \"b\": \"a string with \\\"quotes\\\" and \\t and \\\\backslashes\"}"
which looks like exactly the proper C string you want.
(Fortunately, Python's json.dumps() passes forward slashes straight through without change, unlike some JSON encoders that prefix each forward slash with a backslash, such as the one described at "Processing escaped url strings within json using python".)
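One way to gain some confidence in the double-`dumps` trick is to check it against the `replace()` version and to undo it with two `json.loads()` calls. This is only a sanity check under the default `json.dumps()` settings (`ensure_ascii=True`, no `indent`), not a proof that JSON string escaping and C string escaping agree for every possible input:

```python
import json

mydict = {'a': 1, 'b': 'a string with "quotes" and \t and \\backslashes'}

inner = json.dumps(mydict)      # the JSON text itself
c_string = json.dumps(inner)    # the JSON text quoted as a string literal

# Same result as escaping by hand with replace().
by_hand = '"' + inner.replace('\\', '\\\\').replace('"', '\\"') + '"'
assert c_string == by_hand

# Decoding twice gets back the original dictionary.
assert json.loads(json.loads(c_string)) == mydict

# Forward slashes are passed through unescaped.
assert json.dumps('a/b') == '"a/b"'
```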
Maybe this is what you want:
repr(json.dumps(mydict))
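A quick check suggests `repr()` is not a drop-in replacement here: it produces a Python literal, which for this input is wrapped in single quotes and leaves the embedded double quotes unescaped, so it would not be accepted as a C string literal. A small comparison, using the dictionary from the question (`as_repr` and `as_c` are illustrative names):

```python
import json

mydict = {'a': 1, 'b': 'a string with "quotes" and \t and \\backslashes'}

as_repr = repr(json.dumps(mydict))        # Python string literal
as_c = json.dumps(json.dumps(mydict))     # double-quoted, quotes escaped

print(as_repr)  # delimited by single quotes for this input
print(as_c)     # delimited by double quotes
```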