Question:
I have seen many projects using the simplejson module instead of the json module from the Standard Library. Also, there are many different simplejson modules. Why would one use these alternatives instead of the one in the Standard Library?
Answer 1:
json is simplejson, added to the stdlib. But since json was added in 2.6, simplejson has the advantage of working on more Python versions (2.4+).
simplejson is also updated more frequently than Python, so if you need (or want) the latest version, it's best to use simplejson itself, if possible.
A good practice, in my opinion, is to use one or the other as a fallback.
try:
    import simplejson as json
except ImportError:
    import json
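As a minimal sketch of how the fallback plays out in practice (the sample payload is made up for illustration): once the import succeeds, the rest of the code refers only to the name json, so it does not need to know which implementation was loaded.

```python
# Prefer simplejson when it is installed, otherwise use the stdlib module.
try:
    import simplejson as json
except ImportError:
    import json

# Hypothetical payload, only for illustration.
payload = {"name": "example", "values": [1, 2, 3]}

encoded = json.dumps(payload, sort_keys=True)
decoded = json.loads(encoded)

# json.__name__ reports which implementation is actually in use.
print("using %s" % json.__name__)
print(encoded)
```

Because both modules expose the same dumps/loads entry points, this pattern only becomes risky when the code depends on behaviour that differs between them.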
Answer 2:
I have to disagree with the other answers: the built-in json library (in Python 2.7) is not necessarily slower than simplejson. It also doesn't have this annoying unicode bug.
Here is a simple benchmark:
import json
import simplejson
from timeit import repeat

NUMBER = 100000
REPEAT = 10


def compare_json_and_simplejson(data):
    """Compare json and simplejson - dumps and loads"""
    # Stash the inputs on the function object so the timeit statements can
    # reach them through "from __main__ import ...".
    compare_json_and_simplejson.data = data
    compare_json_and_simplejson.dump = json.dumps(data)
    assert json.dumps(data) == simplejson.dumps(data)
    result = min(repeat("json.dumps(compare_json_and_simplejson.data)", "from __main__ import json, compare_json_and_simplejson",
                        repeat=REPEAT, number=NUMBER))
    print(" json dumps {} seconds".format(result))
    result = min(repeat("simplejson.dumps(compare_json_and_simplejson.data)", "from __main__ import simplejson, compare_json_and_simplejson",
                        repeat=REPEAT, number=NUMBER))
    print("simplejson dumps {} seconds".format(result))
    assert json.loads(compare_json_and_simplejson.dump) == data
    result = min(repeat("json.loads(compare_json_and_simplejson.dump)", "from __main__ import json, compare_json_and_simplejson",
                        repeat=REPEAT, number=NUMBER))
    print(" json loads {} seconds".format(result))
    result = min(repeat("simplejson.loads(compare_json_and_simplejson.dump)", "from __main__ import simplejson, compare_json_and_simplejson",
                        repeat=REPEAT, number=NUMBER))
    print("simplejson loads {} seconds".format(result))


print("Complex real world data:")
COMPLEX_DATA = {'status': 1, 'timestamp': 1362323499.23, 'site_code': 'testing123', 'remote_address': '212.179.220.18', 'input_text': u'ny monday for less than \u20aa123', 'locale_value': 'UK', 'eva_version': 'v1.0.3286', 'message': 'Successful Parse', 'muuid1': '11e2-8414-a5e9e0fd-95a6-12313913cc26', 'api_reply': {"api_reply": {"Money": {"Currency": "ILS", "Amount": "123", "Restriction": "Less"}, "ProcessedText": "ny monday for less than \\u20aa123", "Locations": [{"Index": 0, "Derived From": "Default", "Home": "Default", "Departure": {"Date": "2013-03-04"}, "Next": 10}, {"Arrival": {"Date": "2013-03-04", "Calculated": True}, "Index": 10, "All Airports Code": "NYC", "Airports": "EWR,JFK,LGA,PHL", "Name": "New York City, New York, United States (GID=5128581)", "Latitude": 40.71427, "Country": "US", "Type": "City", "Geoid": 5128581, "Longitude": -74.00597}]}}}
compare_json_and_simplejson(COMPLEX_DATA)
print("\nSimple data:")
SIMPLE_DATA = [1, 2, 3, "asasd", {'a': 'b'}]
compare_json_and_simplejson(SIMPLE_DATA)
And the results on my system (Python 2.7.4, Linux 64-bit):
Complex real world data:
json dumps 1.56666707993 seconds
simplejson dumps 2.25638604164 seconds
json loads 2.71256899834 seconds
simplejson loads 1.29233884811 seconds

Simple data:
json dumps 0.370109081268 seconds
simplejson dumps 0.574181079865 seconds
json loads 0.422876119614 seconds
simplejson loads 0.270955085754 seconds
For dumping, json is faster than simplejson.
For loading, simplejson is faster.
Since I am currently building a web service, dumps() is more important, and using the standard library is always preferred.
Also, cjson has not been updated in the past 4 years, so I wouldn't touch it.
Answer 3:
All of these answers aren't very helpful because they are time sensitive.
After doing some research of my own I found that simplejson is indeed faster than the builtin, if you keep it updated to the latest version.
pip/easy_install wanted to install 2.3.2 on Ubuntu 12.04, but after finding out that the latest simplejson version was actually 3.3.0, I updated it and reran the time tests.
- simplejson is about 3x faster than the builtin json at loads
- simplejson is about 30% faster than the builtin json at dumps
Disclaimer:
The above statements are for python-2.7.3 and simplejson 3.3.0 (with C speedups). And to make sure my answer also isn't time sensitive: you should run your own tests to check, since it varies so much between versions; there's no easy answer that isn't time sensitive.
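Since the numbers depend so heavily on versions, a small harness along these lines can be rerun on your own setup. This is only a sketch: the sample data and the bench helper are made up for illustration, and simplejson is treated as optional so the script still runs with only the standard library installed.

```python
import json
from timeit import repeat

try:
    import simplejson
except ImportError:
    simplejson = None  # only the stdlib module gets timed

# Hypothetical sample; substitute data that resembles your real workload.
DATA = {"id": 1, "tags": ["a", "b", "c"], "nested": {"x": 1.5, "ok": True}}
TEXT = json.dumps(DATA)


def bench(module, number=2000, rounds=3):
    """Return the best-of-rounds time in seconds for (dumps, loads)."""
    dumps = min(repeat(lambda: module.dumps(DATA), number=number, repeat=rounds))
    loads = min(repeat(lambda: module.loads(TEXT), number=number, repeat=rounds))
    return dumps, loads


results = {"json": bench(json)}
if simplejson is not None:
    results["simplejson %s" % simplejson.__version__] = bench(simplejson)

for name, (dumps_time, loads_time) in sorted(results.items()):
    print("%-20s dumps %.4fs  loads %.4fs" % (name, dumps_time, loads_time))
```

Taking the minimum over several rounds, as the benchmark in Answer 2 does, reduces noise from other processes; the absolute numbers are still only meaningful on the machine and versions that produced them.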
How to tell if C speedups are enabled in simplejson:
import simplejson
# If this is True, then C speedups are enabled.
print(bool(getattr(simplejson, '_speedups', False)))
UPDATE: I recently came across a library called ujson that is performing ~3x faster than simplejson with some basic tests.
Answer 4:
I've been benchmarking json, simplejson and cjson.
- cjson is fastest
- simplejson is almost on par with cjson
- json is about 10x slower than simplejson
http://pastie.org/1507411:
$ python test_serialization_speed.py
--------------------
Encoding Tests
--------------------
Encoding: 100000 x {'m': 'asdsasdqwqw', 't': 3}
[ json] 1.12385 seconds for 100000 runs. avg: 0.011239ms
[simplejson] 0.44356 seconds for 100000 runs. avg: 0.004436ms
[ cjson] 0.09593 seconds for 100000 runs. avg: 0.000959ms
Encoding: 10000 x {\'m\': [[\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, 
\'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19], [\'0\', 1, \'2\', 3, \'4\', 5, \'6\', 7, \'8\', 9, \'10\', 11, \'12\', 13, \'14\', 15, \'16\', 17, \'18\', 19]], \'t\': 3}
[ json] 7.76628 seconds for 10000 runs. avg: 0.776628ms
[simplejson] 0.51179 seconds for 10000 runs. avg: 0.051179ms
[ cjson] 0.44362 seconds for 10000 runs. avg: 0.044362ms
--------------------
Decoding Tests
--------------------
Decoding: 100000 x {"m": "asdsasdqwqw", "t": 3}
[ json] 3.32861 seconds for 100000 runs. avg: 0.033286ms
[simplejson] 0.37164 seconds for 100000 runs. avg: 0.003716ms
[ cjson] 0.03893 seconds for 100000 runs. avg: 0.000389ms
Decoding: 10000 x {\"m\": [[\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, 
\"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19], [\"0\", 1, \"2\", 3, \"4\", 5, \"6\", 7, \"8\", 9, \"10\", 11, \"12\", 13, \"14\", 15, \"16\", 17, \"18\", 19]], \"t\": 3}
[ json] 37.26270 seconds for 10000 runs. avg: 3.726270ms
[simplejson] 0.56643 seconds for 10000 runs. avg: 0.056643ms
[ cjson] 0.33007 seconds for 10000 runs. avg: 0.033007ms
Answer 5:
Some values are serialized differently between simplejson and json.
Notably, instances of collections.namedtuple are serialized as arrays by json but as objects by simplejson. You can override this behaviour by passing namedtuple_as_object=False to simplejson.dump, but by default the behaviours do not match.
>>> import collections, simplejson, json
>>> TupleClass = collections.namedtuple("TupleClass", ("a", "b"))
>>> value = TupleClass(1, 2)
>>> json.dumps(value)
'[1, 2]'
>>> simplejson.dumps(value)
'{"a": 1, "b": 2}'
>>> simplejson.dumps(value, namedtuple_as_object=False)
'[1, 2]'
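If you stay with the standard library but want the object form, one possible workaround is to convert the namedtuple to a mapping yourself before dumping. A minimal sketch, reusing the TupleClass example from this answer (it only handles a top-level value; namedtuples nested inside other containers would still be written as arrays):

```python
import collections
import json

TupleClass = collections.namedtuple("TupleClass", ("a", "b"))
value = TupleClass(1, 2)

# The stdlib encoder treats a namedtuple like any other tuple.
as_array = json.dumps(value)

# _asdict() is part of the public namedtuple API and keeps field order.
as_object = json.dumps(value._asdict())

print(as_array)   # [1, 2]
print(as_object)  # {"a": 1, "b": 2}
```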
Answer 6:
An API incompatibility I found between Python 2.7 and simplejson 3.3.1 is whether the output contains str or unicode objects, e.g.
>>> from json import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{u'a': u'b'}
vs
>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{'a': 'b'}
If the preference is to use simplejson, then this can be addressed by coercing the argument string to unicode, as in:
>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode(unicode("""{ "a":"b" }""", "utf-8"))
{u'a': u'b'}
The coercion does require knowing the original charset, for example:
>>> jd.decode(unicode("""{ "a": "ξηθννββωφρες" }"""))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 8: ordinal not in range(128)
This is the "won't fix" issue 40.
Answer 7:
The builtin json module got included in Python 2.6. Any projects that support versions of Python < 2.6 need to have a fallback. In many cases, that fallback is simplejson.
Answer 8:
Another reason projects use simplejson is that the builtin json did not originally include its C speedups, so the performance difference was noticeable.
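In current CPython the standard json module does try to load a C accelerator. A rough way to check whether it is active is sketched below. It relies on c_make_encoder and c_scanstring, which are CPython implementation details rather than documented API, so treat it as an assumption that may not hold on other interpreters or versions.

```python
import json.decoder
import json.encoder

# CPython sets both names to None when the _json C extension cannot be imported.
# getattr() guards against interpreters that do not define them at all.
c_encoder = getattr(json.encoder, "c_make_encoder", None)
c_scanner = getattr(json.decoder, "c_scanstring", None)

stdlib_speedups = c_encoder is not None and c_scanner is not None
print("stdlib json C speedups enabled: %s" % stdlib_speedups)
```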
Answer 9:
Here's a (now outdated) comparison of Python JSON libraries:
Comparing JSON modules for Python (archive link)
Regardless of the results in this comparison, you should use the standard library json if you are on Python 2.6. Otherwise, you might as well just use simplejson.
Answer 10:
The simplejson module is simply 1.5 times faster than json (on my computer, with simplejson 2.1.1 and Python 2.7 x86).
If you want, you can try the benchmark: http://abral.altervista.org/jsonpickle-bench.zip On my PC simplejson is faster than cPickle. I would also like to see your benchmarks!
Probably, as Coady said, the difference between simplejson and json is that simplejson includes _speedups.c. So, why don't Python developers use simplejson?
Answer 11:
In Python 3, if you have a string of b'bytes', with json you have to .decode() the content before you can load it. simplejson takes care of this so you can just do simplejson.loads(byte_string).
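A minimal sketch of the explicit decode step with the standard library, assuming the bytes are UTF-8 (the sample payload is made up). Note that json.loads in Python 3.6 and later also accepts bytes directly, so the explicit step mainly matters on older Python 3 versions or when the encoding is not one that json can detect on its own.

```python
import json

# Hypothetical payload as it might arrive from a socket or an HTTP response.
byte_string = b'{"city": "Z\xc3\xbcrich", "count": 2}'

# Decode explicitly, then parse; this works on every Python 3 version.
data = json.loads(byte_string.decode("utf-8"))

# ascii() keeps the output printable even on terminals without UTF-8 support.
print(ascii(data["city"]), data["count"])
```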
Answer 12:
I came across this question as I was looking to install simplejson for Python 2.6. I needed to use the object_pairs_hook of json.load() in order to load a JSON file as an OrderedDict. Being familiar with more recent versions of Python, I didn't realize that the json module for Python 2.6 doesn't include object_pairs_hook, so I had to install simplejson for this purpose. From personal experience, this is why I use simplejson as opposed to the standard json module.
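For reference, this is roughly what the call looks like on versions where the standard module supports it (object_pairs_hook was added to the stdlib json in Python 2.7 and 3.1). The sample document is made up for illustration; the same keyword works with simplejson.loads.

```python
import json
from collections import OrderedDict

# Hypothetical document; key order in the text is z, a, m.
text = '{"z": 1, "a": 2, "m": {"y": 3, "b": 4}}'

# object_pairs_hook receives each JSON object as a list of (key, value) pairs,
# so the order in which keys appear in the text is preserved at every level.
data = json.loads(text, object_pairs_hook=OrderedDict)

print(list(data.keys()))         # ['z', 'a', 'm']
print(type(data["m"]).__name__)  # OrderedDict
```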