XML object serialization in Python: are there any alternatives to Gnosis?

Posted 2019-05-04 18:28

Question:

For a while I've been using a package called "gnosis-utils", which provides an XML pickling service for Python. The package works reasonably well; however, it seems to have been neglected by its developer for the last four years.

At the time we originally selected Gnosis, it was the only XML serialization tool for Python. The advantage of Gnosis was that it provided a set of classes whose function was very similar to the built-in Python pickle module. It produced XML which Python developers found easy to read, but which non-Python developers found confusing.

Now that the project has grown, we have a new requirement: we need to be able to exchange XML with our colleagues who prefer Java or .NET. These developers will not be using Python; they intend to produce XML directly, so we need to simplify the format of the XML.

So, are there any alternatives to Gnosis? Our requirements:

  • Must work on Python 2.4 / Windows x86 32-bit
  • Output must be XML, as simple as possible
  • API must resemble Pickle as closely as possible
  • Performance is not hugely important

Of course we could simply adapt Gnosis; however, we'd prefer to use a component which already provides the functions we require (assuming that it exists).

Answer 1:

So what you're looking for is a Python library that spits out arbitrary XML for your objects? You don't need to control the format, so you can't be bothered to actually write something that iterates over the relevant properties of your data and generates the XML using one of the existing tools?

This seems like a bad idea. Arbitrary XML serialization doesn't sound like a good way to move forward. Any format that includes all of pickle's features is going to be ugly, verbose, and very nasty to use. It will not be simple. It will not translate well into Java.

What does your data look like?

If you tell us precisely what aspects of pickle you need (and why lxml.objectify doesn't fulfill those), we will be better able to help you.

Have you considered using JSON for your serialization? It's easy to parse, natively supports Python-like data structures, and has wide-reaching support. As an added bonus, it doesn't open your code to all kinds of evil exploits the way the native pickle module does.
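
As a rough illustration of the JSON suggestion, here is a minimal sketch using the standard `json` module. The `Foo` class and the `foo_to_json` / `foo_from_json` helpers are hypothetical names made up for this example. Note also that `json` only entered the standard library in Python 2.6, so the Python 2.4 setup from the question would need a third-party equivalent.

```python
import json


class Foo(object):
    """Hypothetical example class with a single attribute."""

    def __init__(self, bar):
        self.bar = bar


def foo_to_json(foo):
    # Serialize only the fields we choose to expose,
    # rather than dumping the whole object graph as pickle would.
    return json.dumps({"bar": foo.bar}, sort_keys=True)


def foo_from_json(text):
    # Rebuild the object from plain data; no code is executed on load.
    data = json.loads(text)
    return Foo(data["bar"])


text = foo_to_json(Foo("baz"))
restored = foo_from_json(text)
print(text)
print(restored.bar)
```

Because the payload is just a dictionary of named fields, a Java or .NET consumer can read or produce it without knowing anything about the Python class.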

Honestly, you need to bite the bullet and define a format, and build a serializer using the standard XML tools, if you absolutely must use XML. Consider JSON.
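
If XML is a hard requirement, a hand-defined format with a pickle-like `dumps` / `loads` pair might look something like the sketch below. It uses `xml.etree.ElementTree` (in the standard library since Python 2.5, so not available out of the box on 2.4). The `Person` class, the element names, and the two functions are assumptions made up for illustration, not part of any existing library.

```python
import xml.etree.ElementTree as ET


class Person(object):
    """Hypothetical data class used only for this example."""

    def __init__(self, name, age):
        self.name = name
        self.age = age


def dumps(person):
    # One element per field, with names agreed on with the other teams.
    root = ET.Element("person")
    ET.SubElement(root, "name").text = person.name
    ET.SubElement(root, "age").text = str(person.age)
    # encoding="unicode" returns a text string (Python 3 only).
    return ET.tostring(root, encoding="unicode")


def loads(xml_text):
    # Parse the agreed format back into an object, converting types explicitly.
    root = ET.fromstring(xml_text)
    return Person(root.findtext("name"), int(root.findtext("age")))


xml_text = dumps(Person("Ada", 36))
restored = loads(xml_text)
print(xml_text)
```

The point of the design is that the XML layout is decided explicitly, so it stays simple enough for non-Python developers to produce by hand, at the cost of writing a small serializer per type.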



Answer 2:

There is xml_marshaller which provides a simple way of dumping arbitrary Python objects to XML:

>>> from xml_marshaller import xml_marshaller
>>> class Foo(object): pass
>>> foo = Foo()
>>> foo.bar = 'baz'
>>> dump_str = xml_marshaller.dumps(foo)

Pretty printing the above with lxml (which is a dependency of xml_marshaller anyway):

>>> from lxml.etree import fromstring, tostring
>>> print(tostring(fromstring(dump_str), pretty_print=True).decode())

You get output like this:

<marshal>
  <object id="i2" module="__main__" class="Foo">
    <tuple/>
    <dictionary id="i3">
      <string>bar</string>
      <string>baz</string>
    </dictionary>
  </object>
</marshal>

I did not check for Python 2.4 compatibility since this question was asked long ago, but a way of dumping arbitrary Python objects to XML remains relevant.