I have a program that reads an XML document from a socket. The document is stored in a string, which I would like to convert directly to a Python dictionary, the same way it is done in Django's simplejson library.
Take as an example:
xml_str = '<?xml version="1.0" ?><person><name>john</name><age>20</age></person>'
dic_xml = convert_to_dic(xml_str)
Then dic_xml would look like {'person': {'name': 'john', 'age': 20}}
I have a recursive method to get a dictionary from an lxml element.
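The code for this answer is missing here, so the following is only a minimal sketch of what such a recursive method might look like. It uses the standard library's `xml.etree.ElementTree` so that it runs without extra installs; `lxml.etree` elements expose the same `.tag`, `.text` and child iteration used here. The name `elem_to_dict` is hypothetical.

```python
import xml.etree.ElementTree as ET  # lxml.etree offers the same calls used here


def elem_to_dict(node):
    """Convert the children of `node` into a dict (hypothetical helper)."""
    result = {}
    for child in node:
        # Leaf elements map to their text; elements with children recurse.
        if len(child) == 0:
            value = child.text
        else:
            value = elem_to_dict(child)
        if child.tag in result:
            # Repeated tags are collected into a list instead of overwritten.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


root = ET.fromstring("<person><name>john</name><age>20</age></person>")
person = elem_to_dict(root)
```

Note that every leaf value comes back as a string, so `age` is `'20'` rather than the integer `20` shown in the question; any type conversion would have to be added separately.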
You can do this quite easily with lxml. First install it with `pip install lxml`.
Here is a recursive function I wrote that does the heavy lifting for you:
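The original function is not reproduced here. As a rough sketch of the idea, assuming the standard library's `xml.etree.ElementTree` as a stand-in (the `fromstring`, `.tag` and `.text` calls exist in `lxml.etree` as well), a recursive converter could look like this. The name `xml_to_dict` is an assumption, not the author's code.

```python
import xml.etree.ElementTree as ET


def xml_to_dict(xml_str):
    """Parse `xml_str` and return the root's content as nested dicts."""

    def recurse(element):
        # An element without children is a leaf: return its text.
        if len(element) == 0:
            return element.text
        # Otherwise build a dict keyed by child tag.
        # Caveat: repeated sibling tags overwrite each other in this sketch.
        return {child.tag: recurse(child) for child in element}

    return recurse(ET.fromstring(xml_str))


doc = '<?xml version="1.0" ?><person><name>john</name><age>20</age></person>'
result = xml_to_dict(doc)
```

With this sketch the root tag itself (`person`) is dropped and only its content is returned.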
The variant below preserves the parent key/element:
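Again only a sketch under the same assumptions (standard library `xml.etree.ElementTree` instead of lxml, hypothetical function name): keeping the parent key amounts to wrapping the converted content in a dict keyed by the root tag.

```python
import xml.etree.ElementTree as ET


def xml_to_dict_with_root(xml_str):
    """Like a plain recursive converter, but keeps the root tag as the outer key."""

    def recurse(element):
        if len(element) == 0:
            return element.text  # leaf: plain text (or None if empty)
        return {child.tag: recurse(child) for child in element}

    root = ET.fromstring(xml_str)
    # Wrap the converted content so the parent element is preserved.
    return {root.tag: recurse(root)}


with_root = xml_to_dict_with_root("<person><name>john</name><age>20</age></person>")
```

This gives the shape asked for in the question, apart from `age` being the string `'20'`.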
If you only want to return a subtree and convert it to a dict, you can use Element.find() to get the subtree and then convert it:
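A small sketch of the subtree case, assuming the standard library's `xml.etree.ElementTree` (`find()` behaves the same way on lxml elements for a simple tag name). The sample document and the helper name `element_to_dict` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Hypothetical helper: leaf -> text, otherwise dict of children."""
    if len(element) == 0:
        return element.text
    return {child.tag: element_to_dict(child) for child in element}


doc = ET.fromstring(
    "<root>"
    "<person><name>john</name><age>20</age></person>"
    "<meta><source>socket</source></meta>"
    "</root>"
)

# find() returns the first matching direct child, or None if there is none.
person = doc.find("person")
subtree = {person.tag: element_to_dict(person)}
```

Since `find()` returns `None` when nothing matches, it is worth checking the result before converting it.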
See the lxml docs here. I hope this helps!
The code from http://code.activestate.com/recipes/410469-xml-as-dictionary/ works well, but if the same element appears more than once at a given place in the hierarchy, each occurrence simply overwrites the previous one.
I added a shim that checks whether the element already exists before calling self.update(). If it does, the shim pops the existing entry and creates a list out of the existing and the new value. Any subsequent duplicates are appended to that list.
Not sure if this can be handled more gracefully, but it works:
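The modified recipe is not shown here, so this is only an illustration of the shim idea on a much simplified dict subclass. `XmlDict` and `_add` are hypothetical names and not part of the ActiveState recipe, and attributes are ignored.

```python
import xml.etree.ElementTree as ET


class XmlDict(dict):
    """Simplified, hypothetical stand-in for the recipe's dict subclass."""

    def __init__(self, parent):
        super().__init__()
        for child in parent:
            value = XmlDict(child) if len(child) else child.text
            self._add(child.tag, value)

    def _add(self, key, value):
        # The shim: look for an existing entry before calling update().
        if key in self:
            existing = self.pop(key)
            if not isinstance(existing, list):
                # First duplicate: start a list with the old value.
                existing = [existing]
            # This and any later duplicates are appended to the list.
            existing.append(value)
            value = existing
        self.update({key: value})


people = XmlDict(ET.fromstring(
    "<people><person>a</person><person>b</person><person>c</person><city>x</city></people>"
))
```

Tags that occur once stay plain values, so callers have to be prepared for either a single value or a list under the same key.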
xmltodict (full disclosure: I wrote it) does exactly that:
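A minimal usage sketch, assuming xmltodict is installed (`pip install xmltodict`). `xmltodict.parse()` is the library's parsing function; the wrapper name `xml_to_dict` is only for illustration, and the import sits inside it so the snippet loads even where the package is missing.

```python
def xml_to_dict(xml_str):
    """Thin, hypothetical wrapper around xmltodict.parse()."""
    import xmltodict  # third-party package: pip install xmltodict

    return xmltodict.parse(xml_str)


# Usage (requires xmltodict):
# xml_to_dict("<person><name>john</name><age>20</age></person>")
```

Text content is returned as strings, so `age` would come back as `'20'`, not the integer `20`.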
@dibrovsd: The solution will not work if the XML has more than one tag with the same name.
Following your line of thought, I have modified the code a bit and written it for a general node instead of only the root:
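The modified code itself is missing here. The sketch below only illustrates the two points made above, under the assumption of the standard library's `xml.etree.ElementTree`: it accepts any node rather than just the root, and it keeps every same-named sibling. `node_to_dict` is a hypothetical name, and attributes are ignored.

```python
import xml.etree.ElementTree as ET


def node_to_dict(node):
    """Return {tag: content} for any element, not just the document root."""
    children = list(node)
    if not children:
        return {node.tag: node.text}
    content = {}
    for child in children:
        value = node_to_dict(child)[child.tag]
        # setdefault keeps every same-named sibling instead of overwriting.
        content.setdefault(child.tag, []).append(value)
    # Unwrap single-item lists so tags that occur once stay plain values.
    return {node.tag: {k: v[0] if len(v) == 1 else v for k, v in content.items()}}


root = ET.fromstring(
    "<people>"
    "<person><name>john</name></person>"
    "<person><name>jane</name></person>"
    "</people>"
)
whole = node_to_dict(root)
first = node_to_dict(root.find("person"))
```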
Starting from @K3---rnc's response (the best one for me), I've made a small modification to get an OrderedDict from an XML text (sometimes order matters):
Following @K3---rnc's example, you can use it like this:
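Neither the modified function nor its usage is included here, so this is just a sketch of how an OrderedDict-based converter and a call to it might look. It assumes the standard library's `xml.etree.ElementTree`, ignores attributes, and uses the hypothetical name `etree_to_ordered_dict`; it is not @K3---rnc's code.

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET


def etree_to_ordered_dict(t):
    """Convert element `t` to nested OrderedDicts, keeping document order."""
    children = list(t)
    if not children:
        text = t.text.strip() if t.text else None
        return OrderedDict([(t.tag, text or None)])
    content = OrderedDict()
    for child in children:
        value = etree_to_ordered_dict(child)[child.tag]
        if child.tag in content:
            # Repeated tags become a list at the position of the first one.
            if not isinstance(content[child.tag], list):
                content[child.tag] = [content[child.tag]]
            content[child.tag].append(value)
        else:
            content[child.tag] = value
    return OrderedDict([(t.tag, content)])


# Usage: parse the XML text, then convert the root element.
root = ET.fromstring("<person><name>john</name><age>20</age></person>")
ordered = etree_to_ordered_dict(root)
```

Plain dicts also keep insertion order on Python 3.7+, but an OrderedDict makes the intent explicit and compares order-sensitively against other OrderedDicts.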
Hope it helps ;)