Generating xml in python and lxml

2019-02-01 02:43发布

问题:

I have this xml from sql, and I want to do the same by python 2.7 and lxml

<?xml version="1.0" encoding="utf-16"?>
<results>
  <Country name="Germany" Code="DE" Storage="Basic" Status="Fresh" Type="Photo" />
</results>

Now I have:

from lxml import etree

# create XML 
results= etree.Element('results')

country= etree.Element('country')
country.text = 'Germany'
root.append(country)



filename = "xmltestthing.xml"
FILE = open(filename,"w")
FILE.writelines(etree.tostring(root, pretty_print=True))
FILE.close()

Do you know how to add rest of attributes?

回答1:

Note this also prints the BOM

>>> from lxml.etree import tostring
>>> from lxml.builder import E
>>> print tostring(
             E.results(
                 E.Country(name='Germany',
                           Code='DE',
                           Storage='Basic',
                           Status='Fresh',
                           Type='Photo')
             ), pretty_print=True, xml_declaration=True, encoding='UTF-16')

��<?xml version='1.0' encoding='UTF-16'?>
<results>
  <Country Status="Fresh" Type="Photo" Code="DE" Storage="Basic" name="Germany"/>
</results>


回答2:

from lxml import etree

# Create the root element
page = etree.Element('results')

# Make a new document tree
doc = etree.ElementTree(page)

# Add the subelements
pageElement = etree.SubElement(page, 'Country', 
                                      name='Germany',
                                      Code='DE',
                                      Storage='Basic')
# For multiple multiple attributes, use as shown above

# Save to XML file
outFile = open('output.xml', 'w')
doc.write(outFile, xml_declaration=True, encoding='utf-16') 


回答3:

Promoting my comment to an answer:

@sukbir is probably not using Windows. What happens is that lxml writes a newline (0A 00 in UTF-16LE) between the XML header and the body. This is then molested by Win text mode to become 0D 0A 00 which makes everything after that look like UTF-16BE hence the Chinese etc characters when you display it. You can get around this in this instance by using "wb" instead of "w" when you open the file. However I'd strongly suggest that you use 'UTF-8' (spelled EXACTLY like that) as your encoding. Why are you using UTF-16? You like large files and/or weird problems?



回答4:

Save to XML file

doc.write('output.xml', xml_declaration=True, encoding='utf-16') 

instead of:

outFile = open('output.xml', 'w')

doc.write(outFile, xml_declaration=True, encoding='utf-16') 


标签: python xml lxml