XML Declaration standalone="yes" lxml

Posted 2019-01-20 09:08

Question:

I am parsing an XML file, making some changes, and saving the result to a new file. The original has the declaration <?xml version="1.0" encoding="utf-8" standalone="yes"?>, which I would like to keep. When I save the new file, I lose the standalone="yes" part. How can I keep it? Here is my code:

# Bytes literal: on Python 3, lxml rejects a str that carries an encoding declaration.
templateXml = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<package>
  <provider>Some Data</provider>
  <studio_display_name>Some Other Data</studio_display_name>
</package>"""

from lxml import etree
tree = etree.fromstring(templateXml)

xmlFileOut = '/Users/User1/Desktop/Python/Done.xml'

# tostring() returns bytes when an encoding is given, so open the file in binary mode.
with open(xmlFileOut, "wb") as f:
    f.write(etree.tostring(tree, pretty_print=True, xml_declaration=True, encoding='UTF-8'))

Answer 1:

You can pass the standalone keyword argument to tostring(). lxml treats the value as a boolean, so True is the clearest way to get standalone='yes':

etree.tostring(tree, pretty_print=True, xml_declaration=True, encoding='UTF-8', standalone=True)
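To put this answer in context, here is a minimal end-to-end sketch of the question's workflow with the standalone argument added. The temporary output path and the edit to the provider element are assumptions made for illustration; they are not part of the original post.

```python
import os
import tempfile

from lxml import etree

# Bytes input, so the encoding declaration is accepted on Python 3.
template_xml = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<package>
  <provider>Some Data</provider>
  <studio_display_name>Some Other Data</studio_display_name>
</package>"""

root = etree.fromstring(template_xml)
root.find("provider").text = "Changed Data"  # stand-in for "making some changes"

# Hypothetical output location; substitute the real path.
out_path = os.path.join(tempfile.mkdtemp(), "Done.xml")

# tostring() returns bytes when an encoding is given, hence mode "wb".
with open(out_path, "wb") as f:
    f.write(etree.tostring(root, pretty_print=True, xml_declaration=True,
                           encoding="UTF-8", standalone=True))

# Read the declaration line back to check it.
with open(out_path, "rb") as f:
    first_line = f.readline().rstrip()
print(first_line.decode("utf-8"))
```

Note that lxml writes the declaration with single quotes, which is equivalent XML.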


Answer 2:

Read the value back from tree.docinfo.standalone and pass it to tostring().

Try the following:

from lxml import etree
tree = etree.fromstring(templateXml).getroottree() # NOTE: .getroottree()

xmlFileOut = '/Users/User1/Desktop/Python/Done.xml'

# tostring() returns bytes here, so write in binary mode
with open(xmlFileOut, "wb") as f:
    f.write(etree.tostring(tree, pretty_print=True, xml_declaration=True,
                           encoding=tree.docinfo.encoding,
                           standalone=tree.docinfo.standalone))
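As a variation on the same idea, the tree's own write() method also accepts a standalone argument, which avoids the separate tostring() call. The sketch below is only an illustration: it uses an in-memory buffer as a stand-in for the output file, and the shortened source document is an assumption.

```python
import io

from lxml import etree

source = (b'<?xml version="1.0" encoding="utf-8" standalone="yes"?>\n'
          b'<package><provider>Some Data</provider></package>')

# parse() returns an ElementTree directly, so .getroottree() is not needed.
tree = etree.parse(io.BytesIO(source))
info = tree.docinfo  # info.standalone is True, False, or None if the attribute was absent

buf = io.BytesIO()  # stand-in for a file opened with mode "wb"
tree.write(buf, xml_declaration=True,
           encoding=info.encoding, standalone=info.standalone)
result = buf.getvalue()
print(result.decode("utf-8"))
```

Because the value comes from docinfo, a source document without a standalone attribute is written back without one.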


Answer 3:

If you want standalone='no' to appear in your XML header, you have to pass False rather than the string 'no'. Like this:

etree.tostring(tree, pretty_print = True, xml_declaration = True, encoding='UTF-8', standalone=False)

Otherwise, the non-empty string 'no' counts as true and the header will say standalone='yes'.
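The behaviour described here can be checked with a short sketch. The declaration() helper is a hypothetical name introduced only for this illustration; it returns just the XML declaration for a given standalone value.

```python
from lxml import etree

root = etree.Element("package")

def declaration(standalone):
    """Return only the XML declaration produced for the given standalone value."""
    out = etree.tostring(root, xml_declaration=True,
                         encoding="UTF-8", standalone=standalone)
    return out.split(b"?>", 1)[0] + b"?>"

# Compare booleans, the misleading string 'no', and None.
for value in (True, False, "no", None):
    print(repr(value), "->", declaration(value).decode("utf-8"))
```

The string 'no' behaves like True because only its truthiness is considered, while None leaves the attribute out.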



Answer 4:

etree.tostring(tree, pretty_print = True, xml_declaration = True, encoding='UTF-8')

This adds the declaration if you're using lxml; however, I noticed that its declaration uses single quotes instead of double quotes.

You can also get the exact declaration you want by prepending a static string to the output:

xml = etree.tostring(tree, pretty_print = True, encoding='UTF-8')
# tostring() returns bytes when an encoding is given, so the prefix must be bytes too
xml = b'<?xml version="1.0" encoding="utf-8"?>\n' + xml


Answer 5:

If you want to omit the standalone attribute entirely, pass None instead of True or False. It sounds logical, but it took me some time to find and test.

etree.tostring(tree, xml_declaration = True, encoding='utf-8', standalone=None)

Or, using etree.xmlfile as a context manager for streaming serialization:

# open in binary write mode; xmlfile writes bytes
with open('/tmp/test.xml', 'wb') as f, etree.xmlfile(f, encoding='utf-8') as xf:
    xf.write_declaration(standalone=None)
    with xf.element('html'):
        with xf.element('body'):
            element = etree.Element('p')
            element.text = 'This is paragraph'
            xf.write(element)
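To compare the streaming output with and without the attribute, the same steps can be run against an in-memory buffer. This is a sketch under assumptions: render() is a hypothetical helper, and io.BytesIO stands in for a real file.

```python
import io

from lxml import etree

def render(standalone):
    """Serialize a tiny document with etree.xmlfile and return the bytes written."""
    buf = io.BytesIO()
    with etree.xmlfile(buf, encoding="utf-8") as xf:
        xf.write_declaration(standalone=standalone)
        with xf.element("html"):
            with xf.element("body"):
                p = etree.Element("p")
                p.text = "This is paragraph"
                xf.write(p)
    return buf.getvalue()

without_attr = render(None)   # declaration with no standalone attribute
with_attr = render(True)      # declaration that includes the standalone attribute
print(without_attr.decode("utf-8"))
print(with_attr.decode("utf-8"))
```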