lxml not adding newlines when inserting a new elem

Posted 2019-03-25 19:59

Question:

I have a large set of existing XML files, and I am trying to add one element to each of them. They are pom.xml files for a number of Maven projects, and the element I want to add is a parent element. The following is my exact code.

The problem is that in the final XML output in pom2.xml, the entire parent element ends up on a single line. When I print the element by itself, however, it is written out across multiple lines as expected. How do I write out the complete XML with proper formatting for the parent element?

from lxml import etree

parentPom = etree.Element('parent')
groupId = etree.Element('groupId')
groupId.text = 'org.myorg'
parentPom.append(groupId)

artifactId = etree.Element('artifactId')
artifactId.text = 'myorg-master-pom'
parentPom.append(artifactId)

version = etree.Element('version')
version.text = '1.0.0'
parentPom.append(version)

# tostring() returns bytes, so decode before printing (Python 3)
print(etree.tostring(parentPom, pretty_print=True).decode())

pom = etree.parse("pom.xml")
projectElement = pom.getroot()
projectElement.insert(0, parentPom)

# Write the modified tree; the with-block closes the file automatically
with open("pom2.xml", 'wb') as f:
    f.write(etree.tostring(projectElement, pretty_print=True))

Output of print:

<parent>
  <groupId>org.myorg</groupId>
  <artifactId>myorg-master-pom</artifactId>
  <version>1.0.0</version>
</parent>

Output of same element in pom2.xml:

<parent><groupId>org.myorg</groupId><artifactId>myorg-master-pom</artifactId><version>1.0.0</version></parent><modelVersion>4.0.0</modelVersion>

Answer 1:

This might be of interest to you:

http://lxml.de/FAQ.html#why-doesn-t-the-pretty-print-option-reformat-my-xml-output

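In short, for future reference: parse the file with a parser that discards whitespace-only text nodes, so that pretty_print can re-indent the whole tree: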

parser = etree.XMLParser(remove_blank_text=True)
pom = etree.parse("pom.xml", parser)
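
To see the effect end to end, here is a minimal sketch that runs the same insertion twice, once with the default parser behaviour and once with remove_blank_text=True. It parses an inline string instead of a file so that it is self-contained. The POM_XML sample and the helper names build_parent and insert_parent are made up for illustration and are not part of the original question.

```python
from lxml import etree

# Hypothetical stand-in for an already-indented pom.xml on disk.
POM_XML = b"""<project>
  <modelVersion>4.0.0</modelVersion>
  <artifactId>demo</artifactId>
</project>
"""


def build_parent():
    """Build the <parent> element from the question."""
    parent = etree.Element("parent")
    for tag, text in (("groupId", "org.myorg"),
                      ("artifactId", "myorg-master-pom"),
                      ("version", "1.0.0")):
        etree.SubElement(parent, tag).text = text
    return parent


def insert_parent(xml_bytes, strip_blank_text):
    """Parse, insert <parent> as the first child, and serialize."""
    parser = etree.XMLParser(remove_blank_text=strip_blank_text)
    root = etree.fromstring(xml_bytes, parser)
    root.insert(0, build_parent())
    return etree.tostring(root, pretty_print=True).decode("utf-8")


# Existing whitespace text nodes are kept, so the serializer does not
# re-indent and the new element stays on one line.
without_fix = insert_parent(POM_XML, strip_blank_text=False)

# Whitespace-only text nodes are dropped at parse time, so pretty_print
# is free to indent every element, including the inserted one.
with_fix = insert_parent(POM_XML, strip_blank_text=True)

print(without_fix)
print(with_fix)
```

The trade-off is that the original whitespace of the file is discarded and the whole document is re-indented by the serializer, which is usually acceptable for pom.xml files but may produce a larger diff than expected.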


Tags: python lxml