Python: lxml.etree.tostring(with_comments=False)

Posted 2020-07-13 08:40

Question:

I call the following command and get the following error:

>>> lxml.etree.tostring(tree.getroot(), with_comments=False)
ValueError: Can only discard comments in C14N serialisation

I don't know what C14N is, but I would appreciate an explanation of how I can achieve it and run the foregoing command with with_comments=False. (Yes, I'm aware that I can strip the comments using regex. Please don't offer regular expressions as a solution.)

Background: I want to transfer my XML document over an HTTP connection. I'm using the lxml Python library, running on Python 2.7.1.

Answer 1:

The lxml.etree.tostring doc says:

The exclusive and with_comments arguments are only used with C14N output, where they request exclusive and uncommented C14N serialisation respectively.

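That parameter is only valid when using method='c14n'. C14N is short for Canonical XML, a W3C-standardised normalised form of an XML document. If you simply omit with_comments, the default serialisation keeps whatever comments are in the tree. To drop them at serialisation time, pass method='c14n' together with with_comments=False; keep in mind that canonical output also normalises other details, such as attribute order, and carries no XML declaration. That said, the XML parser on the receiving end should ignore comments anyway, so unless there's a bandwidth concern or you have a specific problem with them, I wouldn't worry about it.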
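To make the difference concrete, here is a minimal sketch comparing the default serialisation with C14N output. The sample document and the variable names are invented for illustration, and it assumes an lxml version whose tostring() accepts method='c14n'.

```python
from lxml import etree

# Hypothetical sample document containing one comment.
xml = b"<root><!-- internal note --><item>value</item></root>"
root = etree.fromstring(xml)

# Default serialisation: comments in the tree are written out.
default_bytes = etree.tostring(root)

# Canonical XML (C14N) serialisation with comments discarded.
c14n_bytes = etree.tostring(root, method="c14n", with_comments=False)

print(default_bytes)
print(c14n_bytes)
```

The canonical form is meant for byte-for-byte comparison and signing, so it may differ from the default output in more than just the missing comments.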



Answer 2:

You can remove the comments at parsing time:

from lxml import etree
parser = etree.XMLParser(remove_comments=True)  # drop comments while parsing
tree = etree.parse(xmlfile, parser=parser)  # xmlfile: a path or file object

Or, when using objectify (it took me a hell of a time to find this out):

from lxml import objectify
parser = objectify.makeparser(remove_comments=True)  # same option, objectify parser
tree = objectify.parse(xmlfile, parser=parser)  # xmlfile: a path or file object
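For a quick check without a file on disk, the parse-time approach can be sketched with an in-memory document. The sample XML and the variable names here are made up for illustration; only XMLParser(remove_comments=True) comes from the answer above.

```python
from lxml import etree

# Hypothetical sample document containing one comment.
xml = b"<root><!-- internal note --><item>value</item></root>"

# The parser discards comments, so they never enter the tree.
parser = etree.XMLParser(remove_comments=True)
root = etree.fromstring(xml, parser)

# A plain tostring() call now has no comments to emit.
serialized = etree.tostring(root)
print(serialized)
```

Because the comments are gone from the tree itself, no special serialisation method is needed afterwards.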