How do I remove the current node while iterating through all nodes from the root with the getiterator() function?
    import xml.etree.ElementTree as ET

    tree = ET.parse('file.xml')
    root = tree.getroot()
    for node in root.getiterator():
        # if some condition:
        #     remove(node)
You can't remove nodes without knowing the parent, but the xml.etree
package doesn't give you any way to access a parent from a given node.
The only way around this is matching the parent node instead:
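    for node in root.iter():
        if some_condition_matches_parent:
            # Iterate over a copy so removals don't disturb the loop.
            for child in list(node.iter()):
                # remove() only accepts direct children of node.
                if some_condition_matches_child:
                    node.remove(child)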
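As a concrete, runnable version of the pattern above, here is a minimal sketch. The XML snippet, the tag names (`group`, `item`) and the `drop` attribute are made up for illustration. It loops over `list(parent)` in the inner loop, which restricts the candidates to direct children, since that is all `remove()` accepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; tag and attribute names are illustrative only.
XML = """
<root>
  <group><item drop="yes"/><item/><item drop="yes"/></group>
  <group><item/></group>
</root>
"""

root = ET.fromstring(XML)

for parent in root.iter("group"):
    # list(parent) is a snapshot of the direct children, so removing
    # from parent while looping does not skip any siblings.
    for child in list(parent):
        if child.get("drop") == "yes":
            parent.remove(child)

# Number of child elements left in each group.
remaining = [len(group) for group in root.iter("group")]
print(remaining)
```

The two conditions from the answer map onto the `"group"` tag filter (parent) and the `drop` attribute check (child); any other predicate would slot in the same way.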
If you switch to the lxml library (which implements the same API, but with additional enhancements), you can retrieve the parent node from any given node:
node.getparent().remove(node)
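If adding lxml is not an option, a commonly used variation on the parent-matching idea is to record each element's parent ahead of time. The parent is still found by walking down from the root, just once and up front. The sketch below is only an illustration: `parent_map` is a plain dict built by hand (it is not part of the xml.etree API), and the sample XML and the `obsolete` tag are invented.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><a><obsolete/><keep/></a><b><c><obsolete/></c></b></root>"
)

# Map every element to its parent; built once, before any removal.
parent_map = {child: parent for parent in root.iter() for child in parent}

# Collect the matches first so the tree is not modified while it is traversed.
for node in list(root.iter("obsolete")):
    parent_map[node].remove(node)

left = [el.tag for el in root.iter()]
print(left)
```

The map goes stale as soon as the tree changes, so this only suits cases where all lookups happen against the structure as it was when the map was built.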
Note that while the pure-Python implementation of Element.getiterator()
returns a list object, in the C implementation of the ElementTree module (a separate import on Python 2, transparently imported on Python 3 if available) the getiterator()
method returns a live generator, so you need to copy it into a list before removing nodes inside the loop.
On top of that, the Element.getiterator()
method has been deprecated since Python 3.2 and was removed altogether in Python 3.9. I replaced its use with root.iter()
in the outer loop, and list(node.iter())
in the inner.
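To illustrate why the copy matters, here is a small sketch that tries to empty an element twice: once while iterating over the element itself and once while iterating over a list snapshot. It uses plain iteration over the element as a stand-in for any live iteration; the helper name `remove_all` and the sample XML are made up for this example.

```python
import xml.etree.ElementTree as ET


def remove_all(root, snapshot):
    """Try to remove every child of root; return the tags left behind."""
    # With snapshot=False the loop walks the element while it shrinks.
    children = list(root) if snapshot else root
    for child in children:
        root.remove(child)
    return [el.tag for el in root]


live = remove_all(ET.fromstring("<r><a/><b/><c/><d/></r>"), snapshot=False)
copied = remove_all(ET.fromstring("<r><a/><b/><c/><d/></r>"), snapshot=True)
print("live iteration left:", live)
print("snapshot left:", copied)
```

Without the snapshot, each removal shifts the remaining children and the loop skips over some of them, so the element is not emptied; with the snapshot every child is visited and removed.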