How to recursively iterate over XML tags in Python

I am trying to iterate over all nodes in a tree using ElementTree.

I do something like:

  import xml.etree.ElementTree as ET

  tree = ET.parse("/tmp/test.xml")
  root = tree.getroot()

  for child in root:
      pass  # do something with child

The problem is that child is an Element object, not an ElementTree object, so I can't look further into it and recurse over its elements. Is there a different way to iterate over "root" so that it visits the top-level nodes in the tree (immediate children) and returns objects of the same class as root itself?

Tags: python xml

4 answers

戒情不戒烟

#2 · 2020-02-08 06:40

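To iterate over all nodes, use the iter() method instead of a plain for loop over the root Element. Looping over an Element directly yields only its immediate children, while iter() visits every element in the tree in document order.

The root is an Element, just like every other element in the tree, so each child can be iterated in the same way as root itself. An ElementTree is a wrapper around the root Element: its iter() starts at the root and therefore covers all Elements. Element.iter() does the same for the element it is called on and everything below it.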

For example, given this XML:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

you can do the following:

>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('test.xml')
>>> for elem in tree.iter():
...     print(elem)
... 
<Element 'data' at 0x10b2d7b50>
<Element 'country' at 0x10b2d7b90>
<Element 'rank' at 0x10b2d7bd0>
<Element 'year' at 0x10b2d7c50>
<Element 'gdppc' at 0x10b2d7d10>
<Element 'neighbor' at 0x10b2d7e90>
<Element 'neighbor' at 0x10b2d7ed0>
<Element 'country' at 0x10b2d7f10>
<Element 'rank' at 0x10b2d7f50>
<Element 'year' at 0x10b2d7f90>
<Element 'gdppc' at 0x10b2d7fd0>
<Element 'neighbor' at 0x10b2db050>
<Element 'country' at 0x10b2db090>
<Element 'rank' at 0x10b2db0d0>
<Element 'year' at 0x10b2db110>
<Element 'gdppc' at 0x10b2db150>
<Element 'neighbor' at 0x10b2db190>
<Element 'neighbor' at 0x10b2db1d0>


别忘想泡老子

#3 · 2020-02-08 06:43

You can also access specific elements like this:

country = tree.findall('.//country')

Then loop over range(len(country)) and access each match by index (country[i]).
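The list returned by findall() can also be looped over directly, which avoids the index bookkeeping. The sketch below assumes a shortened version of the sample document from the earlier answer; the variable names countries and ranks are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shortened from the document in the earlier answer.
XML_TEXT = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
    <country name="Panama"><rank>68</rank></country>
</data>"""

root = ET.fromstring(XML_TEXT)

# './/country' matches every <country> element anywhere below root.
countries = root.findall(".//country")

ranks = {}
for country in countries:  # each item is an Element, so no indexing is needed
    name = country.get("name")                   # read the 'name' attribute
    ranks[name] = int(country.findtext("rank"))  # text of the <rank> child
    print(name, ranks[name])
```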


一纸荒年 Trace。

#4 · 2020-02-08 06:44

Adding to Robert Christie's answer, it is possible to iterate over all nodes of a document parsed with fromstring() by wrapping the resulting Element in an ElementTree:

import xml.etree.ElementTree as ET

# xml_string is assumed to hold the XML document as a str
e = ET.ElementTree(ET.fromstring(xml_string))
for elt in e.iter():
    print("%s: '%s'" % (elt.tag, elt.text))
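As a side note, the wrapping step looks optional: the Element returned by fromstring() has its own iter() method, which also accepts a tag name to filter on. A minimal sketch, assuming a small made-up xml_string:

```python
import xml.etree.ElementTree as ET

# Hypothetical input; any well-formed XML string would do.
xml_string = """<data>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""

root = ET.fromstring(xml_string)  # an Element, not an ElementTree

# Element.iter() yields root itself and then every descendant.
all_tags = [elt.tag for elt in root.iter()]

# Passing a tag name restricts the iteration to matching elements.
neighbor_names = [elt.get("name") for elt in root.iter("neighbor")]

print(all_tags)
print(neighbor_names)
```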


Anthone

#5 · 2020-02-08 06:44

In addition to Robert Christie's accepted answer, printing the values and tags separately is very easy:

import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
for elem in tree.iter():
    print(elem.tag, elem.text)  # tag name and text content of each element
