Python: Convert XML to CSV file

Posted 2020-05-24 05:32

I have an XML file like this:

<hierachy>
    <att>
        <Order>1</Order>
        <attval>Data</attval>
        <children>
            <att>
                <Order>1</Order>
                <attval>Studyval</attval>
            </att>
            <att>
                <Order>2</Order>
                <attval>Site</attval>
            </att>
        </children>
    </att>
    <att>
        <Order>2</Order>
        <attval>Info</attval>
        <children>
            <att>
                <Order>1</Order>
                <attval>age</attval>
            </att>
            <att>
                <Order>2</Order>
                <attval>gender</attval>
            </att>
        </children>
    </att>
</hierachy>

I'm trying to convert it to a CSV file like this:

Data,Studyval
Data,Site
Info,age
Info,gender

My problem is that the parent and child elements use the same tag names, 'att' and 'attval'. How do I tell Python to distinguish between the two and give me the output above?

I tried this:

import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
rebase = tree.getroot()

for att in rebase.findall('att'):
    name = att.find('attval').text
    for each_att in att.findall('attval'):
        try:
            val = att.find('attval').text
            print(name, val)
        except AttributeError:
            print(name)

and it printed the same things twice.

1 answer
时光不老,我们不散
Reply #2 · 2020-05-24 06:06

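findall('att') is not the problem here: it only matches direct children of the element it is called on. The duplicate comes from the inner loop, which looks up attval on the same att element again instead of descending into its children element. Just iterate the tree from top to bottom and grab the relevant elements at each level.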

from xml.etree import ElementTree
tree = ElementTree.parse('input.xml')
root = tree.getroot()

for att in root:
    first = att.find('attval').text
    for subatt in att.find('children'):
        second = subatt.find('attval').text
        print('{},{}'.format(first, second))

Which gives:

$ python process.py 
Data,Studyval
Data,Site
Info,age
Info,gender
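
The loop above only prints the rows. If you want an actual CSV file, one possible sketch is to feed the same pairs to the standard csv module. The names iter_rows, write_csv and SAMPLE are made up for this example, and the XML is inlined so the snippet runs on its own. It uses the relative path 'children/att' instead of att.find('children'), which I am assuming is acceptable here because findall returns an empty list when an att has no children element.

```python
import csv
import io
from xml.etree import ElementTree

# Inlined copy of the question's XML (hypothetical stand-in for input.xml).
SAMPLE = """<hierachy>
    <att>
        <Order>1</Order>
        <attval>Data</attval>
        <children>
            <att><Order>1</Order><attval>Studyval</attval></att>
            <att><Order>2</Order><attval>Site</attval></att>
        </children>
    </att>
    <att>
        <Order>2</Order>
        <attval>Info</attval>
        <children>
            <att><Order>1</Order><attval>age</attval></att>
            <att><Order>2</Order><attval>gender</attval></att>
        </children>
    </att>
</hierachy>"""


def iter_rows(root):
    """Yield (parent, child) attval pairs, one per nested att."""
    for att in root.findall('att'):  # direct children of the root only
        parent = att.findtext('attval')
        # Relative path: att elements inside this att's <children>.
        # findall returns [] if <children> is missing, so no None check.
        for subatt in att.findall('children/att'):
            yield parent, subatt.findtext('attval')


def write_csv(root, out):
    """Write the pairs to an already-open text file object."""
    writer = csv.writer(out, lineterminator='\n')
    writer.writerows(iter_rows(root))


root = ElementTree.fromstring(SAMPLE)
buffer = io.StringIO()
write_csv(root, buffer)
print(buffer.getvalue(), end='')
```

To write a real file instead of an in-memory buffer, open it with newline='' as the csv documentation recommends, for example: with open('output.csv', 'w', newline='') as f: write_csv(ElementTree.parse('input.xml').getroot(), f). The file name output.csv is just a placeholder.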