Edit XML with Python

Posted 2019-07-30 01:35

Question:

I am trying to parse an XML file, grab the string stored in each objlocation attribute, and change the contents of that string.

These are the contents of the XML files I have:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<publish show="STATE">

    <pubgroup objtype="ELE" location="/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.xml">

        <member objidx="15283942" objlabel="anm" objlocation="/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.anm"/>

        <member objidx="15283952" objlabel="fbx" objlocation="/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001_M_WALK_None.fbx"/>

        <member objidx="15283962" objlabel="mov" objlocation="/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.mov"/>

        <member objidx="15283972" objlabel="libraryinfo" objlocation="/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.json"/>

        <member objidx="15283982" objlabel="thumbnail" objlocation="/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.mng"/>

    </pubgroup>
</publish>

I tried .firstChild and .childNodes[], but they just print out the contents of my XML file. This is one of a list of XML files I am trying to parse, all of which have roughly the same format.

I am trying to do this in a Pythonic way.

Answer 1:

You can easily modify your XML file using the ElementTree API:

from xml.etree.ElementTree import parse
doc = parse('data.xml')
root = doc.getroot()
for t in root.iterfind('pubgroup/member'):
    t.attrib['objlocation'] = "spam"

doc.write('output.xml', xml_declaration=True)

The iterfind method returns an iterator instead of a list, which is convenient if your XML file is very large.
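To make the difference between iterfind and findall concrete, here is a minimal sketch. It assumes a shortened inline sample (the /tmp/example paths are hypothetical stand-ins for data.xml) so that it runs without any file on disk.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for data.xml; the paths are hypothetical.
SAMPLE = """<publish show="STATE">
    <pubgroup objtype="ELE" location="/tmp/example.xml">
        <member objidx="1" objlabel="anm" objlocation="/tmp/example.anm"/>
        <member objidx="2" objlabel="fbx" objlocation="/tmp/example.fbx"/>
    </pubgroup>
</publish>"""

root = ET.fromstring(SAMPLE)

lazy = root.iterfind('pubgroup/member')   # iterator: matches are produced on demand
eager = root.findall('pubgroup/member')   # list: all matches are collected up front

first = next(lazy)                        # pulls only the first match
labels = [m.get('objlabel') for m in eager]
print(isinstance(eager, list), first.get('objlabel'), labels)
```

Note that parse (or fromstring) still loads the whole tree into memory either way; iterfind only avoids building the intermediate list of matches.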

Output

<?xml version='1.0' encoding='us-ascii'?>
<publish show="STATE">

    <pubgroup location="/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.xml" objtype="ELE">

        <member objidx="15283942" objlabel="anm" objlocation="spam" />

        <member objidx="15283952" objlabel="fbx" objlocation="spam" />

        <member objidx="15283962" objlabel="mov" objlocation="spam" />

        <member objidx="15283972" objlabel="libraryinfo" objlocation="spam" />

        <member objidx="15283982" objlabel="thumbnail" objlocation="spam" />

    </pubgroup>
</publish>

Here, spam is the new value of objlocation.
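The output above declares encoding='us-ascii', whereas the original file declared UTF-8. If that matters, one option is to pass encoding explicitly to write. The sketch below assumes a shortened inline sample with hypothetical paths and writes to an in-memory buffer instead of output.xml so that it is self-contained.

```python
import io
import xml.etree.ElementTree as ET

# Shortened stand-in for data.xml; the paths are hypothetical.
SAMPLE = """<publish show="STATE">
    <pubgroup objtype="ELE" location="/tmp/example.xml">
        <member objidx="1" objlabel="anm" objlocation="/tmp/example.anm"/>
        <member objidx="2" objlabel="fbx" objlocation="/tmp/example.fbx"/>
    </pubgroup>
</publish>"""

tree = ET.ElementTree(ET.fromstring(SAMPLE))
for member in tree.getroot().iterfind('pubgroup/member'):
    member.set('objlocation', 'spam')     # same effect as member.attrib[...] = 'spam'

# Write to an in-memory buffer; a file name such as 'output.xml' works the same way.
buffer = io.BytesIO()
tree.write(buffer, encoding='UTF-8', xml_declaration=True)
result = buffer.getvalue().decode('UTF-8')
print(result.splitlines()[0])             # the XML declaration line
```

ElementTree does not carry over standalone="yes" from the original declaration, so that part would have to be handled separately if it is required.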



Answer 2:

The shortest code I can suggest:

from xml.etree.ElementTree import ElementTree
tree = ElementTree()
root = tree.parse('test.txt') # root represents <publish> tag

for member in root.findall('pubgroup/member'):
    print(member.attrib['objlocation'])

Output:

/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.anm
/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001_M_WALK_None.fbx
/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.mov
/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.json
/user_data/STATE/ITEM/character/ANM/ANM_rig_WALK_sg_v001/ANM_rig_WALK_sg_v001.mng

To make changes:

for member in root.findall('pubgroup/member'):
    member.attrib['objlocation'] = 'changed'
tree.write('output.txt')
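Since the question asks about changing the contents of the string rather than replacing it wholesale, here is a minimal sketch that rewrites only a leading path prefix. The relocate helper, the '/archive/' target, and the shortened sample paths are all hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's file; the file names are hypothetical.
SAMPLE = """<publish show="STATE">
    <pubgroup objtype="ELE" location="/user_data/STATE/ITEM/example.xml">
        <member objidx="1" objlabel="anm" objlocation="/user_data/STATE/ITEM/example.anm"/>
        <member objidx="2" objlabel="mov" objlocation="/user_data/STATE/ITEM/example.mov"/>
    </pubgroup>
</publish>"""

OLD_PREFIX = '/user_data/'   # prefix that appears in the question's paths
NEW_PREFIX = '/archive/'     # hypothetical replacement prefix

def relocate(root, old, new):
    """Swap a leading path prefix on every member's objlocation.

    Returns the number of attributes that were changed.
    """
    changed = 0
    for member in root.iterfind('pubgroup/member'):
        location = member.get('objlocation', '')
        if location.startswith(old):
            member.set('objlocation', new + location[len(old):])
            changed += 1
    return changed

root = ET.fromstring(SAMPLE)
count = relocate(root, OLD_PREFIX, NEW_PREFIX)
print(count, [m.get('objlocation') for m in root.iter('member')])
```

Checking startswith before rewriting leaves any path outside the old prefix untouched, which str.replace alone would not guarantee if the prefix also appeared later in a path.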