Getting text values from XML in Python

Posted 2020-03-24 05:38

from xml.dom.minidom import parseString
dom = parseString(data)
data = dom.getElementsByTagName('data')

The 'data' variable comes back as an element object, but I can't for the life of me find in the documentation how to grab the text value of the element.

For example:

<something><data>I WANT THIS</data></something>

Anyone have any ideas?

2 answers
骚年 ilove
#2 · 2020-03-24 05:46

This should do the trick:

from xml.dom.minidom import parseString
dom = parseString('<something><data>I WANT THIS</data></something>')
data = dom.getElementsByTagName('data')[0].childNodes[0].data

i.e. you need to wade deeper into the DOM structure to get at the text child node and then access its value.
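To make the "wade deeper" step concrete, here is a minimal sketch that walks the same one-line document one level at a time and checks what kind of node sits at each level. The variable names `element` and `text_node` are only illustrative.

```python
from xml.dom.minidom import parseString

dom = parseString('<something><data>I WANT THIS</data></something>')

# getElementsByTagName returns a NodeList; [0] picks the first <data> element.
element = dom.getElementsByTagName('data')[0]

# The text between the tags is not stored on the element itself.
# It lives in a separate child node of type TEXT_NODE.
text_node = element.childNodes[0]

print(element.nodeType == element.ELEMENT_NODE)
print(text_node.nodeType == text_node.TEXT_NODE)
print(text_node.data)
```

On a text node, `.data` and `.nodeValue` hold the same string, so either attribute can be used here.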

Rolldiameter
#3 · 2020-03-24 05:57

So the way to look at it is that "I WANT THIS" is actually another node. It's a text child of "data".

from xml.dom.minidom import parseString
dom = parseString(data)
nodes = dom.getElementsByTagName('data')

At this point, "nodes" is a NodeList and in your example, it has one item in it which is the "data" element. Correspondingly the "data" element also only has one child which is a text node "I WANT THIS".

So you could just do something like this:

print(nodes[0].firstChild.nodeValue)

Note that in the case where you have more than one tag called "data" in your input, you should use some sort of iteration technique on "nodes" rather than index it directly.
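One way to do that iteration is sketched below. The sample input `xml_text` and the helper `element_text` are made up for illustration and are not part of minidom. The helper joins an element's direct text children, so an empty `<data/>` element (where `firstChild` would be `None`) gives an empty string instead of raising an error.

```python
from xml.dom.minidom import parseString

# Hypothetical sample input with several <data> elements, one of them empty.
xml_text = '<something><data>FIRST</data><data>SECOND</data><data/></something>'

dom = parseString(xml_text)


def element_text(element):
    """Concatenate the values of the element's direct text-node children."""
    return ''.join(
        child.nodeValue
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


# Loop over the NodeList rather than indexing a single item.
texts = [element_text(node) for node in dom.getElementsByTagName('data')]
print(texts)
```

The helper only looks at direct children; text inside nested elements or CDATA sections would need extra handling.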
