Getting text values from XML in Python

Posted 2020-03-24 05:59

Question:

from xml.dom.minidom import parseString
dom = parseString(data)
data = dom.getElementsByTagName('data')

The 'data' variable comes back as an element object, but I can't for the life of me find in the documentation how to grab the text value of the element.

For example:

<something><data>I WANT THIS</data></something>

Anyone have any ideas?

Answer 1:

This should do the trick:

from xml.dom.minidom import parseString
dom = parseString('<something><data>I WANT THIS</data></something>')
data = dom.getElementsByTagName('data')[0].childNodes[0].data

i.e. you need to wade deeper into the DOM structure to get at the text child node and then access its value.
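To make that structure visible, here is a minimal sketch that inspects the nodes involved, using the same sample string. The variable names (element, text_node, is_text, value) are only illustrative.

```python
from xml.dom.minidom import parseString

dom = parseString('<something><data>I WANT THIS</data></something>')

# getElementsByTagName returns a NodeList; take the first match.
element = dom.getElementsByTagName('data')[0]

# The element itself carries no text; its only child is a text node.
text_node = element.childNodes[0]
is_text = text_node.nodeType == text_node.TEXT_NODE
value = text_node.data

print(element.nodeName, len(element.childNodes), is_text, value)
```

The text lives on the child node, which is why reading it takes one extra step past the element.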



Answer 2:

So the way to look at it is that "I WANT THIS" is actually another node. It's a text child of "data".

from xml.dom.minidom import parseString
dom = parseString(data)
nodes = dom.getElementsByTagName('data')

At this point, "nodes" is a NodeList. In your example it has one item, the "data" element, and that element in turn has only one child: the text node "I WANT THIS".

So you could just do something like this:

print(nodes[0].firstChild.nodeValue)

Note that if your input has more than one tag called "data", you should iterate over "nodes" rather than index it directly.
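As a rough sketch of that iteration, assuming a made-up input with several "data" tags (one of them empty), something like the following should work. The helper text_of and the sample string are hypothetical, not part of minidom. Joining the text children also avoids the case where firstChild is None on an empty element.

```python
from xml.dom.minidom import parseString

# Hypothetical input with more than one <data> tag, one of them empty.
xml_text = '<something><data>first</data><data>second</data><data/></something>'
dom = parseString(xml_text)

def text_of(element):
    # Join every direct text child; an empty element yields ''.
    return ''.join(child.data for child in element.childNodes
                   if child.nodeType == child.TEXT_NODE)

# Iterate over the NodeList instead of indexing a single item.
values = [text_of(node) for node in dom.getElementsByTagName('data')]
print(values)
```

On an empty element, nodes[i].firstChild.nodeValue would raise an AttributeError, so collecting the text children is a bit more forgiving.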