I have this XML file and I want to get the country nodes which have the pattern 'in' in their name.
<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdppc>141100</gdppc>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
<gdppc>59900</gdppc>
<neighbor name="Malaysia" direction="N"/>
</country>
<country name="Panama">
<rank>68</rank>
<year>2011</year>
<gdppc>13600</gdppc>
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
</data>
I have tried this:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
list=root.find(".//country[contains(@name, 'Pana')]")
But I am getting an error: SyntaxError: invalid predicate
Could anyone please help how to solve this?
I cannot comment on why your original code does not work, but it has nothing to do with the XPath expression. The expression is fine, except for the leading '.', which you can safely omit.
Any reason you are not using the lxml xpath() method?
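As a rough sketch of what the lxml route could look like: the snippet below assumes the third-party lxml package is installed and uses a trimmed-down inline copy of the question's XML (only the name attributes are kept) instead of reading test.xml.

```python
from lxml import etree  # third-party package: pip install lxml

# Trimmed-down copy of the XML from the question (assumption: only the
# name attributes matter for this query).
XML = b"""<?xml version="1.0"?>
<data>
  <country name="Liechtenstein"/>
  <country name="Singapore"/>
  <country name="Panama"/>
</data>"""

root = etree.fromstring(XML)

# lxml implements full XPath 1.0, so contains() is available.
# xpath() returns a list of matching elements.
matches = root.xpath(".//country[contains(@name, 'Pana')]")
names = [country.get("name") for country in matches]
print(names)  # ['Panama']

# The same query with the 'in' pattern mentioned in the question.
in_names = [c.get("name") for c in root.xpath(".//country[contains(@name, 'in')]")]
print(in_names)  # ['Liechtenstein', 'Singapore']
```

Note that xpath() always hands back a list here, so take the first item if a single element is expected.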
With lxml, the same expression gives back the matching country element.
The XML parser you are using does not support contains(). You will need to use a different parser for full XPath support: https://docs.python.org/2/library/xml.etree.elementtree.html#elementtree-xpath
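To make the limitation concrete, here is a small sketch using only the standard library. It assumes a trimmed-down inline copy of the question's XML; it tries the contains() predicate, catches the resulting SyntaxError, and then runs an exact attribute match, which is part of the subset ElementTree does support.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the XML from the question (assumption: only the
# name attributes matter here).
XML = """<data>
  <country name="Liechtenstein"/>
  <country name="Singapore"/>
  <country name="Panama"/>
</data>"""

root = ET.fromstring(XML)

# contains() is not in ElementTree's XPath subset, so this call raises
# SyntaxError instead of returning a match.
try:
    root.find(".//country[contains(@name, 'Pana')]")
    error = None
except SyntaxError as exc:
    error = str(exc)
print("error:", error)

# An exact attribute comparison is supported.
panama = root.find(".//country[@name='Panama']")
print(panama.get("name"))  # Panama
```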
xml.etree.ElementTree provides only limited support for XPath expressions for locating elements in a tree, and that does not include the XPath contains() function. See the documentation for the list of supported XPath syntax.
You need to resort to a library that provides better XPath support, like lxml, or use a simpler XPath expression and do the further filtering manually in Python.
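One possible way to do the manual filtering with the standard library alone is sketched below. The helper name countries_with is made up for this illustration, and the XML is a trimmed-down inline copy of the one in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the XML from the question (assumption: only the
# name attributes matter here).
XML = """<data>
  <country name="Liechtenstein"/>
  <country name="Singapore"/>
  <country name="Panama"/>
</data>"""

root = ET.fromstring(XML)


def countries_with(root, fragment):
    """Return the country elements whose name attribute contains fragment.

    Hypothetical helper: select all country elements with a simple,
    supported path, then do the substring test in plain Python.
    """
    return [
        country
        for country in root.findall(".//country")
        if fragment in country.get("name", "")
    ]


matches = countries_with(root, "in")
print([country.get("name") for country in matches])  # ['Liechtenstein', 'Singapore']
```

The substring test is case-sensitive, like XPath's contains(); lowercase both sides first if a case-insensitive match is wanted.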