Why doesn't Nokogiri xpath like xmlns declarat

Posted 2019-01-22 12:27

I'm using Nokogiri::XML to parse responses from Amazon SimpleDB. The response is something like:

<SelectResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
  <SelectResult>
    <Item>
      <Attribute><Name>Foo</Name><Value>42</Value></Attribute>
      <Attribute><Name>Bar</Name><Value>XYZ</Value></Attribute>
    </Item>
  </SelectResult>
</SelectResponse>

If I just hand the response straight over to Nokogiri, all XPath queries (e.g. doc/"//Item/Attribute[Name='Foo']/Value") return an empty array. But if I remove the xmlns attribute from the SelectResponse tag, it works perfectly.

Is there some extra thing I need to do to account for the namespace declaration? This workaround feels horribly like a hack.

2 answers
Lonely孤独者°
#2 · 2019-01-22 12:54

That XPath query looks for elements that are not in any namespace. You need to tell your XPath processor that you are looking for elements in namespace http://sdb.amazonaws.com/doc/2007-11-07/

One way to do that with Nokogiri is this:

doc = Nokogiri::XML.parse(...)
doc.xpath("//aws:Item/aws:Attribute[aws:Name='Foo']/aws:Value", {"aws" => "http://sdb.amazonaws.com/doc/2007-11-07/"})
(This account has been banned)
#3 · 2019-01-22 12:55

I found this really helpful in understanding what's going on: http://tenderlovemaking.com/2009/04/23/namespaces-in-xml.html

Basically, if the document declares a default namespace (via xmlns=), every unprefixed element in it belongs to that namespace, so you must use a namespace in your xpath searches.

So in your case, you could do one of three things:

  • Remove the xmlns attribute from the root SelectResponse. In that case your original, namespace-less xpath query will work.
  • Use the default namespace in your xpath query: doc/"//xmlns:Item/xmlns:Attribute[xmlns:Name='Foo']/xmlns:Value"
  • Define a custom namespace in the second argument of the xpath method call and use that in your query, as shown in hrnt's solution above