Having trouble determining if element exists

Posted 2019-08-02 11:19

Question:

I have an XML document full of nested item nodes. In most cases, each item has a name element. I want to check whether an item has a name element, and return a default name if it doesn't.

<item>
  <name>Item 1</name>
</item>
<item>
    <items>
        <item>
          <name>Child Item 1</name>
        </item>
        <item>
          <name>Child Item 2</name>
        </item>
    </items>
</item>

When I call node.at('name') on an item that has no name element of its own, it returns the first name element from the children further down the tree. In the case above, calling at('name') on the second item returns "Child Item 1".

Answer 1:

The problem is that you're using at(), which accepts either a CSS selector or an XPath expression and tries to guess which one you gave it. In this case it treats name as a CSS selector, and a CSS selector run from a node matches descendants at any depth, so it selects name elements anywhere below the current node.

Instead, you want to use an XPath expression to find only child <name> elements. You can do this either by making it clearly an XPath expression:

node.at('./name')

or you can do it by using the at_xpath method to be clear:

node.at_xpath('name')

Here's a simple working example:

require 'nokogiri'
doc = Nokogiri.XML '<r>
  <item id="a">
    <name>Item 1</name>
  </item>
  <item id="b">
      <items>
          <item id="c">
            <name>Child Item 1</name>
          </item>
          <item id="d">
            <name>Child Item 2</name>
          </item>
      </items>
  </item>
</r>'

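# Visit every <item> at any depth, but only look at its direct children.
doc.css('item').each do |item|
  name = item.at_xpath('name')          # nil when there is no child <name>
  name = name ? name.text : "DEFAULT"   # fall back to a default label
  puts "#{item['id']} -- #{name}"
end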

#=> a -- Item 1
#=> b -- DEFAULT
#=> c -- Child Item 1
#=> d -- Child Item 2