Nokogiri text node contents

Is there a cleaner way to get the contents of text nodes with Nokogiri? Right now I'm using

some_node.at_xpath( "//whatever" ).first.content

which seems really verbose for just getting the text.

Answer 1:

You want only the text?

doc.search('//text()').map(&:text)

Maybe you don't want all the whitespace and noise. If you want only the text nodes that contain at least one word character:

doc.search('//text()').map(&:text).delete_if{|x| x !~ /\w/}

Edit: It appears you only wanted the text content of a single node:

some_node.at_xpath( "//whatever" ).text

Answer 2:

Just look for the text nodes:

require 'nokogiri'

doc = Nokogiri::HTML(<<EOT)
<html>
<body>
<p>This is a text node </p>
<p> This is another text node</p>
</body>
</html>
EOT

doc.search('//text()').each do |t|
  t.replace(t.content.strip)
end

puts doc.to_html

Which outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
<p>This is a text node</p>
<p>This is another text node</p>
</body></html>

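By the way, your code sample doesn't work. at_xpath( "//whatever" ).first is redundant and will fail. at_xpath finds only the first occurrence and returns a single Node. first would be superfluous at that point even if it worked, but it won't, because Node doesn't have a first method.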

I have <data><foo>bar</foo></data>, how do I get at the "bar" text without doing doc.at_xpath( "//data/foo" ).children.first.content?

Assuming doc contains the parsed DOM:

doc.to_xml # => "<?xml version=\"1.0\"?>\n<data>\n  <foo>bar</foo>\n</data>\n"

Getting the first occurrence:

doc.at('foo').text       # => "bar"
doc.at('//foo').text     # => "bar"
doc.at('/data/foo').text # => "bar"

Getting all occurrences and taking the first one:

doc.search('foo').first.text      # => "bar"
doc.search('//foo').first.text    # => "bar"
doc.search('data foo').first.text # => "bar"
