Method to parse HTML document in Ruby?

Posted 2019-01-06 19:40

Like the DOMDocument class in PHP, is there any class in Ruby (i.e. in core Ruby) that can parse an HTML document and get the values of its node elements?

Tags: ruby html-parser

4 answers

一夜七次

#2 · 2019-01-06 20:02

You should check out Hpricot. It's exceedingly good. It isn't part of core Ruby, but it's a commonly used gem.


霸刀☆藐视天下

#3 · 2019-01-06 20:12

There is no built-in HTML parser (yet), but some very good ones are available, in particular Nokogiri.

Meta-answer: for common needs like this, I'd recommend checking the Ruby Toolbox site. You'll notice that Nokogiri is the top recommendation for HTML parsers.


乱世女痞

#4 · 2019-01-06 20:17

Ruby Cheerio: a jQuery-style HTML parser in Ruby. It is a much simplified version of Nokogiri aimed at crawlers, and it is the Ruby version of the popular Node.js package cheerio.

Follow the link for a simple crawler example.

gem install ruby-cheerio

require 'ruby-cheerio'

# Parse an HTML string. A lowercase local is used here so that no
# constant named jQuery gets defined.
doc = RubyCheerio.new("<html><body><h1 class='one'>h1_1</h1><h1>h1_2</h1></body></html>")

# Print the text of every <h1>.
doc.find('h1').each do |head_one|
  p head_one.text
end

# Get attribute values, jQuery-style.
p doc.find('h1.one')[0].prop('h1', 'class')

# Chain calls, similar to jQuery.
p doc.find('body').find('h1').first.text


骚年 ilove

#5 · 2019-01-06 20:19

You can also try Oga by Yorick Peterse.

It is an XML/HTML parser written in Ruby that does not require system libraries such as libxml. You can find it at https://github.com/YorickPeterse/oga
