Like the DOMDocument class in PHP, is there a class in Ruby (i.e. core Ruby) that can parse an HTML document and get the values of its node elements?
You should check out Hpricot. It's exceedingly good. It's not 'core' Ruby, but it's a commonly used gem.
There is no built-in HTML parser (yet), but some very good ones are available, in particular Nokogiri.
Meta-answer: For common needs like these, I'd recommend checking out the Ruby Toolbox site. You'll notice that Nokogiri is the top recommendation for HTML parsers.
Ruby Cheerio is a jQuery-style HTML parser for Ruby: a much-simplified version of Nokogiri aimed at crawlers. It is the Ruby counterpart of cheerio, a very popular Node.js package.
Follow the link for a simple crawler example.
gem install ruby-cheerio
You can also try Oga by Yorick Peterse.
It is an XML/HTML parser written in Ruby that does not require system libraries such as libxml. You can find it here: https://github.com/YorickPeterse/oga