Is there an HTML parser in Ruby that reads an HTML document into a DOM tree and represents HTML tags as DOM elements?
I know Nokogiri, but it doesn't parse HTML into a DOM tree.
Despite your remark, Nokogiri is the way to go:
require 'nokogiri'
doc = Nokogiri::HTML('<body><p>Hello, worlds!</body>')
It parses even invalid HTML and returns a DOM tree:
>> doc.class
=> Nokogiri::HTML::Document
>> doc.root.class
=> Nokogiri::XML::Element
>> doc.root.children.class
=> Nokogiri::XML::NodeSet
>> doc.root.children.first.content
=> "Hello, worlds!"