Following a link using Nokogiri for scraping

Posted 2019-04-13 12:40

Question:

Is there a method to follow a link when scraping with Nokogiri? I know I can extract the href and open it myself, but I thought I saw a method for this in Hpricot and was wondering whether Nokogiri has something similar.

Answer 1:

Here is an excellent screen scraping guide for using Ruby, Nokogiri, Hpricot, and Firebug.

Personally, I am a big fan of Mechanize for screen scraping. It acts like a headless browser: you can use it to follow links and fill out forms, and it handles the tricky parts, such as cookies, for you. Mechanize parses pages with Nokogiri, so you can still run the same searches on each page it fetches.