How to click a link with Mechanize and Nokogiri?

Posted 2020-07-18 08:19

Question:

I'm using Mechanize to scrape Google Wallet for order data. I can capture all the data from the first page, but I need to follow the link to the subsequent pages automatically to get more records.

The #purchaseOrderPager-pagerNextButton element moves to the next page, where there are more records to capture. It looks like this, and I need to click on it to keep going:

<a id="purchaseOrderPager-pagerNextButton" class="kd-button small right"
 href="purchaseorderlist?startTime=0&amp;...
;currentPageStart=1&amp;currentPageEnd=25&amp;inputFullText=">
<img src="https://www.gstatic.com/mc3/purchaseorder/page-right.png"></a>

However, when I try the following I get an error:

  next_page = @orders_page.search("#purchaseOrderPager-pagerNextButton")
  next_page.click

The error:

undefined method `click' for #<Nokogiri::XML::NodeSet:0x007f9019095550> (NoMethodError)

Answer 1:

click is a method of the Mechanize class, not of a Nokogiri node set.

Try the following form:

next_page = @orders_page.at("#purchaseOrderPager-pagerNextButton")
mechanize_instance.click(next_page)

NOTE: Replace mechanize_instance with your actual Mechanize variable.
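
Since the goal is to keep paging until there are no more records, the call above can be wrapped in a loop. This is only a minimal sketch: it assumes the next button is simply absent from the last page, which the question does not confirm. each_order_page is a hypothetical helper name, and agent stands for your Mechanize object.

```ruby
# Hypothetical helper: yields the first page, then follows the "next" link
# until the selector no longer matches. Assumes the button is missing on the
# last page; max_pages is a guard in case that assumption is wrong.
def each_order_page(agent, first_page,
                    selector: "#purchaseOrderPager-pagerNextButton",
                    max_pages: 100)
  page = first_page
  max_pages.times do
    yield page
    next_link = page.at(selector)   # a single node, or nil when absent
    break if next_link.nil?
    page = agent.click(next_link)   # Mechanize#click returns the new page
  end
end

# Usage (mech is your Mechanize object):
#   each_order_page(mech, @orders_page) do |page|
#     # capture the records on this page
#   end
```

The helper only relies on page.at and agent.click, so it can be tried out with simple stand-in objects before pointing it at the real site.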



Answer 2:

Your code doesn't work because #search returns a Nokogiri::XML::NodeSet instance. A NodeSet is a collection of nodes and has no #click method. In your case next_page is a NodeSet that holds only one element. Calling #first on it gives you the Nokogiri::XML::Node itself, which here is also a Nokogiri::XML::Element.

Write it as follows:

next_page = @orders_page.search("#purchaseOrderPager-pagerNextButton").first

Or, better, use the #at method, which returns the first matching node directly:

next_page = @orders_page.at("#purchaseOrderPager-pagerNextButton")

Now, #click is a method of a Mechanize::Page::Link instance, not of a Nokogiri node. From the source:

# File lib/mechanize/page/link.rb, line 29
def click
  @mech.click self
end

So you can wrap the node in a Mechanize::Page::Link and click that. Here is the full code:

next_page = @orders_page.at("#purchaseOrderPager-pagerNextButton")
# mech is your Mechanize object.
next_link = Mechanize::Page::Link.new( next_page, mech, @orders_page )
next_link.click

Mechanize#click also accepts a string with the text of the anchor or button to click, as well as a Nokogiri::XML::Node. So we can simply do:

mech.click next_page

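Let's see why the above code works. These are the relevant lines from the Mechanize#click source. A plain Nokogiri node has no href method, so the link['href'] fallback is used: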

  referer = current_page()
  href = link.respond_to?(:href) ? link.href :
    (link['href'] || link['src'])
  get href, [], referer
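
To make that fallback concrete, here is a small plain-Ruby sketch of the same lookup. It does not call Mechanize at all: resolve_href and link_like are hypothetical names, a Hash stands in for a Nokogiri node (which also supports [] for attribute access), and the URLs are made up for illustration.

```ruby
# Hypothetical stand-in for an object that has an #href method,
# such as a Mechanize::Page::Link.
link_like = Struct.new(:href)

# Mirrors the lookup quoted above: prefer #href, otherwise read the
# "href" attribute, and fall back to "src".
def resolve_href(link)
  link.respond_to?(:href) ? link.href : (link['href'] || link['src'])
end

anchor_node = { 'href' => 'purchaseorderlist?page=2' }  # acts like an <a> node
image_node  = { 'src'  => 'page-right.png' }            # acts like an <img> node

puts resolve_href(link_like.new('orders/next'))  # prints "orders/next"
puts resolve_href(anchor_node)                   # prints "purchaseorderlist?page=2"
puts resolve_href(image_node)                    # prints "page-right.png"
```

In the real method the resulting href is then passed to get, with the current page as the referer.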