I want to get all the search results for a particular keyword search on Google. I've seen suggestions to scrape the result pages, but that seems like a bad idea. I've also seen suggestions to use the API, and gems (I plan on using Ruby) that either scrape or use the API.
Does anyone know the best way to do this right now? The API is no longer supported, and I've seen people report that they get unusable data back. Do the gems help solve this or not?
Thanks in advance.
According to http://code.google.com/apis/websearch/ , the Search API has been deprecated -- but there's a replacement, the Custom Search API. Will that do what you want?
If so, a quick Web search turned up https://github.com/alexreisner/google_custom_search , among other gems.
You will eventually get 503 errors if you run a scraper against Google search result pages. A more scalable (and legal) approach is to use Google's Custom Search API.
The API provides 100 search queries per day for free. If you need more, you may sign up for billing in the Google Developers Console. Additional requests cost $5 per 1000 queries, up to 10k queries per day.
The example below gets Google search results in JSON format:
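A minimal sketch of such a request in Ruby, using only the standard library. It assumes you already have an API key and a search engine ID (`cx`) from the Google Developers Console; the helper names `custom_search_uri`, `parse_results`, and `google_custom_search` are made up for illustration and are not part of any gem.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Endpoint of the Custom Search JSON API.
CSE_ENDPOINT = 'https://www.googleapis.com/customsearch/v1'

# Build the request URI. api_key and cx are placeholders you must supply.
# start is the 1-based index of the first result to return.
def custom_search_uri(query, api_key:, cx:, start: 1)
  uri = URI(CSE_ENDPOINT)
  uri.query = URI.encode_www_form(key: api_key, cx: cx, q: query, start: start)
  uri
end

# Pull title/link pairs out of a JSON response body.
# Returns an empty array when the response has no "items" key.
def parse_results(body)
  JSON.parse(body).fetch('items', []).map do |item|
    { title: item['title'], link: item['link'] }
  end
end

# Perform the request (needs network access and valid credentials).
def google_custom_search(query, api_key:, cx:)
  response = Net::HTTP.get_response(custom_search_uri(query, api_key: api_key, cx: cx))
  raise "Search failed: HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  parse_results(response.body)
end
```

Calling `google_custom_search('ruby gems', api_key: 'YOUR_KEY', cx: 'YOUR_CX')` should return an array of hashes with `:title` and `:link` keys. Each call counts against the daily quota.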
The Custom Search API is most likely not what you're looking for. I'm pretty sure you have to set up a Custom Search Engine, which you then query through the API, and it can only search over a user-specified set of domains (i.e. you can't perform a general web search).
If you need to perform a general Google search, then scraping is currently the only way to go. It's quite easy to write Ruby code to perform Google searches and scrape the search result URLs (I did this myself for a summer research project), but it does violate Google's TOS, so be warned.
I also go for the scraping option. It's quicker than asking Google for a key, and you are not limited to 100 search queries per day. Google's TOS is an issue though, as Richard points out. Here's an example I've done that works for me; it's also useful if you want to connect through a proxy:
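A rough sketch of the proxy approach with Ruby's standard library. The proxy host and port are placeholders, the helper names are hypothetical, and the link extraction is a crude regex over anchor tags that makes no assumption about Google's actual result markup (which changes often), so it will also pick up non-result links. As noted above, this violates Google's TOS.

```ruby
require 'net/http'
require 'uri'

# Build an HTTPS client that routes through a proxy.
# proxy_host and proxy_port are placeholders for your own proxy.
def proxied_client(host, proxy_host, proxy_port)
  http = Net::HTTP.new(host, 443, proxy_host, proxy_port)
  http.use_ssl = true
  http
end

# Collect unique absolute http(s) URLs from anchor tags in the HTML.
def extract_links(html)
  html.scan(/<a\s[^>]*href="(https?:\/\/[^"]+)"/i).flatten.uniq
end

# Fetch one result page through the proxy and return the links found on it
# (needs network access and a working proxy).
def fetch_result_links(query, proxy_host:, proxy_port:)
  http = proxied_client('www.google.com', proxy_host, proxy_port)
  path = "/search?#{URI.encode_www_form(q: query)}"
  response = http.get(path, 'User-Agent' => 'Mozilla/5.0')
  raise "Request failed: HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  extract_links(response.body)
end
```

For anything beyond a quick experiment, an HTML parser such as Nokogiri would be more robust than the regex used here.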
Use the Google Custom Search API:
http://code.google.com/apis/customsearch/v1/overview.html
You can also use our API. We take care of the hard parts of scraping and parsing Google search results. We have Ruby bindings, as simple as:
Repository: https://github.com/serpapi/google-search-results-ruby