I am trying to run a batch search: go over a list of strings and print the first address (URL) that a Google search returns for each one:
#!/usr/bin/python
import json
import urllib
import time
import pandas as pd
df = pd.read_csv("test.csv")
saved_column = df.Name #you can also use df['column_name']
for name in saved_column:
    query = urllib.urlencode({'q': name})
    url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query
    search_response = urllib.urlopen(url)
    search_results = search_response.read()
    results = json.loads(search_results)
    data = results['responseData']
    address = data[u'results'][0][u'url']
    print address
I get a 403 error from the server: 'Suspected Terms of Service Abuse. Please see http://code.google.com/apis/errors', u'responseStatus': 403
Is what I'm doing not allowed under Google's terms of service?
I also tried putting time.sleep(5) in the loop, but I get the same error.
Thank you in advance
This is not allowed by Google's TOS. You really can't scrape Google without them getting angry. Their blocker is also pretty sophisticated: random delays can get you through for a little while, but that fails pretty quickly.
Sorry, you're out of luck on this one.
https://developers.google.com/errors/?csw=1
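The block itself can't be worked around, but the script can at least notice it instead of crashing when it indexes into the response. Here is a minimal sketch, assuming the JSON body has the shape shown in the question (a responseData object and a responseStatus code) and that a successful response carries status 200, which is my assumption. first_result_url is a hypothetical helper name, and the two sample bodies are made up for illustration.

```python
import json


def first_result_url(raw_body):
    """Return the first result URL, or None if the request was refused or empty.

    Hypothetical helper: assumes the body is JSON with 'responseStatus'
    and 'responseData' keys, as in the error quoted in the question.
    """
    payload = json.loads(raw_body)
    # Assumption: 200 means success; the question shows 403 for a refused request.
    if payload.get("responseStatus") != 200:
        return None
    # responseData may be null on errors, so fall back to empty containers.
    data = payload.get("responseData") or {}
    results = data.get("results") or []
    return results[0]["url"] if results else None


# Made-up sample bodies for illustration only.
blocked = '{"responseData": null, "responseStatus": 403}'
ok = '{"responseData": {"results": [{"url": "http://example.com/"}]}, "responseStatus": 200}'

print(first_result_url(blocked))
print(first_result_url(ok))
```

In the loop from the question, you would call this on search_response.read() and skip or stop when it returns None, rather than assuming results are always present.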
Also