Paging on Google Places API returns status INVALID_REQUEST

Posted 2020-04-03 07:27

Question:

I'm using the Google Places API for place search:

https://developers.google.com/places/documentation/search

After the first query to the API, I get the next page by setting the pagetoken parameter. If I wait 2 seconds between requests, it works, but if I make the next query right after the previous one, it returns the status INVALID_REQUEST.

Is this some kind of rate limiting? I don't see this anywhere in the documentation.

https://developers.google.com/places/usage

Since each request returns 20 places, getting a list of 100 results will take over 10 seconds, which is a long time for someone using an app to wait.

Answer 1:

It is documented; see the documentation:

By default, each Nearby Search or Text Search returns up to 20 establishment results per query; however, each search can return as many as 60 results, split across three pages. If your search will return more than 20, then the search response will include an additional value — next_page_token. Pass the value of the next_page_token to the pagetoken parameter of a new search to see the next set of results. If the next_page_token is null, or is not returned, then there are no further results. There is a short delay between when a next_page_token is issued, and when it will become valid. Requesting the next page before it is available will return an INVALID_REQUEST response. Retrying the request with the same next_page_token will return the next page of results.
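
The retry behaviour described in that passage could be sketched roughly as below. This is only a sketch under stated assumptions: fetch_page is a hypothetical callable (not part of any Google library) that performs one search request and returns the decoded JSON response as a dict, and the delay and max_retries values are arbitrary choices, not documented figures.

```python
import time


def fetch_all_pages(fetch_page, delay=2.0, max_retries=5, sleep=time.sleep):
    """Collect the results of every page of a search.

    fetch_page(pagetoken) is a hypothetical callable: it sends one request
    (pagetoken is None for the first page) and returns the response dict.
    """
    results = []
    token = None
    while True:
        response = fetch_page(token)
        retries = 0
        # A freshly issued token may not be valid yet. Retry with the SAME
        # token after a short pause instead of giving up.
        while token is not None and response.get("status") == "INVALID_REQUEST":
            if retries >= max_retries:
                raise RuntimeError("next_page_token never became valid")
            retries += 1
            sleep(delay)
            response = fetch_page(token)
        if response.get("status") not in ("OK", "ZERO_RESULTS"):
            raise RuntimeError("search failed: %s" % response.get("status"))
        results.extend(response.get("results", []))
        # No next_page_token (or a null one) means there are no more pages.
        token = response.get("next_page_token")
        if not token:
            return results
```

Passing sleep as a parameter is just a convenience so the pause can be replaced in tests; in normal use the default time.sleep applies.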



Answer 2:

Although I'm not 100% sure this is the cause, I will leave this answer here as it took me around 6 hours to figure this out and may help someone.

As pointed out by geocodezip in his answer, there is a slight delay between the next page token being returned and that page actually being available. I didn't find any way to handle that other than some sort of sleep.

But what I did find out is that every request after the first INVALID_REQUEST response also returned INVALID_REQUEST, no matter whether I waited 1, 2 or 10 seconds.

I suspect it has something to do with a cached response on Google's side. The solution I found is to append an extra, incrementing parameter to the URL so that each request has a "different" URL and therefore gets an uncached response.

The parameter I used was request_count and for every request made, I incremented it by 1.
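
Independently of any client library, the idea can be sketched like this. Note that request_count is not a documented Places API parameter; it is only there to make each URL unique. The helper name with_request_count and the example URL are made up for illustration, and whether this really bypasses a cache is my suspicion, not something confirmed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_request_count(url, request_count):
    """Return url with its request_count query parameter set to the given value."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous request_count.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "request_count"]
    params.append(("request_count", str(request_count)))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical URL, for illustration only.
base = "https://example.com/search?pagetoken=abc&key=K"
first = with_request_count(base, 1)
second = with_request_count(first, 2)
```

Each retry then uses a URL that differs from the previous one only in the counter value.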

For illustration purposes, here's the Python POC code I used (do not copy-paste it; this is just a POC snippet and won't work as-is):

import time

from googleplaces import GooglePlaces

# Excerpt from a class method: self, event, api_key and logInfo come from
# the surrounding code.
# The while True loop keeps going until every page has been fetched,
# switching to a new API key whenever the current one is over its limit.
query_result_next_page = None
google_places = GooglePlaces(api_key)
invalid_requests_found = 0
request_count = 0
while True:

    # Changes on every request so that each URL is unique
    request_count += 1

    try:
        query_result = google_places.nearby_search(
                lat_lng={'lat': event['latitude'], 'lng': event['longitude']},
                radius=event['radius'],
                pagetoken=query_result_next_page,
                request_count=request_count)

        # If there are additional result pages, fetch the next one on the
        # next loop iteration
        if query_result.has_next_page_token:
            query_result_next_page = query_result.next_page_token
        else:
            break

    except Exception as e:
        message = str(e)
        # If the key is over the query limit, try a new one
        if 'OVER_QUERY_LIMIT' in message:
            logInfo("Key " + api_key + " unavailable.")
            self.set_unavailable_apikey(api_key)
            api_key = self.get_api_key()
            # Rebuild the client so that the new key is actually used
            google_places = GooglePlaces(api_key)
        # Sometimes the Places API hasn't created the next page yet
        # despite returning a next_page_token, and responds with
        # INVALID_REQUEST. We should just sleep for a bit and try again.
        elif 'INVALID_REQUEST' in message:
            # Maximum of 4 INVALID_REQUEST responses
            invalid_requests_found += 1
            if invalid_requests_found > 4:
                raise

            time.sleep(1)
            continue
        # Any error other than zero results is re-raised
        elif 'ZERO_RESULTS' not in message:
            raise
        else:
            break

EDIT: I forgot to mention that the GooglePlaces object is from slimkrazy's Google API lib. Unfortunately, I had to tweak the code of the actual lib to accept this new request_count parameter.

I had to replace the nearby_search method with this:

def nearby_search(self, language=lang.ENGLISH, keyword=None, location=None,
           lat_lng=None, name=None, radius=3200, rankby=ranking.PROMINENCE,
           sensor=False, type=None, types=[], pagetoken=None, request_count=0):
    """Perform a nearby search using the Google Places API.

    One of either location, lat_lng or pagetoken are required, the rest of 
    the keyword arguments are optional.

    keyword arguments:
    keyword  -- A term to be matched against all available fields, including
                but not limited to name, type, and address (default None)
    location -- A human readable location, e.g 'London, England'
                (default None)
    language -- The language code, indicating in which language the
                results should be returned, if possible. (default lang.ENGLISH)
    lat_lng  -- A dict containing the following keys: lat, lng
                (default None)
    name     -- A term to be matched against the names of the Places.
                Results will be restricted to those containing the passed
                name value. (default None)
    radius   -- The radius (in meters) around the location/lat_lng to
                restrict the search to. The maximum is 50000 meters.
                (default 3200)
    rankby   -- Specifies the order in which results are listed :
                ranking.PROMINENCE (default) or ranking.DISTANCE
                (imply no radius argument).
    sensor   -- Indicates whether or not the Place request came from a
                device using a location sensor (default False).
    type     -- Optional type param used to indicate place category.
    types    -- An optional list of types, restricting the results to
                Places (default []). If there is only one item the request
                will be sent as type param.
    pagetoken-- Optional parameter to force the search result to return the next
                20 results from a previously run search. Setting this parameter
                will execute a search with the same parameters used previously.
                (default None)
    request_count -- Counter added to the request parameters so that every
                request URL is different (default 0)
    """
    if location is None and lat_lng is None and pagetoken is None:
        raise ValueError('One of location, lat_lng or pagetoken must be passed in.')
    if rankby == 'distance':
        # As per API docs rankby == distance:
        #  One or more of keyword, name, or types is required.
        if keyword is None and types == [] and name is None:
            raise ValueError('When rankby = googleplaces.ranking.DISTANCE, ' +
                             'name, keyword or types kwargs ' +
                             'must be specified.')
    self._sensor = sensor
    radius = (radius if radius <= GooglePlaces.MAXIMUM_SEARCH_RADIUS
              else GooglePlaces.MAXIMUM_SEARCH_RADIUS)
    lat_lng_str = self._generate_lat_lng_string(lat_lng, location)
    self._request_params = {'location': lat_lng_str}
    if rankby == 'prominence':
        self._request_params['radius'] = radius
    else:
        self._request_params['rankby'] = rankby
    if type:
        self._request_params['type'] = type
    elif types:
        if len(types) == 1:
            self._request_params['type'] = types[0]
        elif len(types) > 1:
            self._request_params['types'] = '|'.join(types)
    if keyword is not None:
        self._request_params['keyword'] = keyword
    if name is not None:
        self._request_params['name'] = name
    if pagetoken is not None:
        self._request_params['pagetoken'] = pagetoken
    if language is not None:
        self._request_params['language'] = language
    self._request_params['request_count'] = request_count
    self._add_required_param_keys()
    url, places_response = _fetch_remote_json(
            GooglePlaces.NEARBY_SEARCH_API_URL, self._request_params)
    _validate_response(url, places_response)
    return GooglePlacesSearchResult(self, places_response)