I have a Google App Engine program that calls BigQuery for data.
The query usually takes 3 to 4.5 seconds and completes fine, but sometimes it takes over five seconds and throws this error:
DeadlineExceededError: The API call urlfetch.Fetch() took too long to respond and was cancelled.
This article shows the deadlines and the different kinds of deadline errors.
Is there a way to set the deadline for a BigQuery job to be above 5 seconds? I could not find it in the BigQuery API docs.
BigQuery queries are fast, but they often take longer than the default App Engine urlfetch timeout. The BigQuery API is asynchronous, so you can break the work into API calls that each take less than 5 seconds.
For this situation, I would use the App Engine Task Queue:

1. Make a call to the BigQuery API to insert your job. This returns a job ID.
2. Place a task on the App Engine task queue to check the status of the BigQuery job with that ID.
3. If the job status is not "DONE", place a new task on the queue to check it again.
4. If the status is "DONE", make a call using urlfetch to retrieve the results.
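The steps above can be sketched as follows. This is only an outline of the control flow, not the author's actual code: the BigQuery client and task-queue calls (jobs().insert, taskqueue.add, and the urlfetch results call) are stubbed behind the bigquery, enqueue_check, and fetch_results parameters, all of which are placeholder names.

```python
def start_query(bigquery, project_id, sql, enqueue_check):
    """Insert the job; the call returns immediately with a job ID."""
    body = {"configuration": {"query": {"query": sql}}}
    job = bigquery.jobs().insert(projectId=project_id, body=body).execute()
    job_id = job["jobReference"]["jobId"]
    # Schedule a task (e.g. via taskqueue.add) to check on the job later.
    enqueue_check(job_id)
    return job_id

def check_job(bigquery, project_id, job_id, enqueue_check, fetch_results):
    """Task handler: re-enqueue while the job runs, fetch results once DONE."""
    job = bigquery.jobs().get(projectId=project_id, jobId=job_id).execute()
    if job["status"]["state"] != "DONE":
        enqueue_check(job_id)  # not done yet, check again later
        return None
    return fetch_results(job_id)  # done, retrieve the rows via urlfetch
```

Each individual call here finishes quickly, so none of them runs into the 5-second urlfetch deadline.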
This is one way to solve BigQuery timeouts on App Engine for Go. Simply set TimeoutMs on your queries to well below 5000. The default timeout for BigQuery queries is 10000 ms, which is over the default 5-second deadline for outgoing requests on App Engine.

The gotcha is that the timeout must be set both in the initial request, bigquery.service.Jobs.Query(…), and in the subsequent b.service.Jobs.GetQueryResults(…), which you use to poll the query results. Example:
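The original example is not included in the post. As a sketch of the same idea, here is the equivalent flow against the BigQuery v2 REST API using the Python client (the Go bigquery.service.Jobs.Query / GetQueryResults calls wrap the same endpoints); the 4000 ms value is an assumption, chosen to stay safely under the 5-second deadline:

```python
POLL_TIMEOUT_MS = 4000  # assumed value, well below the 5000 ms urlfetch deadline

def query_with_short_timeouts(service, project_id, sql):
    # Initial request: timeoutMs goes in the request body.
    resp = service.jobs().query(
        projectId=project_id,
        body={"query": sql, "timeoutMs": POLL_TIMEOUT_MS},
    ).execute()
    job_id = resp["jobReference"]["jobId"]
    # Subsequent polls: timeoutMs must be set again here too, or it falls
    # back to the 10000 ms default and blows the outgoing-request deadline.
    while not resp.get("jobComplete"):
        resp = service.jobs().getQueryResults(
            projectId=project_id,
            jobId=job_id,
            timeoutMs=POLL_TIMEOUT_MS,
        ).execute()
    return resp.get("rows", [])
```

Each HTTP round trip returns within the short timeout, while the overall loop can run as long as the surrounding request deadline allows.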
The nice thing about this is that you keep the default request deadline for the overall request (60 s for normal requests and 10 min for tasks and cron jobs) while avoiding setting the deadline for outgoing requests to some arbitrarily large value.
To issue HTTP requests on App Engine you can use urllib, urllib2, httplib, or urlfetch. However, no matter which library you choose, App Engine performs the HTTP request through its URL Fetch service.

The googleapiclient library uses httplib2. It looks like httplib2.Http passes its timeout on to urlfetch, and since that timeout has a default value of None, urlfetch sets the deadline of the request to 5 seconds no matter what you set with urlfetch.set_default_fetch_deadline.

Under the covers, httplib2 uses the socket library for HTTP requests. To set the timeout you can do the following:
You should also be able to do this but I haven't tested it:
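That untested snippet is not shown either. Since the answer notes that httplib2 falls back to the socket library, it was most likely the global socket default timeout; treat this as a guess:

```python
import socket

# Guess at the untested alternative: when httplib2 is given no timeout it
# falls back to the socket module's default, so raising the global default
# before building the BigQuery service may have the same effect.
socket.setdefaulttimeout(60)
```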
If you don't have existing code to time the request you can wrap your query like so:
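That wrapper is also missing from the post. Given that this answer's point is raising the urlfetch timeout to 60 seconds, it was most likely built around urlfetch.set_default_fetch_deadline; this sketch takes the setter as a parameter so it can be read without the App Engine SDK, and fetch_deadline is a made-up name:

```python
from contextlib import contextmanager

@contextmanager
def fetch_deadline(set_deadline, seconds=60, default=5):
    set_deadline(seconds)      # raise the urlfetch deadline for the query
    try:
        yield
    finally:
        set_deadline(default)  # restore the 5 s default afterwards

# On App Engine:
# from google.appengine.api import urlfetch
# with fetch_deadline(urlfetch.set_default_fetch_deadline):
#     result = run_my_bigquery_query()
```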
Note: I would go with Michael's suggestion, since that is the most robust. I just wanted to point out that you can increase the urlfetch timeout to up to 60 seconds, which should be enough time for most queries to complete.
How to set timeout for urlfetch in Google App Engine?
I was unable to get the urlfetch.set_default_fetch_deadline() method to apply to the BigQuery API, but I was able to increase the timeout when authorizing the BigQuery session, as follows:

Or with an asynchronous approach using jobs().insert:
We ended up going with an approach similar to what Michael suggests above. However, even when using the asynchronous call, the getQueryResults method (paginated with a small maxResults parameter) was timing out on URL Fetch, throwing the error posted in the question.

So, to increase the URL Fetch timeout for BigQuery on App Engine, set the timeout accordingly when authorizing your session.