My application:
- STEP1: fetches a BQ job_id published to a Pub/Sub topic.
- STEP2: checks whether job.state is DONE (most of the time it's already DONE!)
- STEP3: if the job is finished, fetches the query results (60K rows x 50 columns, all floating-point values)
- STEP4: processes the query result data and uploads it to Google Analytics (via the Data Import API); a rough sketch of this flow follows this list.
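For context, this is roughly the shape of the handler. It is a minimal sketch only; PROJECT_ID, handle_pubsub_message and upload_to_ga are placeholder names, not my actual code:

from google.cloud import bigquery

PROJECT_ID = "my-project"                      # placeholder project id
client = bigquery.Client(project=PROJECT_ID)

def handle_pubsub_message(job_id):
    # STEP2: look up the job whose id arrived via Pub/Sub and check its state
    job = client.get_job(job_id)
    if job.state != "DONE":
        return "job still running"

    # STEP3: fetch the query results (60K rows x 50 columns)
    result = job.result()
    rows = [dict(row) for row in result]

    # STEP4: process and upload to Google Analytics via the Data Import API
    # (omitted here; upload_to_ga is just a placeholder for that step)
    # upload_to_ga(rows)
    return "ok"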
Locally on my PC everything works fine. However, when running in the App Engine Flexible environment (Python 3, using the Google Cloud client libraries) I'm having problems at STEP3, fetching the query results.
I tried 3 ways, and from the logs I can see these are the exact points where the application gets stuck and returns 502 Bad Gateway:
ver1:
df = job.to_dataframe()  # STUCK here
ver2:
result = job.result()  # SUCCESS
rows = list(result)  # STUCK here
ver3:
schema = [r.name for r in result.schema]  # SUCCESS
print("schema fetched: ", schema)  # SUCCESS
pages = [page for page in result.pages]  # STUCK here
print("pages fetched: ", len(pages))
Stackdriver gives the following logs:
[error] 32#32: *7316 upstream prematurely closed connection while reading response header from upstream, client: 216.XX.YY.ZZ, server: , request: "GET /ml_models/dev_poll_results/ HTTP/1.1", upstream: "http://172.17.0.1:8080/ml_models/dev_poll_results/", host: "serato-big-query.appspot.com"
Any help or guidance would be highly appreciated, please.
UPDATE
From Stackdriver Logging I can see that BQ does receive the datasets:tables:data request. Oddly, the protoPayload.serviceData.tableDataListRequest field is empty, whereas it is usually populated with the maxResults and startRow fields. This makes me wonder: could there be an issue with transferring large results between BQ and the App Engine app?
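If I understand correctly, those two request fields correspond to the max_results and start_index parameters on the client library side, so another thing I could try is paging the query's destination table manually. A sketch only; JOB_ID and CHUNK are arbitrary placeholders:

from google.cloud import bigquery

client = bigquery.Client()
JOB_ID = "my-query-job-id"   # placeholder, the job_id received from Pub/Sub
CHUNK = 10000                # placeholder page size

job = client.get_job(JOB_ID)
table = client.get_table(job.destination)    # destination table of the query job
all_rows = []
for start in range(0, table.num_rows, CHUNK):
    page = client.list_rows(table, start_index=start, max_results=CHUNK)
    all_rows.extend(dict(row) for row in page)
print("total rows fetched:", len(all_rows))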