Google Dataflow job and BigQuery failing on different regions

Posted 2019-07-10 22:12

Question:

I have a Google Dataflow job that is failing on:

BigQuery job ... finished with error(s): errorResult: 
Cannot read and write in different locations: source: EU, destination: US, error: Cannot read and write in different locations: source: EU, destination: US

I'm starting the job with --zone=europe-west1-b

And this is the only part of the pipeline that does anything with BigQuery:

Pipeline p = Pipeline.create(options);
p.apply(BigQueryIO.Read.fromQuery(query));
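
For reference, a self-contained version of that snippet looks roughly like the following (Dataflow SDK 1.x; the actual query text is not shown in the question, so the one here is a placeholder):

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.BigQueryIO;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

public class ReadFromBigQuery {
  public static void main(String[] args) {
    DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
        .withValidation()
        .as(DataflowPipelineOptions.class);
    Pipeline p = Pipeline.create(options);
    // fromQuery does not stream rows directly: BigQuery first runs the query
    // and materializes its results into a temporary table inside a temporary
    // dataset, and Dataflow then reads from that table. That temporary table
    // is what the "404 Not Found" error below refers to.
    String query = "SELECT ..."; // placeholder; the original query is not shown
    p.apply(BigQueryIO.Read.fromQuery(query));
    p.run();
  }
}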

The BigQuery table I'm reading from has this in the details: Data Location EU

When I run the job locally, I get:

SEVERE: Error opening BigQuery table  dataflow_temporary_table_339775 of dataset _dataflow_temporary_dataset_744662  : 404 Not Found

I don't understand why it is trying to write to a different location if I'm only reading data. And even if it needs to create a temporary table, why is it being created in a different region?

Any ideas?

Answer 1:

I would suggest verifying:

  • Whether the staging location for the Dataflow job is in the same region (EU) as the BigQuery dataset.
  • Whether the Google Cloud Storage location used by Dataflow is also in the same region (see the sketch after this list).
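
For example, a minimal sketch of pinning those locations to the EU (Dataflow SDK 1.x options, set inside main(String[] args) as in the snippet above; the bucket name is hypothetical and must itself be a bucket created in the EU):

DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
    .withValidation()
    .as(DataflowPipelineOptions.class);
// With an EU bucket, the staging and temp files end up co-located with
// the EU BigQuery dataset being read.
options.setStagingLocation("gs://my-eu-bucket/staging"); // hypothetical bucket
options.setTempLocation("gs://my-eu-bucket/temp");       // hypothetical bucket
options.setZone("europe-west1-b");

The same settings can also be passed on the command line as --stagingLocation, --tempLocation, and --zone, alongside the --zone flag already being used.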