Is it possible to change the region of a Google Cloud Platform Dataflow job to Europe?
I have set the zone of the pipeline to europe-west1-d, but I am unable to change the region of the job itself.
I have tried to change the region in the pipeline options, but that results in an error and only the default region is working.
pipeline_options.view_as(GoogleCloudOptions).region = 'europe-west1'
"error": {
  "code": 400,
  "message": "(ff50231266257fc7): The workflow could not be created, since it was sent to an invalid or unreleased region. Please resubmit with a valid region.",
  "status": "INVALID_ARGUMENT"
}
europe-west1 is listed when using the command gcloud compute regions list.
Yes, Cloud Dataflow Regional Endpoints allow you to change the region of a Dataflow job to Europe.
Regional Endpoints are a brand new Cloud Dataflow feature. Prior to the release of Regional Endpoints, the experimental region option could be specified but was not used. This error message appeared because the region option was being specified before the feature was released.
Examples for your case (Europe):
You can submit a job with only the Regional Endpoint specified (e.g. region = europe-west1), and that job will be managed and run in the europe-west1 region; when you omit a zone, Cloud Dataflow automatically selects a zone in this region for the Dataflow workers.
You can also submit a job with both a Regional Endpoint and a zone specified (e.g. region = europe-west1 and zone = europe-west1-d), and that job will be managed in the europe-west1 region, with Dataflow workers running in the europe-west1-d zone.
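The two cases above can be sketched as a small helper that builds the command-line flags and rejects a zone that does not belong to the chosen region. This is only an illustration: region_of and location_flags are hypothetical names, the check relies on GCP zone names being a region name plus a one-letter suffix, and the --region and --zone flag names are assumed to match the Python SDK's pipeline options (newer SDK releases may name the zone flag --worker_zone).

```python
import re

# A GCP zone name is a region name plus a one-letter suffix, e.g. "europe-west1-d".
_ZONE_PATTERN = re.compile(r"^(?P<region>[a-z]+-[a-z]+\d+)-(?P<suffix>[a-z])$")


def region_of(zone):
    """Return the region part of a zone name such as 'europe-west1-d'."""
    match = _ZONE_PATTERN.match(zone)
    if match is None:
        raise ValueError("not a valid zone name: %r" % zone)
    return match.group("region")


def location_flags(region, zone=None):
    """Build --region/--zone flags (hypothetical helper, flag names assumed)."""
    flags = ["--region=" + region]
    if zone is not None:
        # A job managed in one region cannot place workers in another region's zone.
        if region_of(zone) != region:
            raise ValueError("zone %r is not in region %r" % (zone, region))
        flags.append("--zone=" + zone)
    return flags


# Region only: Dataflow picks a zone inside europe-west1.
print(location_flags("europe-west1"))
# Region and zone: workers run in europe-west1-d.
print(location_flags("europe-west1", "europe-west1-d"))
```

The resulting list could then be handed to the pipeline options constructor (for example PipelineOptions(flags) in the Python SDK) instead of assigning the region attribute directly.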
With Dataflow SDK 2.1.0 you can do this.
You can use:
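// Assumption: the dotted strings are keys in pipelineConfigProperties, as on the first line.
// Passing a key itself (e.g. "dataflow.region") would submit an invalid region.
pipelineOptions.setWorkerMachineType(pipelineConfigProperties.get("worker.machine.type"));
pipelineOptions.setNetwork(pipelineConfigProperties.get("dataflow.network"));
pipelineOptions.setUsePublicIps(false);
pipelineOptions.setZone(pipelineConfigProperties.get("dataflow.zone"));
pipelineOptions.setSubnetwork(pipelineConfigProperties.get("dataflow.subnetwork"));
pipelineOptions.setRegion(pipelineConfigProperties.get("dataflow.region"));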
This is tested, and you can definitely do this in 2.1.0.
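Since the question uses the Python SDK, here is a rough sketch of how the same settings might be turned into command-line flags there. It is an assumption-laden illustration, not the SDK's own API: flags_from_config is a hypothetical helper, the dictionary keys simply reuse the property names from the Java snippet, the sample network and subnetwork values are made up, and the flag names (--region, --zone, --network, --subnetwork, --worker_machine_type, --no_use_public_ips) are assumed to match the Python SDK's pipeline options in your version.

```python
def flags_from_config(config):
    """Translate a settings mapping into Dataflow flags (hypothetical helper)."""
    flags = [
        "--region=" + config["dataflow.region"],
        "--zone=" + config["dataflow.zone"],
        "--network=" + config["dataflow.network"],
        "--subnetwork=" + config["dataflow.subnetwork"],
        "--worker_machine_type=" + config["worker.machine.type"],
    ]
    # Counterpart of setUsePublicIps(false): workers use private IPs only.
    flags.append("--no_use_public_ips")
    return flags


# Sample values; the network and subnetwork names are placeholders.
example_config = {
    "dataflow.region": "europe-west1",
    "dataflow.zone": "europe-west1-d",
    "dataflow.network": "my-network",
    "dataflow.subnetwork": "regions/europe-west1/subnetworks/my-subnet",
    "worker.machine.type": "n1-standard-1",
}

for flag in flags_from_config(example_config):
    print(flag)
```

Keeping the values in one mapping means the region and zone are changed in a single place rather than scattered through the pipeline code.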