Using google cloud dataflow PubSubIO, when does th

Posted 2019-02-20 14:52

Is it possible to delay acknowledgement until the subgraph (everything below the PubSubIO.Read) is successfully processed?

For example, we are streaming reads from a Google Cloud Pub/Sub subscription, writing a file to GCS, and, in another branch, writing to BigQuery using BigQueryIO.Write...

We do see that if an exception occurs it will retry indefinitely, since we are in streaming mode. However, if we cancel the job and redeploy with a code change, the message is not reprocessed.

1 Answer
Lonely孤独者°
Reply #2 · 2019-02-20 15:14

The acknowledgement is made once the message has been durably persisted somewhere in the Dataflow pipeline. If you want to make changes to a pipeline without losing in-flight data, use the Update feature instead of Cancel: https://cloud.google.com/dataflow/pipelines/updating-a-pipeline
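
As a rough sketch of what launching an update looks like with the Java SDK: the replacement job is started with the `--update` flag and the same `--jobName` as the job that is already running. The helper class `UpdateLaunchArgs` and the job name `pubsub-to-gcs-bq` below are hypothetical, made up for illustration; only the three flags are actual Dataflow pipeline options, and your pipeline will likely need further options (project, region, and so on) that are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of any SDK): assembles the arguments
// you would pass to your pipeline's main() to update a running job.
class UpdateLaunchArgs {

    // runningJobName must match the name of the job currently running,
    // otherwise Dataflow cannot find the job to replace.
    static List<String> build(String runningJobName) {
        List<String> args = new ArrayList<>();
        args.add("--runner=DataflowRunner");
        args.add("--jobName=" + runningJobName);
        args.add("--update"); // replace the running job instead of starting a new one
        return args;
    }

    public static void main(String[] argv) {
        // "pubsub-to-gcs-bq" is an assumed job name used only for this example.
        System.out.println(String.join(" ", build("pubsub-to-gcs-bq")));
    }
}
```

If the code change renames transforms, the update may also need a `--transformNameMapping` option so that Dataflow can match the old steps to the new ones; check the linked page for the details that apply to your pipeline.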
