My streaming Dataflow job (2017-09-08_03_55_43-9675407418829265662) using Apache Beam SDK for Java 2.1.0 will not scale past 1 worker, even with a growing Pub/Sub queue (now 100k undelivered messages). Do you have any ideas why?
It's currently running with autoscalingAlgorithm=THROUGHPUT_BASED and maxNumWorkers=10.
Dataflow engineer here. I looked up the job in our backend, and I can see that it is not scaling up because CPU utilization is low. That means something other than CPU is limiting the pipeline's performance, such as external throttling. Upscaling rarely helps in these cases.
I also see that some bundles are taking hours to process. I recommend investigating your pipeline logic to see whether other parts can be optimized.
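One way to start that investigation is to time each per-element call to an external service, since low CPU combined with slow bundles often points to time spent waiting rather than computing. The following is only a minimal sketch in plain Java, not Beam or Dataflow API: `StepTimer`, its methods, and the 50 ms threshold are hypothetical names and values chosen for illustration. In a real pipeline you would presumably call `apply` from inside your `DoFn` and report the numbers through your logger or Beam metrics.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Hypothetical helper (not part of Beam): times a per-element step. */
final class StepTimer<I, O> {
    private final Function<I, O> step;
    private final long slowThresholdNanos;
    private final AtomicLong calls = new AtomicLong();
    private final AtomicLong slowCalls = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong maxNanos = new AtomicLong();

    StepTimer(Function<I, O> step, long slowThresholdMillis) {
        this.step = step;
        this.slowThresholdNanos = slowThresholdMillis * 1_000_000L;
    }

    /** Runs the wrapped step and records its wall-clock duration. */
    O apply(I input) {
        long start = System.nanoTime();
        try {
            return step.apply(input);
        } finally {
            long elapsed = System.nanoTime() - start;
            calls.incrementAndGet();
            totalNanos.addAndGet(elapsed);
            maxNanos.accumulateAndGet(elapsed, Math::max);
            if (elapsed >= slowThresholdNanos) {
                slowCalls.incrementAndGet();
            }
        }
    }

    long calls() { return calls.get(); }
    long slowCalls() { return slowCalls.get(); }
    long maxMillis() { return maxNanos.get() / 1_000_000L; }

    /** Mean duration per call in milliseconds; 0 if nothing was timed. */
    double meanMillis() {
        long n = calls.get();
        return n == 0 ? 0.0 : (totalNanos.get() / 1_000_000.0) / n;
    }

    public static void main(String[] args) {
        // Stand-in for an external call that may be throttled (assumption):
        // it just sleeps for the given number of milliseconds.
        StepTimer<Integer, Integer> timer = new StepTimer<>(millis -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return millis;
        }, 50);

        for (int delay : new int[] {1, 2, 120, 3}) {
            timer.apply(delay);
        }
        System.out.printf("calls=%d slow=%d mean=%.1fms max=%dms%n",
                timer.calls(), timer.slowCalls(),
                timer.meanMillis(), timer.maxMillis());
    }
}
```

If a small share of calls accounts for most of the elapsed time while CPU stays idle, that would be consistent with throttling or a slow dependency rather than a lack of workers.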
This is what I ended up with:
What do you think @raghu-angadi and @scott-wegner?