What would be the best way to debug memory issues in a Dataflow job?
My job was failing with a GC OOM error, but when I profile it locally I cannot reproduce the exact scenarios and data volumes.
I'm now running it on 'n1-highmem-4' machines and no longer see the error, but the job is very slow, so obviously using machines with more RAM is not the solution :)
Thanks for any advice, G