I have a PCollection<String> in Google Cloud Dataflow and I'm outputting it to text files via TextIO.Write.to:
PCollection<String> lines = ...;
lines.apply(TextIO.Write.to("gs://bucket/output.txt"));
Currently, the lines within each output shard appear in arbitrary order.
Is it possible to get Dataflow to output the lines in sorted order?