I want to write to a file on GCS (a gs:// path), but I don't know the file name at compile time. Its name depends on behavior that is defined at runtime. How can I proceed?
Answer 1:
If you're using Beam Java, you can use FileIO.writeDynamic() for this (starting with Beam 2.3, which is currently in the process of being released, but you can already use it via the version 2.3.0-SNAPSHOT), or the older DynamicDestinations API (available in Beam 2.2).
Example of using FileIO.writeDynamic() to write a PCollection of bank transactions to different paths on GCS depending on the transaction's type:
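// Assumes: import static org.apache.beam.sdk.io.FileIO.Write.defaultNaming;
PCollection<BankTransaction> transactions = ...;
transactions.apply(
    FileIO.<TransactionType, BankTransaction>writeDynamic()
        // The destination of each record is its transaction type.
        .by(BankTransaction::getType)
        // Format each record as one line of text.
        .via(Contextful.fn(BankTransaction::toString), TextIO.sink())
        .to("gs://bucket/myfolder/")
        // Include the type in the file name so destinations do not collide.
        .withNaming(type -> defaultNaming("transactions_" + type, ".txt")));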
For an example of DynamicDestinations use, see the example code in the TextIO unit tests.
Alternatively, if you want to write each record to its own file, just use the FileSystems API (in particular, FileSystems.create()) from a DoFn.
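A minimal sketch of that last option, under some assumptions: the input is a PCollection of KV<String, String> where the key is the runtime-determined file name and the value is the file contents, and the class name WriteEachRecordFn, the helper writeRecord, and the gs://bucket/myfolder/ prefix are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import org.apache.beam.sdk.io.FileSystems;
import org.apache.beam.sdk.io.fs.ResourceId;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.util.MimeTypes;
import org.apache.beam.sdk.values.KV;

/** Writes each element to its own file; the file name is taken from the element's key. */
public class WriteEachRecordFn extends DoFn<KV<String, String>, Void> {

  // Hypothetical output prefix; replace with your own bucket and folder.
  private static final String OUTPUT_PREFIX = "gs://bucket/myfolder/";

  /** Creates (or overwrites) the file at {@code path} and writes {@code contents} as UTF-8. */
  public static void writeRecord(String path, String contents) throws IOException {
    // Resolve the path to a ResourceId for a single file (not a directory).
    ResourceId resource = FileSystems.matchNewResource(path, false /* isDirectory */);
    try (WritableByteChannel channel = FileSystems.create(resource, MimeTypes.TEXT)) {
      ByteBuffer buffer = ByteBuffer.wrap(contents.getBytes(StandardCharsets.UTF_8));
      // A channel may accept fewer bytes than offered, so loop until the buffer is drained.
      while (buffer.hasRemaining()) {
        channel.write(buffer);
      }
    }
  }

  @ProcessElement
  public void processElement(ProcessContext c) throws IOException {
    // The file name is only known here, at runtime, from the element itself.
    String path = OUTPUT_PREFIX + c.element().getKey() + ".txt";
    writeRecord(path, c.element().getValue());
  }
}
```

It could be applied with something like records.apply(ParDo.of(new WriteEachRecordFn())). Since a runner may retry a bundle, the same element can be processed more than once, so it helps that the file name is derived deterministically from the element: a retry then overwrites the same file rather than producing a duplicate.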
Tags:
apache-beam