I know that when we implement a ParDo transform, we pick up individual elements from our data (basically separated by "\n"). But what if I have an element that occupies two lines in my file? Can I apply my own condition to decide how elements are picked? Or must every element always sit on a single line?
Answer 1:
Reading of text files is controlled by TextIO, not by ParDo (I suppose that's what you meant). Indeed, right now TextIO splits files into one element per line, but there is work in progress on changing that. You can follow the work at https://issues.apache.org/jira/browse/BEAM-2802.
It would help that work if you could say more about your file format, to make sure it is in scope.
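In the meantime, one possible workaround is to do the record grouping yourself. The following is only a minimal sketch in plain Python, not Beam API code: `group_lines` and `lines_per_record` are hypothetical names, and it assumes each record spans a fixed number of consecutive lines. It also assumes the lines are read sequentially from one file by your own code rather than by TextIO, because elements of a PCollection have no guaranteed order and so cannot be reliably re-joined after a line-by-line read.

```python
from typing import Iterable, Iterator, List


def group_lines(lines: Iterable[str], lines_per_record: int = 2) -> Iterator[str]:
    """Join every `lines_per_record` consecutive lines into one record.

    Hypothetical helper for illustration; the fixed-size rule is an assumption
    and can be replaced by any condition that marks the end of a record.
    """
    buffer: List[str] = []
    for line in lines:
        # Drop the trailing newline so records are joined cleanly.
        buffer.append(line.rstrip("\n"))
        if len(buffer) == lines_per_record:
            yield "\n".join(buffer)
            buffer = []
    if buffer:
        # Emit a trailing partial record instead of silently dropping it.
        yield "\n".join(buffer)


# Example input: five lines, so the last record is incomplete.
sample = ["id=1\n", "name=a\n", "id=2\n", "name=b\n", "id=3\n"]
records = list(group_lines(sample))
```

Logic like this could run inside whatever step of your pipeline opens and reads a whole file, so that each emitted record becomes one element.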