Join Nested Structure Table using Dataflow Java co

Posted 2019-08-27 07:23

My objective is to join two tables, where the second table is flat and the first one has a nested structure. The join key sits inside the nested structure of the first table. How can I join these two tables using Dataflow Java code? WithKeys (org.apache.beam.sdk.transforms.WithKeys) accepts a direct column name, but it does not seem to allow a nested reference such as firstTable.columnname. Could someone help me solve this?

1 answer
做个烂人
#2 · 2019-08-27 08:22

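If both tables are similarly large, consider using the CoGroupByKey transform (covered in the Apache Beam programming guide). Before this operation you will have to transform your data into two PCollections of key/value pairs, both keyed by the join key. For the nested table, that means pulling the key out of the nested structure when you build the pairs.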
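A minimal sketch of that approach, under stated assumptions: the record classes `Order`, `CustomerRef` and `Customer`, their fields, and the `describe` helper are hypothetical stand-ins for the two tables, not anything from the question. `WithKeys.of` is given a `SerializableFunction`, so the key can be read from any depth of the nested record rather than from a top-level column only. The input PCollections are assumed to already have coders (Serializable classes normally fall back to `SerializableCoder`).

```java
import java.io.Serializable;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.transforms.join.CoGbkResult;
import org.apache.beam.sdk.transforms.join.CoGroupByKey;
import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TupleTag;

class NestedKeyJoin {
  // Hypothetical nested record: the join key lives at order.customer.id.
  static class CustomerRef implements Serializable {
    final String id;
    CustomerRef(String id) { this.id = id; }
  }
  static class Order implements Serializable {
    final String orderId;
    final CustomerRef customer;
    Order(String orderId, CustomerRef customer) { this.orderId = orderId; this.customer = customer; }
  }
  // Hypothetical flat record: the join key is a top-level field.
  static class Customer implements Serializable {
    final String id;
    final String name;
    Customer(String id, String name) { this.id = id; this.name = name; }
  }

  // Formats one joined pair; kept separate so it can be checked without a pipeline.
  static String describe(Order order, Customer customer) {
    return order.orderId + " -> " + customer.name;
  }

  // Inner join of the nested table with the flat table on the customer id.
  static PCollection<String> join(PCollection<Order> orders, PCollection<Customer> customers) {
    final TupleTag<Order> orderTag = new TupleTag<>();
    final TupleTag<Customer> customerTag = new TupleTag<>();

    // Key the nested table by a field inside its nested structure.
    PCollection<KV<String, Order>> keyedOrders = orders.apply("KeyOrders",
        WithKeys.of(new SerializableFunction<Order, String>() {
          @Override public String apply(Order o) { return o.customer.id; }
        }));
    // Key the flat table by its top-level field.
    PCollection<KV<String, Customer>> keyedCustomers = customers.apply("KeyCustomers",
        WithKeys.of(new SerializableFunction<Customer, String>() {
          @Override public String apply(Customer c) { return c.id; }
        }));

    // Group both keyed collections by key, then emit every matching pair.
    return KeyedPCollectionTuple.of(orderTag, keyedOrders)
        .and(customerTag, keyedCustomers)
        .apply("CoGroup", CoGroupByKey.create())
        .apply("EmitPairs", ParDo.of(new DoFn<KV<String, CoGbkResult>, String>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            CoGbkResult grouped = c.element().getValue();
            for (Order o : grouped.getAll(orderTag)) {
              for (Customer cu : grouped.getAll(customerTag)) {
                c.output(describe(o, cu));
              }
            }
          }
        }));
  }
}
```

Keys that appear on only one side produce an empty iterable for the other tag, so the nested loops above emit nothing for them. An outer join would need explicit handling of the empty case.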

If one table is significantly smaller than the other, feeding the smaller PCollection as a side input to a ParDo over the larger PCollection (side inputs are also covered in the Apache Beam programming guide) might be a better option.
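A rough sketch of the side-input variant, assuming the flat table is the small one and fits in worker memory. The classes `Order`, `CustomerRef` and `Customer`, their fields, and the `describe` helper are hypothetical. `View.asMap()` expects each key to occur once; if the small table can repeat keys, `View.asMultimap()` would be the closer fit.

```java
import java.io.Serializable;
import java.util.Map;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionView;

class SideInputJoin {
  // Hypothetical nested record: the join key lives at order.customer.id.
  static class CustomerRef implements Serializable {
    final String id;
    CustomerRef(String id) { this.id = id; }
  }
  static class Order implements Serializable {
    final String orderId;
    final CustomerRef customer;
    Order(String orderId, CustomerRef customer) { this.orderId = orderId; this.customer = customer; }
  }
  // Hypothetical flat (small) record.
  static class Customer implements Serializable {
    final String id;
    final String name;
    Customer(String id, String name) { this.id = id; this.name = name; }
  }

  // Formats one joined pair; kept separate so it can be checked without a pipeline.
  static String describe(Order order, Customer customer) {
    return order.orderId + " -> " + customer.name;
  }

  // Looks up each nested-table row in a map view built from the small table.
  static PCollection<String> join(PCollection<Order> orders, PCollection<Customer> customers) {
    // Turn the small table into a Map<id, Customer> side input.
    final PCollectionView<Map<String, Customer>> customersById = customers
        .apply("KeyCustomers", WithKeys.of(new SerializableFunction<Customer, String>() {
          @Override public String apply(Customer c) { return c.id; }
        }))
        .apply("AsMap", View.<String, Customer>asMap());

    return orders.apply("LookupCustomer", ParDo.of(new DoFn<Order, String>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        Order order = c.element();
        // The nested key is read directly inside the DoFn; no keyed PCollection is needed.
        Customer match = c.sideInput(customersById).get(order.customer.id);
        if (match != null) {
          c.output(describe(order, match)); // inner join: rows without a match are dropped
        }
      }
    }).withSideInputs(customersById));
  }
}
```

This avoids shuffling the large table, at the cost of materializing the small one for every worker.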
