My objective is to join two tables: the second table is a normal (flat) table, while the first has a nested structure, and the join key lives inside that nested structure. In this case, how can I join these two tables using Dataflow Java code? WithKeys (org.apache.beam.sdk.transforms.WithKeys) accepts a direct column name and does not allow something like firstTable.columnname. Could someone help me solve this case?
If both tables are roughly the same size, consider using the CoGroupByKey transform described in the Beam programming guide. You will have to transform your data into two PCollections keyed by the proper key before applying it. Note that WithKeys.of can take a SerializableFunction rather than a column name, so you can reach into the nested structure to extract the key.
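A minimal sketch of that approach, assuming two hypothetical POJOs: an `Order` whose join key sits inside a nested `Customer` field, and a flat `CustomerInfo` record keyed by the same id. The class names and getters are illustrative, not from the question:

```java
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.transforms.join.CoGbkResult;
import org.apache.beam.sdk.transforms.join.CoGroupByKey;
import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TupleTag;
import org.apache.beam.sdk.values.TypeDescriptors;

public class CoGroupJoin {
  public static PCollection<String> join(
      PCollection<Order> orders,            // nested-structure table
      PCollection<CustomerInfo> customers)  // flat table
  {
    // Key the nested table by reaching into the nested field with a
    // SerializableFunction -- this is what WithKeys.of(lambda) allows,
    // instead of a column name.
    PCollection<KV<String, Order>> keyedOrders =
        orders.apply("KeyOrders",
            WithKeys.of((Order o) -> o.getCustomer().getId())
                    .withKeyType(TypeDescriptors.strings()));

    // Key the flat table directly on its id column.
    PCollection<KV<String, CustomerInfo>> keyedCustomers =
        customers.apply("KeyCustomers",
            WithKeys.of((CustomerInfo c) -> c.getId())
                    .withKeyType(TypeDescriptors.strings()));

    // Tags identify each input inside the CoGbkResult.
    final TupleTag<Order> orderTag = new TupleTag<>();
    final TupleTag<CustomerInfo> customerTag = new TupleTag<>();

    PCollection<KV<String, CoGbkResult>> grouped =
        KeyedPCollectionTuple.of(orderTag, keyedOrders)
            .and(customerTag, keyedCustomers)
            .apply(CoGroupByKey.create());

    // Emit one record per matching (order, customer) pair: an inner join.
    return grouped.apply("FlattenJoin",
        ParDo.of(new DoFn<KV<String, CoGbkResult>, String>() {
          @ProcessElement
          public void process(ProcessContext c) {
            CoGbkResult result = c.element().getValue();
            for (Order o : result.getAll(orderTag)) {
              for (CustomerInfo info : result.getAll(customerTag)) {
                c.output(c.element().getKey() + "," + o + "," + info);
              }
            }
          }
        }));
  }
}
```

Iterating `getAll` for both tags yields an inner join; checking for an empty iterable on one side instead gives you left/right outer-join semantics.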
If one table is significantly smaller than the other, feeding the smaller PCollection as a side input to a ParDo over the larger PCollection, as described in the Beam side-input documentation, might be a better option.
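A sketch of the side-input variant under the same assumptions (hypothetical `Order` and `CustomerInfo` types): the small flat table is materialized as a `Map` view and looked up inside a DoFn over the large nested table:

```java
import java.util.Map;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionView;
import org.apache.beam.sdk.values.TypeDescriptors;

public class SideInputJoin {
  public static PCollection<String> join(
      PCollection<Order> orders,            // large, nested-structure table
      PCollection<CustomerInfo> customers)  // small, flat table
  {
    // Turn the small table into a Map<id, record> side input.
    // View.asMap requires a KV-typed PCollection with unique keys.
    PCollectionView<Map<String, CustomerInfo>> customerView =
        customers
            .apply("KeyCustomers",
                WithKeys.of((CustomerInfo c) -> c.getId())
                        .withKeyType(TypeDescriptors.strings()))
            .apply(View.asMap());

    // Stream over the large table; for each element, extract the key
    // from the nested structure and look it up in the side-input map.
    return orders.apply("LookupJoin",
        ParDo.of(new DoFn<Order, String>() {
          @ProcessElement
          public void process(ProcessContext c) {
            Map<String, CustomerInfo> lookup = c.sideInput(customerView);
            String key = c.element().getCustomer().getId();
            CustomerInfo info = lookup.get(key);
            if (info != null) {  // inner-join semantics; drop non-matches
              c.output(key + "," + c.element() + "," + info);
            }
          }
        }).withSideInputs(customerView));
  }
}
```

The side-input map is broadcast to every worker, so this only pays off when the flat table comfortably fits in worker memory.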