google-cloud-dataflow : How to read data from a Da

Posted 2019-09-16 08:50

I need to set up a data pipeline that reads from source databases such as Oracle and MySQL and loads the data into BigQuery.

How can I use google-cloud-dataflow to read data from a database (over a JDBC connection) and write it to BigQuery tables using Python?

Also, I have some Hive tables in an on-premises Hadoop cluster. How do I transfer this data to BigQuery?

I couldn't find documentation or examples that cover this. Can you please point me in the right direction?

1 answer
祖国的老花朵
Reply #2 · 2019-09-16 09:17

I used the following approach in my project. The steps are:

  1. Export the data from Google Cloud SQL to Google Cloud Storage as CSV by following this link.

  2. Load the CSV data from Google Cloud Storage directly into BigQuery by following this link.
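
For step 2, here is a minimal sketch of what the load could look like in Python with the google-cloud-bigquery client library. This is not necessarily what the linked guide does. The helper names `gcs_uri` and `load_csv_to_bigquery`, the `has_header` flag, and the use of schema auto-detection are my own assumptions for illustration. It also assumes the CSV has already been exported to a bucket and that application default credentials are configured.

```python
def gcs_uri(bucket: str, object_name: str) -> str:
    """Build the gs:// URI of an exported CSV file (hypothetical helper)."""
    return f"gs://{bucket}/{object_name.lstrip('/')}"


def load_csv_to_bigquery(uri: str, table_id: str, has_header: bool = False) -> int:
    """Run one BigQuery load job for a CSV file and return the table's row count.

    table_id is expected in the form "project.dataset.table".
    """
    # Imported lazily so gcs_uri() stays usable without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes application default credentials
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        # Assumption: skip one row only if the export was written with a header.
        skip_leading_rows=1 if has_header else 0,
        # Assumption: inferred column types are good enough; for anything
        # serious, pass an explicit schema instead.
        autodetect=True,
    )
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()  # blocks until the job finishes and raises on failure
    return client.get_table(table_id).num_rows
```

You would call it along the lines of `load_csv_to_bigquery(gcs_uri("my-export-bucket", "exports/customers.csv"), "my-project.my_dataset.customers")`, where the bucket, object, and table names are placeholders. Note that this is a plain BigQuery load job, so no Dataflow pipeline is involved in this step.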
