I have a bunch of files in Azure Blob Storage, and new ones arrive constantly. Is there a way to first move all the data I currently have in Blob Storage over to BigQuery, and then keep a script or job running so that any new data landing there also gets sent to BigQuery?
BigQuery supports querying data directly from these external data sources: Google Cloud Bigtable, Google Cloud Storage, and Google Drive. Azure Blob Storage is not among them. As Adam Lydick mentioned, as a workaround you could copy the data/files from Azure Blob Storage to Google Cloud Storage (or another external data source that BigQuery supports).
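For example, once the files are in GCS, BigQuery can query them in place through an external table. A minimal sketch with the google-cloud-bigquery Python client follows (the answers below use the .NET SDK; Python is used here only to illustrate the idea, and the bucket, project, dataset, and table names are placeholders):

```python
# Sketch: define a BigQuery external table over files already copied to GCS.
# Bucket, project, dataset, and table names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

external_config = bigquery.ExternalConfig("NEWLINE_DELIMITED_JSON")
external_config.source_uris = ["gs://my-gcs-bucket/imported-from-azure/*.json"]
external_config.autodetect = True  # infer the schema from the files

table = bigquery.Table("my-project.my_dataset.blob_data")
table.external_data_configuration = external_config
client.create_table(table)  # BigQuery now reads the GCS files in place
```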
To copy data from Azure Blob Storage to Google Cloud Storage, you can run WebJobs (or Azure Functions). A blob-triggered WebJob fires a function whenever a blob is created or updated; inside that function you can read the blob's content and write/upload it to Google Cloud Storage.
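The copy step itself is small. A rough Python equivalent of that function is sketched below (the answer above assumes a C# WebJob/Function; connection string, container, and bucket names here are placeholders):

```python
# Sketch of the copy step: download a blob from Azure, upload it to GCS.
from azure.storage.blob import BlobServiceClient
from google.cloud import storage

def copy_blob_to_gcs(blob_name: str) -> None:
    # Download the blob's bytes from Azure Blob Storage
    azure_client = BlobServiceClient.from_connection_string("<azure-connection-string>")
    blob_client = azure_client.get_blob_client(container="incoming", blob=blob_name)
    data = blob_client.download_blob().readall()

    # Upload the same bytes to a GCS bucket that BigQuery can read from
    gcs_bucket = storage.Client().bucket("my-gcs-bucket")
    gcs_bucket.blob(blob_name).upload_from_string(data)
```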
Note: you can install the Google.Cloud.Storage library to perform common storage operations from client code, and this blog post explains how to use the Google.Cloud.Storage SDK in Azure Functions.
I'm not aware of anything out-of-the-box (on Google's infrastructure) that can accomplish this.
I'd probably set up a tiny VM to:
- scan your Azure Blob container for new content,
- copy new content into GCS (or straight into BigQuery), and
- keep track of what has already been copied,

as in the sketch after this list.
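A minimal polling sketch of that script, assuming the azure-storage-blob and Google Cloud Python clients; every name (connection string, container, bucket, dataset/table) is a placeholder, and the in-memory "seen" set would need durable storage in practice:

```python
# Tiny-VM sketch: poll Azure Blob Storage, copy new blobs to GCS, load into BigQuery.
import time
from azure.storage.blob import BlobServiceClient
from google.cloud import bigquery, storage

azure = BlobServiceClient.from_connection_string("<azure-connection-string>")
container = azure.get_container_client("incoming")
gcs_bucket = storage.Client().bucket("my-gcs-bucket")
bq = bigquery.Client()

seen = set()  # in production, persist this (e.g. a state file or small table)

while True:
    for props in container.list_blobs():
        if props.name in seen:
            continue
        # Copy the new blob into GCS
        data = container.get_blob_client(props.name).download_blob().readall()
        gcs_bucket.blob(props.name).upload_from_string(data)
        # Load it into BigQuery from GCS (assumes newline-delimited JSON)
        job_config = bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
            autodetect=True,
        )
        bq.load_table_from_uri(
            f"gs://my-gcs-bucket/{props.name}",
            "my-project.my_dataset.blob_data",
            job_config=job_config,
        ).result()
        seen.add(props.name)
    time.sleep(60)  # poll every minute
```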
If you used GCS instead of Azure Blob Storage, you could eliminate the VM and just have a Cloud Function that is triggered whenever a new item is added to your GCS bucket (assuming your blobs are in a format BigQuery knows how to read). I presume this is part of an existing solution that you'd prefer not to modify, though.
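For completeness, a sketch of that Cloud Function variant in Python, assuming a background function deployed with a google.storage.object.finalize trigger on the bucket; the dataset/table name is a placeholder and the files are assumed to be newline-delimited JSON:

```python
# Sketch: GCS-triggered Cloud Function that loads each new object into BigQuery.
from google.cloud import bigquery

def load_to_bigquery(event, context):
    """Runs on google.storage.object.finalize for the configured bucket."""
    uri = f"gs://{event['bucket']}/{event['name']}"
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,
    )
    bigquery.Client().load_table_from_uri(
        uri, "my-project.my_dataset.blob_data", job_config=job_config
    ).result()
```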