An efficient way of exporting 10 datasets (each has 80 tables)

Posted 2020-07-22 06:57

Question:

I have 10 datasets in BigQuery, and each dataset has 80 tables. I know I can export each table in each dataset one by one to Google Cloud Storage using the console or web UI. This is for backup purposes. However, this would take a while.

Is there a more convenient way to handle this?

Answer 1:

You can do it using the bq command-line tool (https://cloud.google.com/bigquery/bq-command-line-tool).

The following script will help you achieve this:

project="bigquery-project"
dataset="your-dataset"
bucket="your-bucket"

# Get the list of tables in the dataset.
# -n 1000 raises bq ls's default cap of 50 results so all 80 tables are listed;
# tail -n +3 skips the two header lines of the bq ls output.
tables=$(bq ls -n 1000 "$project:$dataset" | awk '{print $1}' | tail -n +3)

for table in $tables
do
    bq extract --destination_format "CSV" --compression "GZIP" "$project:$dataset.$table" "gs://$bucket/$table/$table.csv.gz"
done
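The script above covers a single dataset. Since the question involves 10 datasets, here is a minimal sketch of an outer loop over every dataset in the project (an addition, not part of the original answer; it assumes bq ls on the bare project id lists its datasets, and it writes each dataset to its own prefix in the bucket to avoid name collisions):

# List all datasets in the project (same two header lines to skip).
datasets=$(bq ls -n 1000 "$project:" | awk '{print $1}' | tail -n +3)

for dataset in $datasets
do
    tables=$(bq ls -n 1000 "$project:$dataset" | awk '{print $1}' | tail -n +3)
    for table in $tables
    do
        # Prefix the object path with the dataset name so tables with the
        # same name in different datasets don't overwrite each other.
        bq extract --destination_format "CSV" --compression "GZIP" \
            "$project:$dataset.$table" "gs://$bucket/$dataset/$table/$table.csv.gz"
    done
done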

Additionally, if you want to run the command asynchronously, replace the extract command with the following:

    bq --nosync extract --destination_format "CSV" --compression "GZIP" "$project:$dataset.$table" "gs://$bucket/$table/$table.csv.gz" 
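With --nosync, bq submits the extract job and returns immediately, so the jobs run in parallel on the BigQuery side. If you then need to know when a job finishes, one sketch is to capture its job id and block on it with bq wait (this assumes the job id is the last whitespace-separated token in bq's confirmation output, which may vary by version):

# Submit asynchronously and capture the job id from bq's output.
# Assumption: the confirmation line ends with the job id.
job_id=$(bq --nosync extract --destination_format "CSV" --compression "GZIP" \
    "$project:$dataset.$table" "gs://$bucket/$table/$table.csv.gz" | awk '{print $NF}')

# Later, block until that job completes (bq wait polls its status).
bq wait "$job_id"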

PS:

  • Make sure that the bq command-line tool is installed and on your PATH. It ships with the Google Cloud SDK. To verify, you can run bq help (see the quick checks below).
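For example, a couple of quick checks (nothing project-specific assumed here):

which bq      # prints the path to the binary if it is on your PATH
bq version    # prints the installed bq CLI version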