Exporting a large file from BigQuery to Google Cloud Storage

Published 2019-08-26 09:59

Question:

I have an 8 GB table in BigQuery that I'm trying to export to Google Cloud Storage (GCS). If I specify the destination URI as a single file, I get an error:

Errors:
Table gs://***.large_file.json too large to be exported to a single file. Specify a uri including a * to shard export. See 'Exporting data into one or more files' in https://cloud.google.com/bigquery/docs/exporting-data. (error code: invalid)

Okay... so I specify * in the file name, but it exports to only two files: one of 7.13 GB and one of ~150 MB.

UPD: I thought I would get about 8 files of roughly 1 GB each. Am I wrong, or what am I doing wrong?

P.S. I tried this in the web UI as well as with the Java client library.
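For reference, the same sharded export can be run from the command line; a minimal sketch with the `bq` CLI, where the dataset, table, and bucket names are placeholders:

```shell
# Hypothetical dataset/table/bucket names.
# The '*' in the destination URI tells BigQuery it may shard the
# export into multiple files, which is required for large tables.
bq extract \
  --destination_format NEWLINE_DELIMITED_JSON \
  'mydataset.large_table' \
  'gs://my-bucket/large_file_*.json'
```

BigQuery replaces the `*` with a sequence number in each exported file's name.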

Answer 1:

For tables of a certain size or larger, BigQuery will export to multiple GCS files - that's why it asks for the "*" glob.

Once you have multiple files in GCS, you can join them into one with the compose operation:

gsutil compose gs://bucket/obj1 [gs://bucket/obj2 ...] gs://bucket/composite
  • https://cloud.google.com/storage/docs/gsutil/commands/compose
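Putting the two together, stitching BigQuery's sharded output back into a single object might look like the sketch below. The bucket and shard file names are assumptions (BigQuery numbers shards with a zero-padded sequence), and note that `gsutil compose` accepts at most 32 source objects per call, so larger exports need to be composed in batches:

```shell
# Assumed bucket and shard names produced by an export to
# gs://my-bucket/large_file_*.json; compose them into one object.
gsutil compose \
  gs://my-bucket/large_file_000000000000.json \
  gs://my-bucket/large_file_000000000001.json \
  gs://my-bucket/large_file.json
```

Since the export format here is newline-delimited JSON, simple concatenation of the shards yields a valid combined file.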