Datalab does not populate BigQuery tables

Published 2019-08-15 03:36

Hi, I have a problem using IPython notebooks on Datalab.

I want to write the result of a query into a BigQuery table, but it does not work. Everyone says to use the insert_data(dataframe) function, yet it does not populate my table. To simplify the problem, I tried to read a table and write it back to a just-created table (with the same schema), but it still does not work. Can anyone tell me where I am going wrong?

import gcp
import gcp.bigquery as bq

# Read some sample data into a dataframe.
df = bq.Query('SELECT 1 as a, 2 as b FROM [publicdata:samples.wikipedia] LIMIT 3').to_dataframe()

# Create a dataset and extract the schema from the dataframe.
dataset = bq.DataSet('prova1')
dataset.create(friendly_name='aaa', description='bbb')
schema = bq.Schema.from_dataframe(df)

# Create the destination table with the same schema.
temptable = bq.Table('prova1.prova2').create(schema=schema, overwrite=True)

# Try to put the same data into the table just created.
temptable.insert_data(df)
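
This is how the symptom shows up (a minimal check, reusing the same table name as above): reading the new table straight back right after the insert comes back empty, even though insert_data returned without an error.

# Read the table back immediately after insert_data; no rows show up.
check = bq.Query('SELECT * FROM [prova1.prova2]').to_dataframe()
print(len(check))  # 0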

1 Answer
戒情不戒烟 · 2019-08-15 04:14

Calling insert_data will do an HTTP POST and return once that is done. However, it can take some time for the data to show up in the BQ table (up to several minutes). Try waiting a while before using the table. We may be able to address this in a future update, see this

The hacky way to block until the data is ready right now should be something like:

import time

while True:
  info = temptable._api.tables_get(temptable._name_parts)
  # Once the streaming buffer is gone, all rows have been committed.
  if 'streamingBuffer' not in info:
    break
  # estimatedRows is returned as a string in the API response.
  if int(info['streamingBuffer']['estimatedRows']) > 0:
    break
  time.sleep(5)
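
Once the loop exits, a plain query is a simple sanity check that the rows are visible (a sketch reusing the question's table and legacy SQL):

# Count the rows that are now queryable; this should report the
# 3 rows inserted above once the data has landed.
count_df = bq.Query('SELECT COUNT(*) AS n FROM [prova1.prova2]').to_dataframe()
print(count_df['n'][0])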