Query Streaming Data Using --time_partitioning_fie

2019-07-23 15:29发布

问题:

I am currently exploring how to query only the streaming buffer data in tables at regular intervals for generating a performance report at near real-time and found the following StackOverflow link:

How to query for data in streaming buffer ONLY in BigQuery?

However, the current type of partition is implemented using --time_partitioning_field

Using the following query forces to query all data from the table:

SELECT * FROM `<project>.<data-set>.<time-partitioned-streaming-table>`
where <time-partitioning-field> is null

The query doesn't show any difference as ideally the peak streaming buffer is @ ~60MB per hour

Is there a way to query only the streaming data with this type of partition?

回答1:

I believe this should work (but it is legacy SQL)

#standardSQL
 CREATE TABLE test.newtable (transaction_id INT64, transaction_date DATE)
 PARTITION BY transaction_date
 OPTIONS(
   partition_expiration_days=3,
   description="a table partitioned by transaction_date"
 )

#legacySQL
select * from [test.newtable$__UNPARTITIONED__] 


回答2:

It is not possible to query streaming buffer data for partioned tables because once a specific TIMESTAMP or DATE has been defined, data is "streamed directly to the partition".

Checking the official documentation you can also find the solution for ingestion-time partitioned tables mentioned in the link you posted.