BigQuery delete on streaming table

Published 2019-08-23 04:35

Question:

What is the best practice for deleting rows from a BigQuery streaming table? My idea is to create a daily partitioned table and then delete the data in partitions older than the current day. Do you think that might work?

Answer 1:

I tried my solution and it works! Rows can be deleted by adding a WHERE clause such as _PARTITIONTIME < '%date_in_the_past%' to the DELETE statement.
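For illustration, here is a minimal Python sketch of how such a statement could be built and run. The table name `my_dataset.events` and the helper names are hypothetical, and `run_delete` assumes the `google-cloud-bigquery` package and default credentials are available. Limiting the filter to past partitions also keeps the statement away from recently streamed rows, which, as far as I know, BigQuery DML has not been able to modify.

```python
from datetime import date, timedelta


def build_delete_sql(table: str, cutoff: date) -> str:
    """Return a DELETE statement for partitions strictly older than cutoff."""
    # _PARTITIONTIME is the pseudo-column of ingestion-time partitioned tables.
    return (
        f"DELETE FROM `{table}` "
        f"WHERE _PARTITIONTIME < TIMESTAMP('{cutoff.isoformat()}')"
    )


def run_delete(table: str, cutoff: date) -> None:
    """Execute the statement (assumes google-cloud-bigquery and credentials)."""
    # Imported lazily so build_delete_sql works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    client.query(build_delete_sql(table, cutoff)).result()  # wait for the job


# Hypothetical table name: delete everything before yesterday's partition.
yesterday = date.today() - timedelta(days=1)
print(build_delete_sql("my_dataset.events", yesterday))
```

Calling `run_delete("my_dataset.events", yesterday)` from a daily scheduled job would then reproduce the manual cleanup described in this answer.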



Answer 2:

Another option (to avoid manual deletion) is to use partition expiration.
You can specify a partition expiration for a partitioned table (timePartitioning.expirationMs):

Number of milliseconds for which to keep the storage for a partition.

Note: When you set a table's partition expiration time, you must calculate the partition expiration based on the partition's date. For example, if the partition's date is January 3, 2018, and you set the partition expiration time to 5 days, the partition expires on January 8, 2018, regardless of when it was last updated.
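The arithmetic in the note can be sketched in Python as follows. The helper names are made up for this example, and `set_partition_expiration` is only an assumption about how the setting might be applied with the `google-cloud-bigquery` client; it is not taken from the answer.

```python
from datetime import date, timedelta

MS_PER_DAY = 24 * 60 * 60 * 1000


def days_to_expiration_ms(days: int) -> int:
    """Convert a retention period in days to a timePartitioning.expirationMs value."""
    return days * MS_PER_DAY


def partition_expiry(partition_date: date, expiration_ms: int) -> date:
    """Expiry is counted from the partition's date, not from its last update."""
    return partition_date + timedelta(milliseconds=expiration_ms)


def set_partition_expiration(table_id: str, days: int) -> None:
    """Apply the expiration to a table (assumes google-cloud-bigquery and credentials)."""
    from google.cloud import bigquery  # lazy import; the helpers above need only stdlib

    client = bigquery.Client()
    table = client.get_table(table_id)
    table.time_partitioning = bigquery.TimePartitioning(
        type_=bigquery.TimePartitioningType.DAY,
        expiration_ms=days_to_expiration_ms(days),
    )
    client.update_table(table, ["time_partitioning"])


# The example from the note: a January 3, 2018 partition kept for 5 days.
print(partition_expiry(date(2018, 1, 3), days_to_expiration_ms(5)))  # 2018-01-08
```

With an expiration in place, old partitions are dropped by BigQuery itself, so no scheduled DELETE job is needed.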