BigQuery - GROUP BY clause not working with NEST()

Posted 2019-08-08 06:37

Question:

Here is a reproduction using BigQuery's public sample data:

SELECT corpus, NEST(word) 
FROM [publicdata:samples.shakespeare] 
GROUP BY corpus 
LIMIT 1000

Row  corpus        f0_
1    1kinghenryiv  brave
2    1kinghenryiv  profession
3    1kinghenryiv  treason

Can someone tell me what I am doing wrong?

Answer 1:

Nothing is wrong. Per https://cloud.google.com/bigquery/query-reference#aggfunctions:

BigQuery automatically flattens query results, so if you use the NEST function on the top level query, the results won't contain repeated fields. Use the NEST function when using a subselect that produces intermediate results for immediate use by the same query.

The number of returned rows confirms this: the query specifies LIMIT 1000, but the result contains 41852 rows, because it has been flattened.

You can also run the query below to see that NEST() actually works:

SELECT corpus, COUNT(1) AS cnt 
FROM (
  SELECT corpus, NEST(word) 
  FROM [publicdata:samples.shakespeare] 
  GROUP BY corpus 
  LIMIT 1000
)
GROUP BY corpus
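
To make the flattening behaviour concrete, here is a minimal Python sketch of the idea. It is only a toy model, not how BigQuery is implemented: the function names nest_by_corpus and flatten are hypothetical, and the "hamlet" rows are made-up sample data (only the three 1kinghenryiv rows come from the output shown in the question). It groups words into one record per corpus, then expands each record back into one row per word, which is why a top-level NEST() query appears to return ungrouped rows.

```python
from collections import defaultdict

# Toy (corpus, word) rows standing in for the shakespeare sample table.
# The "hamlet" rows are invented purely for illustration.
rows = [
    ("1kinghenryiv", "brave"),
    ("1kinghenryiv", "profession"),
    ("1kinghenryiv", "treason"),
    ("hamlet", "brave"),
    ("hamlet", "treason"),
]


def nest_by_corpus(rows):
    """Model GROUP BY corpus with NEST(word): one record per corpus,
    holding a list (repeated field) of its words."""
    nested = defaultdict(list)
    for corpus, word in rows:
        nested[corpus].append(word)
    return dict(nested)


def flatten(nested):
    """Model result flattening: emit one output row per element
    of the repeated field."""
    return [(corpus, word) for corpus, words in nested.items() for word in words]


nested = nest_by_corpus(rows)   # what the subselect produces internally
flat = flatten(nested)          # what a top-level query would display
counts = {corpus: len(words) for corpus, words in nested.items()}

print(len(nested))  # 2 grouped records
print(len(flat))    # 5 rows after flattening
print(counts)       # {'1kinghenryiv': 3, 'hamlet': 2}
```

In this model, the grouped result has two records, but the flattened output has five rows, one per nested word. The per-corpus counts correspond to what the outer COUNT(1) ... GROUP BY corpus query above reports.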