Here is a reproduction using their public test data:
SELECT corpus, NEST(word)
FROM [publicdata:samples.shakespeare]
GROUP BY corpus
LIMIT 1000
Row  corpus        f0_
1    1kinghenryiv  brave
2    1kinghenryiv  profession
3    1kinghenryiv  treason
Can someone tell me what I am doing wrong?
You are not doing anything wrong.
Per the documentation at https://cloud.google.com/bigquery/query-reference#aggfunctions:
BigQuery automatically flattens query results, so if you use the NEST
function on the top level query, the results won't contain repeated
fields. Use the NEST function when using a subselect that produces
intermediate results for immediate use by the same query.
The number of rows returned confirms this: the query says LIMIT 1000, but the result contains 41852 rows because it has been flattened.
You can also run the query below to see that NEST() actually works:
SELECT corpus, COUNT(1) AS cnt
FROM (
SELECT corpus, NEST(word)
FROM [publicdata:samples.shakespeare]
GROUP BY corpus
LIMIT 1000
)
GROUP BY corpus
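To make the flattening behaviour concrete, here is a small Python sketch of what appears to be happening. It is only a toy model, not BigQuery itself. The sample rows, the helper names `nest_by_corpus` and `flatten`, and the `limit` value are all made up for illustration. It also assumes the limit is applied to the nested rows before the result is flattened, which is what the row counts above suggest.

```python
from collections import defaultdict

# Toy (corpus, word) rows standing in for the real table; not the actual dataset.
rows = [
    ("1kinghenryiv", "brave"),
    ("1kinghenryiv", "profession"),
    ("1kinghenryiv", "treason"),
    ("hamlet", "ghost"),
    ("hamlet", "skull"),
]


def nest_by_corpus(rows):
    """Collect words under their corpus, roughly like GROUP BY corpus with NEST(word)."""
    nested = defaultdict(list)
    for corpus, word in rows:
        nested[corpus].append(word)
    # One (corpus, [words...]) entry per corpus, in first-seen order.
    return list(nested.items())


def flatten(nested_rows):
    """Expand each repeated field into one output row per element."""
    return [(corpus, word) for corpus, words in nested_rows for word in words]


limit = 1  # hypothetical stand-in for LIMIT 1000

# Assumption: the limit is applied to the nested rows, before flattening.
nested = nest_by_corpus(rows)[:limit]
flat = flatten(nested)

print(len(nested))  # number of nested rows kept by the limit
print(len(flat))    # number of rows after flattening
```

With this toy data the limit keeps a single nested row, yet the flattened output has three rows. That is the same effect as getting far more than 1000 rows back from a LIMIT 1000 query.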