The table I am working with has 3 fields:
userId, timestamp, version
I am running the following query:
SELECT userId, MAX(version) AS current_version FROM my_table GROUP EACH BY userId;
The response I get is:
"errors": [
{
"reason": "responseTooLarge",
"message": "Response too large to return."
}
]
The size of the table is 644MB and it has 12,279,432 rows.
I thought GROUP EACH BY did not have result size restrictions because it is distributed across multiple nodes. What can I do about it?
According to the comments, the user base is over 17 million users. Since the query groups by userId, the response will have one row per user, so at least 17 million rows. That is too large to return as a query response.
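If you want to confirm how many rows the grouped query would produce before deciding what to do, a small counting query may help. This is only a sketch in legacy SQL, assuming the same my_table and userId names from the question; it returns a single row, so it should not hit the response size limit.

```sql
-- Legacy SQL sketch: count the distinct users in the table.
-- This matches the number of rows the grouped query would return,
-- ignoring a possible extra group for NULL userId values.
-- my_table is the placeholder table name used in the question.
SELECT EXACT_COUNT_DISTINCT(userId) AS user_count
FROM my_table;
```

If user_count comes back in the millions, the grouped result itself is the problem, not the way the aggregation is distributed.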
The right query depends on your goal. Do you really want a 17-million-row answer? Or do you only care about MAX(version) for a particular set of users?
For example: