Is it possible to calculate the sum of JSON values grouped by JSON keys?
The MySQL version is 5.7.17 on Google Cloud SQL.
Example_1: A short example of my point:
col1 | col2
-----|-----------------------
aaa | {"key1": 1, "key2": 3}
-----|-----------------------
bbb | {"key1": 0, "key2": 2}
-----|-----------------------
aaa | {"key1": 50, "key2": 0}
SQL query should produce:
col1 | col2
-----|-----------------------
aaa | {"key1": 51, "key2": 3}
-----|-----------------------
bbb | {"key1": 0, "key2": 2}
Or would it be possible with any of the schemas below?
Example_2:
col1 | col2
-----|-----------------------
aaa | {{"key_name" : "key1", "key_value" : 1}, {"key_name" : "key2", "key_value" : 3}}
-----|-----------------------
bbb | {{"key_name" : "key1", "key_value" : 0}, {"key_name" : "key2", "key_value" : 2}}
-----|-----------------------
aaa | {{"key_name" : "key1", "key_value" : 50}, {"key_name" : "key2", "key_value" : 0}}
Example_3:
col1 | col2
-----|-----------------------
aaa | [{"key_name" : "key1", "key_value" : 1}, {"key_name" : "key2", "key_value" : 3}]
-----|-----------------------
bbb | [{"key_name" : "key1", "key_value" : 0}, {"key_name" : "key2", "key_value" : 2}]
-----|-----------------------
aaa | [{"key_name" : "key1", "key_value" : 50}, {"key_name" : "key2", "key_value" : 0}]
Example_4:
col1 | col2
-----|-----------------------
aaa | {"key1": {"key_name" : "key1", "key_value" : 1}, "key2": {"key_name" : "key2", "key_value" : 3}}
-----|-----------------------
bbb | {"key1": {"key_name" : "key1", "key_value" : 0}, "key2": {"key_name" : "key2", "key_value" : 2}}
-----|-----------------------
aaa | {"key1": {"key_name" : "key1", "key_value" : 50}, "key2": {"key_name" : "key2", "key_value" : 0}}
TL;DR: yes, it can be done without knowing the key names in advance, and none of the alternate data formats has any advantage over the original.
This can be done without knowing the key names in advance but it is painful... basically you have to look at every value in the table to determine the set of distinct keys in the table before you can sum them. Because of this requirement, and the fact that the alternate data formats can all have multiple keys per entry, there is no advantage to using any of them.
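To make the scan described above concrete, here is a minimal sketch in Python rather than MySQL. It is only an illustration of the logic, not the stored function or procedure from this answer; the names merge_sum and sum_by_group are made up for the example, and the rows are the ones from Example_1. Keys are discovered as each row is read, so no key list is needed in advance.

```python
import json

def merge_sum(a, b):
    """Merge two flat JSON objects, adding the values of keys present in both."""
    merged = dict(a)
    for key, value in b.items():
        # A key seen for the first time starts from 0.
        merged[key] = merged.get(key, 0) + value
    return merged

def sum_by_group(rows):
    """Fold every row into a running total per col1 value (the GROUP BY step)."""
    totals = {}
    for col1, col2 in rows:
        totals[col1] = merge_sum(totals.get(col1, {}), json.loads(col2))
    return totals

# Rows from Example_1 (hypothetical in-memory stand-in for the table).
rows = [
    ("aaa", '{"key1": 1, "key2": 3}'),
    ("bbb", '{"key1": 0, "key2": 2}'),
    ("aaa", '{"key1": 50, "key2": 0}'),
]

result = sum_by_group(rows)
for col1, total in result.items():
    print(col1, json.dumps(total))
```

Every row has to be parsed once to find its keys, which is why the sums can be accumulated in the same pass.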
Since you have to look for all the distinct keys, it is just as easy to do the sums while you are looking for them. This function and procedure together will do that. The function, json_merge_sum, takes two JSON values and merges them, summing the values where a key appears in both values, e.g.
Output:
The function code:
The procedure, count_keys, performs the equivalent of the GROUP BY clause. It finds all the distinct values of col1 in the table and then calls json_merge_sum for each row which has that value of col1. Note that the row select query performs a SELECT ... INTO a dummy variable so that no output is generated, and uses MIN() to ensure there is only one result (so that it can be assigned to a variable).
The procedure:
For a slightly larger example:
CALL count_keys() produces:
Note I've called the table table2 in the procedure; you will need to edit that (in both queries) to suit.
I believe that something like this could work.
SQL for the "short example":
Solution for Example_3:
You can run it here
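As an illustration of how the Example_3 layout can be handled, here is a small Python sketch (not SQL, and not the query from this answer). It assumes each col2 value is an array of objects with key_name and key_value fields, as in the question; the helper name flatten is made up for the example. Each array is first flattened into a plain key/value object, after which the grouping is the same as for the short example.

```python
import json
from collections import Counter, defaultdict

def flatten(col2):
    """Turn [{"key_name": k, "key_value": v}, ...] into {k: v, ...}."""
    return {item["key_name"]: item["key_value"] for item in json.loads(col2)}

# Rows from Example_3 (hypothetical in-memory stand-in for the table).
rows = [
    ("aaa", '[{"key_name": "key1", "key_value": 1}, {"key_name": "key2", "key_value": 3}]'),
    ("bbb", '[{"key_name": "key1", "key_value": 0}, {"key_name": "key2", "key_value": 2}]'),
    ("aaa", '[{"key_name": "key1", "key_value": 50}, {"key_name": "key2", "key_value": 0}]'),
]

# One Counter per col1 value; update() with a mapping adds the values per key.
totals = defaultdict(Counter)
for col1, col2 in rows:
    totals[col1].update(flatten(col2))

# Convert to plain dicts for printing or serialising.
result = {col1: dict(counter) for col1, counter in totals.items()}
for col1, total in result.items():
    print(col1, json.dumps(total))
```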