How can I efficiently search JSON data in a MySQL database?
I installed the JSON UDFs (json_extract) from labs.mysql.com and played around with a test table of 2,750,000 entries:
CREATE TABLE `testdb`.`JSON_TEST_TABLE` (
`AUTO_ID` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`OP_ID` INT NULL,
`JSON` LONGTEXT NULL,
PRIMARY KEY (`AUTO_ID`)) $$
An example JSON field looks like this:
{"ts": "2014-10-30 15:08:56 (9400.223725848107) ", "operation": "1846922"}
I found that putting json_extract into a SELECT list has virtually no performance impact; i.e., the following two selects have (almost) the same performance:
SELECT * FROM JSON_TEST_TABLE where OP_ID=2000000 LIMIT 10;
SELECT OP_ID, json_extract(JSON, "ts") ts, json_extract(JSON, "operation") operation FROM JSON_TEST_TABLE where OP_ID=2000000 LIMIT 10;
However, as soon as I put a json_extract expression into the WHERE clause, execution time increases by a factor of ten or more (from 2.5 to 30 seconds):
SELECT OP_ID, json_extract(JSON, "ts") ts, json_extract(JSON, "operation") operation FROM JSON_TEST_TABLE where json_extract(JSON, "operation")=2000000 LIMIT 10;
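For what it's worth, this is roughly how I compared the two cases with EXPLAIN (note that my table has no index on OP_ID either, so both queries scan; my assumption is that the slowdown comes from calling the UDF, which parses the whole LONGTEXT value, once per scanned row):

-- Plain-column predicate: still a scan (no index on OP_ID), but cheap per row
EXPLAIN SELECT * FROM JSON_TEST_TABLE WHERE OP_ID = 2000000 LIMIT 10;

-- UDF predicate: same scan, but json_extract() must parse the JSON
-- text of every row before the comparison can be evaluated
EXPLAIN SELECT * FROM JSON_TEST_TABLE
 WHERE json_extract(JSON, "operation") = 2000000 LIMIT 10;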
At this point I am thinking that I need to extract everything I want to search on into separate columns at insert time, and that if I really have to search inside the JSON data, I should first narrow down the candidate rows by other criteria. But I would like to make sure I am not missing anything obvious. Can I somehow index the JSON fields? Or is my SELECT statement written inefficiently?
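To make the question concrete, the extract-at-insert workaround I have in mind would look roughly like this (OPERATION is a made-up column name, and the one-time backfill via the UDF is my assumption about how existing rows would be migrated):

-- Materialize the searched value into a real, indexable column
ALTER TABLE JSON_TEST_TABLE
  ADD COLUMN OPERATION INT NULL,
  ADD INDEX idx_operation (OPERATION);

-- One-time backfill of existing rows (slow, but only once)
UPDATE JSON_TEST_TABLE
   SET OPERATION = json_extract(JSON, "operation");

-- Searches can then use the index instead of parsing JSON per row
SELECT * FROM JSON_TEST_TABLE WHERE OPERATION = 2000000 LIMIT 10;

Is something along these lines really the only option, or is there a way to index into the JSON text directly?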