SQLite trigger optimization

2019-08-01 14:26发布

问题:

This is a follow up based on this question about query optimization.

In order to make fast selection, as suggested, I tried to pre-compute some data at insertion time using a trigger.

Basically, I want to keep the number of occurrences of a given column's value into a given table.

The following schema is used to store the occurrences for each of the values:

CREATE TABLE valuecount (value text, count int)
CREATE INDEX countidx ON t (count DESC)
CREATE UNIQUE INDEX valueidx ON valuecount (value);

And here is the trigger I've been testing with:

CREATE TRIGGER counttrigger INSERT ON collection
  FOR EACH ROW
    BEGIN
      INSERT OR REPLACE INTO valuecount VALUES 
      (NEW.value, coalesce((SELECT count + 1 FROM count WHERE valuecount.query LIKE 
       NEW.value), 1));
    END;

A typical insertion on the collection table contains a few hundreds thousands of rows, in a single INSERT INTO ... SELECT ... statement (I'm merging data from a db to another).

The problem is that with this trigger, inserting 300K rows takes 4 hours, instead of 300 ms ...

Is there a way to achieve this without sacrificing performances?

Thanks in advance for your help,

回答1:

Since importing from one database to another is not as common a task as other inserts, can you defer the logic provided by the trigger until the import is complete? Looking at the nature of the trigger, it seems you could do a COUNT for each query type after all the data is imported. This would be significantly less queries overall.

If this is not possible, is there a way to avoid using LIKE? This operation is more expensive than a direct comparison.



回答2:

Ok so it seemed that using a trigger wasn't a good idea afterall.

Since I'm merging a table from different databases, I did the upsert into the valuecount table based on the content of the table I'm merging, all in one statement, before the insert.

So instead of a trigger, I just have a query that looks like this, executed prior to my INSERT INTO ... SELECT ... statement:

INSERT OR REPLACE INTO valuecount
SELECT value, coalesce((SELECT count FROM valuecount WHERE valuecount.value = collection.value) + COUNT(value), COUNT(value))
FROM attached.collection GROUP BY value