I have two simple tables:
CREATE TABLE cat_urls (
Id int(11) NOT NULL AUTO_INCREMENT,
SIL_Id int(11) NOT NULL,
SiteId int(11) NOT NULL,
AsCatId int(11) DEFAULT NULL,
Href varchar(2048) NOT NULL,
ReferrerHref varchar(2048) NOT NULL DEFAULT '',
AddedOn datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
GroupId int(11) DEFAULT NULL,
PRIMARY KEY (Id),
INDEX SIL (SIL_Id, AsCatId)
)
CREATE TABLE products (
Id int(11) NOT NULL AUTO_INCREMENT,
CatUrlId int(11) NOT NULL,
Href varchar(2048) NOT NULL,
SiteIdentity varchar(2048) NOT NULL,
Price decimal(12, 2) NOT NULL,
IsAvailable bit(1) NOT NULL,
ClientCode varchar(256) NOT NULL,
PRIMARY KEY (Id),
INDEX CatUrl (CatUrlId)
)
And I have pretty simple query:
SELECT cu.Href, COUNT(p.CatUrlId) FROM cat_urls cu
JOIN products p ON p.CatUrlId=cu.Id
WHERE sil_id=4601038
GROUP by cu.Id
EXPLAIN says:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cu ref PRIMARY,SIL SIL 4 const 303 Using where; Using temporary; Using filesort
1 SIMPLE p ref CatUrl CatUrl 4 blue_collar_logs.cu.Id 6 Using index
Please tell me is there any way to fix "Using where; Using temporary; Using filesort" and improve perfomance of this query?
It looks that, for some reason,
MySQL
chooses to use the indexSIL
on the first table and it uses it both for lookup (WHERE sil_id = 4601038
) and grouping (GROUP BY cu.Id
).You can tell it to use the
PK
of the tableand it will produce this execution plan:
Ignore the values reported in column
rows
; they are not correct because my tables are empty.Notice the
Extra
column now contains onlyUsing where
but also notice that the jointype
column changed fromref
(very good) toindex
(full index scan, not quite good).A better solution is to add an index on column
SIL_Id
. I know,SIL_Id
is a prefix of indexSIL(SIL_Id, AsCatId)
and in theory another index on columnSIL_Id
is completely useless. But it seems it solves the issue on this case.Now use it in the query:
The query execution plan looks much better now:
The drawback is that we have an extra index that is (theoretically) useless. It occupies storage space and it consumes processor cycles every time a row is added, deleted or have its
SIL_Id
field modified.