I have a single table with four columns...
`id` INT(11) NOT NULL AUTO_INCREMENT
`tid` INT(11) NOT NULL
`cid` INT(11) NOT NULL
`name` NVARCHAR(4096) NULL DEFAULT NULL
id is the unique primary key. The other columns are not unique.
I want to return the list of all id values that have specific tid and cid values and are sorted by the name. So this...
select id
from myTable
where cid = 1 && tid = 1
order by name
There are about 125k records in the table and there should be around 50k that happen to match this criteria. All four columns have individual indexes.
On my machine the query takes around 140ms to run. I need to get this down to around 20ms or better. I thought the solution was to add a new covering index that is defined against cid, tid and name, in that order. Did not make any difference though.
Any ideas? Is my covering index incorrectly setup?
I think there're some problems with the query and the table definition itself.
Table.name
is a 4K char column
- The query is sorting by that column
You're sorting based on a column in which you're storing strings. In order to sort by strings, string comparisons have to be performed. String comparison tends to be a slow operation and, given the size of the column you're using, it's very likely to cause a noticeable performance hit.
We don't have an indication of the contents of your name
column and it seems difficult to think of an actual name that would require that many characters.
If this string has several pieces of data that are conceptually different, perhaps the column should be broken down into multiple separate columns, if possible, and then normalized as appropriate.
If you can break the contents of that column into multiple smaller ones and then use those, the string comparisons, although still expensive, would be 'faster' simply because the strings being compared will be significantly shorter than what they're now.
Another thing to consider is if you can optimize the search by avoiding string comparisons altogether or by avoiding queries that will cause a full table scan despite the fact that you've defined indices.
For that you should look at using explain
with your query, so that you can get a better understanding of the Query Execution Plan
Quoting the docs (my emphasis):
Depending on the details of your tables, columns, indexes, and the
conditions in your WHERE clause, the MySQL optimizer considers many
techniques to efficiently perform the lookups involved in an SQL
query. ... Your goals are ... to learn the SQL syntax and indexing techniques to improve the plan if you see some inefficient operations.
Edit 1
You've clarified that your name
column is actually for user notes. In this case, I think you should consider the following (in addition to what has been mentioned already):
- Rename the column to something that correlates to its actual contents
- Remove the index from the column
- Do not use that column for searching, sorting, or any other operation other than just selecting it to display it (It'd be very rare if it needed to be used for anything else, IMHO.)
- Optionally, consider changing the column into a
text
type and you won't have to worry that much about user essays getting truncated without warning (unless the GUI has enforced the same input length limit to the user)
INDEX(cid, tid, name)
will speed up the query considerably.
However, that assumes name
is a civilized length, such as under 255. If you must have a longer name
, then this is the best you can do:
INDEX(cid, tid) -- (in either order)
No, a "prefix" index will not help: INDEX(cid, tid, name(99))
. Prefix indexes are useless for ORDER BY
.
All four columns have individual indexes.
Individual indexes are not the same as composite indexes. Sometimes they are better; usually they are not.
I provide more details in my Index Cookbook.