Having years of experience as a DBA, I do believe I know the answer to the question, but I figured it never hurts to check my bases.
Using SQL Server, assuming I have a table which has an index on columns A and B, and a second index on columns A, B, and C, would it be safe to drop the first index, as the second index basically would satisfy queries that would benefit from the first index?
Yes, this is a common optimization. Any query that would benefit from the index on A,B can also benefit just as well from the index on A,B,C.
In the MySQL community, there's even a tool to search your whole schema for redundant indexes: http://www.percona.com/doc/percona-toolkit/pt-duplicate-key-checker.html
The possible exception case would be if the index on A,B were more compact and used much more frequently, and you wanted to control which index was kept loaded in memory.
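
To see the claim above in action, here is a minimal sketch. It uses SQLite through Python's built-in sqlite3 module as a stand-in, not SQL Server, so the plan text will differ from what SQL Server shows. The table t, its columns, and the index names ix_ab and ix_abc are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server here; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A INTEGER, B INTEGER, C INTEGER, D TEXT)")
conn.execute("CREATE INDEX ix_ab ON t (A, B)")
conn.execute("CREATE INDEX ix_abc ON t (A, B, C)")

def plan(sql):
    """Return the optimizer's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)

query = "SELECT D FROM t WHERE A = 1 AND B = 2"

plan_before = plan(query)          # both indexes available
conn.execute("DROP INDEX ix_ab")   # drop the narrower index
plan_after = plan(query)           # only (A, B, C) remains

print(plan_before)
print(plan_after)
```

After the drop, the plan should still be an index search, now on ix_abc, rather than a full table scan. On SQL Server you would compare the actual execution plans in the same way before and after dropping the index.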
I typically find this kind of "almost" identical index on tables that contain historical data. If column C is a date or integer column, be careful. It is most likely used to satisfy the MAX function, as in WHERE tblA.C = MAX(tblB.C), which skips the table altogether and uses an index-only access path.
It depends, but the answer is often 'Yes, you could drop the index on (A,B)'.
The counter-case (where you would not drop the index on (A,B)) is when the index on (A,B) is a unique index that is enforcing a constraint; then you do not want to drop the index on (A,B). The index on (A,B,C) could also be unique, but the uniqueness is redundant because the (A,B) combination is unique because of the other index.
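
The uniqueness point can be checked with a small sketch. Again this assumes SQLite via Python's sqlite3 module rather than SQL Server, and the table t and index name ux are hypothetical. It shows that a unique index on (A,B,C) alone does not enforce uniqueness of (A,B).

```python
import sqlite3

def accepts_duplicate_ab(unique_cols):
    """Return True if two rows sharing (A, B) but differing in C can both be inserted."""
    # Assumption: SQLite in memory stands in for SQL Server; names are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (A INTEGER, B INTEGER, C INTEGER)")
    conn.execute(f"CREATE UNIQUE INDEX ux ON t ({unique_cols})")
    conn.execute("INSERT INTO t VALUES (1, 2, 10)")
    try:
        conn.execute("INSERT INTO t VALUES (1, 2, 20)")  # same (A, B), new C
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()

print(accepts_duplicate_ab("A, B"))     # unique (A, B): second row is rejected
print(accepts_duplicate_ab("A, B, C"))  # unique (A, B, C): second row is accepted
```

So if the (A,B) index is the only thing enforcing that constraint, dropping it changes which data the table will accept, not just performance.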
But in the absence of such unusual cases (for example, if both (A,B) and (A,B,C) allow duplicate entries), then the (A,B) index is logically redundant. However, if the column C is 'wide' (a CHAR(100) column perhaps), whereas A and B are small (say INTEGER), then the (A,B) index is more efficient than the (A,B,C) index because you can get more information read per page of the (A,B) index. So, even though (A,B) is redundant, it may be worth keeping. You also need to consider the volatility of the table; if the table seldom changes, the extra indexes don't matter much; if the table changes a lot, extra indexes slow up modifications to the table. Whether that's significant is difficult to guess; you probably need to do the performance measurements.
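
As a rough back-of-the-envelope illustration of the 'more information per page' argument, the sketch below assumes 4-byte INTEGER columns, a 100-byte CHAR(100) column, and SQL Server's 8 KB pages. It deliberately ignores the page header, per-row overhead, and the row locator, so the real numbers will be lower; only the ratio is the point.

```python
import math

PAGE_BYTES = 8192       # 8 KB page; header and row overhead ignored (assumption)
INT_BYTES = 4           # INTEGER
CHAR100_BYTES = 100     # CHAR(100)

def entries_per_page(key_bytes):
    """Approximate number of index entries that fit on one page."""
    return PAGE_BYTES // key_bytes

def leaf_pages(row_count, key_bytes):
    """Approximate number of leaf pages needed to hold row_count entries."""
    return math.ceil(row_count / entries_per_page(key_bytes))

narrow_key = 2 * INT_BYTES                  # (A, B)    -> 8 bytes
wide_key = 2 * INT_BYTES + CHAR100_BYTES    # (A, B, C) -> 108 bytes

print(entries_per_page(narrow_key), entries_per_page(wide_key))            # 1024 75
print(leaf_pages(1_000_000, narrow_key), leaf_pages(1_000_000, wide_key))  # 977 13334
```

Under these simplified assumptions, a scan of a million (A,B) entries touches roughly 14 times fewer pages than the same scan on (A,B,C). Whether that matters in practice still needs measuring, as noted above.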
Much of what I was thinking was already written by Jonathan in the previous answer: uniqueness and faster access. There is one other thing I think he missed.
If the first index is defined as A desc, B asc and the second as A asc, B asc, C asc, then deleting the first index isn't really the way to go, because the second one isn't a superset of the first, and your query cannot benefit from the second index if the ordering is as written in the first one.
With the first index, you can order by A desc, B asc (of course) and by A asc, B desc (the same index read backward), and you can also write a query that uses just the leading part of that index, like order by A desc.
But a query like order by A asc, B asc will not be 'covered' by the first index.
So to sum up, you can usually delete the first index, but that depends on your table configuration and your queries (and, of course, your indexes).
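
The ordering rule described above can be sketched as a small check. This is a simplified model, not how SQL Server's optimizer is implemented: it assumes an index can supply an ORDER BY only when the requested columns are a leading prefix of the index with either all directions matching (forward scan) or all directions flipped (backward scan). The function names are made up for illustration.

```python
def flip(direction):
    """Reverse a sort direction."""
    return "desc" if direction == "asc" else "asc"

def index_provides_order(index_cols, order_by):
    """True if a forward or backward scan of the index yields rows in order_by order.

    Simplified model (assumption): order_by must be a leading prefix of the
    index, with every direction equal (forward) or every direction flipped (backward).
    """
    if len(order_by) > len(index_cols):
        return False
    prefix = list(index_cols[:len(order_by)])
    forward = prefix == list(order_by)
    backward = [(col, flip(d)) for col, d in prefix] == list(order_by)
    return forward or backward

first = [("A", "desc"), ("B", "asc")]
second = [("A", "asc"), ("B", "asc"), ("C", "asc")]

print(index_provides_order(first, [("A", "desc"), ("B", "asc")]))   # True
print(index_provides_order(first, [("A", "asc"), ("B", "desc")]))   # True
print(index_provides_order(first, [("A", "desc")]))                 # True
print(index_provides_order(first, [("A", "asc"), ("B", "asc")]))    # False
print(index_provides_order(second, [("A", "desc"), ("B", "asc")]))  # False
```

The last line is the case that matters for the question: the (A asc, B asc, C asc) index cannot stand in for the ordering that (A desc, B asc) provides, so the two are not interchangeable.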
The first index covers queries that look up on A or A,B, and the second index can be used to cover queries that look up on A, A,B or A,B,C, which is clearly a superset of the first case.
If C is very wide, however, the index on A,B may still be useful, as it can satisfy certain queries with fewer reads.
E.g. if C was a char(800) column, the following query may benefit significantly from having the narrower index available.