mysql: multiple indexes advice

2019-08-01 12:28发布

I have a single table where I need to query based on 4 columns a,b,c,d The most common query will be a select based on all 4 columns at the same time, however I need to be able to search quickly for each of the columns taken separately, or also combinations of them (e.g. a&b, a&d, b&c&d and so on). Shall I create an index for every combination? or it's better to have only an index for a&b&c&d and one for a, one for b, one for c, one for d? in this last case a query that matches only a&b for example will be sped up because both a and be have an index?

2条回答
够拽才男人
2楼-- · 2019-08-01 12:41

If you want to satisfy all the combinations with an index, you need the following:

(a, b, c, d)
(a, b, d)
(a, c, d)
(a, d)
(b, c, d)
(b, d)
(c, d)
d

You don't need other combinations because any prefix of an index is also an index. The first index will be used for queries that test just a, a&b, a&b&c, so you don't need indexes for those combinations.

Whether you really need all these indexes depends on how much data you have. It's possible that just having indexes on each column will narrow down the search sufficiently that you don't need indexes on the combinations. The only real way to tell is by benchmarking the performance of your applications. The indexes take up disk space and memory, so trying to create all possible indexes can cause problems of its own; you need to determine if the need is strong enough.

查看更多
【Aperson】
3楼-- · 2019-08-01 13:00

One thing to note is that a "range" is only useful as the last item in an index:

WHERE x=2 AND y>5   -- INDEX(x,y) is useful; INDEX(y,x) only uses `y`
WHERE x=2 AND y BETWEEN 11 AND 22 -- ditto
WHERE x=2 AND s LIKE 'foo%'  -- ditto

Another thing: "flags" (true/false, etc) are useless to index by themselves. They can be somewhat useful in combination:

WHERE published=1 AND ...

Also, order matters in the INDEX, but not in the WHERE: Suppose you have INDEX(a,b):

WHERE a=1 AND b=2 -- good index
WHERE b=2 AND a=1 -- equally good
WHERE a=1         -- the index is good
WHERE b=2         -- the index is useless

If some column is always a range (such as a date), it gets messier. For optimal indexing two indexes are needed here:

WHERE d BETWEEN ...  -- needs INDEX(d)
WHERE a=1 AND d BETWEEN ...  -- needs INDEX(a,d)

So, I might do these:

  • Make all 2-column combinations of a,b,c,d -- This would be 6 combinations if nothing is involved in "ranges". I would be sure to vary which col starts the indexes: ab, bc, cd, da, ac, db
  • Turn on the slowlog to see what is not being well indexed.
  • Log the actual combinations that people use. Some combinations will be very rarely used. Get rid of the indexes that are useless.

More on understanding index creation.

查看更多
登录 后发表回答