MySQL index for MIN and MAX

2020-06-17 06:42发布

问题:

Could anyone clarify this point from the official MySQL documentation

Indexes are used ... To find the MIN() or MAX() value for a specific indexed column key_col. This is optimized by a preprocessor that checks whether you are using WHERE key_part_N = constant on all key parts that occur before key_col in the index. In this case, MySQL does a single key lookup for each MIN() or MAX() expression and replaces it with a constant. If all expressions are replaced with constants, the query returns at once. For example: SELECT MIN(key_part2),MAX(key_part2) FROM tbl_name WHERE key_part1=10;

So in their exampe they're trying to find the min and max values of the col called key_col. Are they using a composite index key_part1_key_part2? Does key_part2 part of the index represent key_col column?

I absolutely don't understand what they're trying to say of replacing the expressions by the constant. Could you help me to work out this point?

UPDATE: The question is not about how the WHERE statement or indexes work on the "highest levels", if I can say that. I was confused by this processor's expression replacing thing in this certain point about MIN and MAX.

回答1:

SELECT MIN(b), MAX(b) FROM tbl WHERE a = 12;

loves

INDEX(a, b)

both columns, in that order.

The query looks in the index for a = 12, grabs the first (a,b) pair to get MIN(b) and grabs the last pair to get MAX(b).

The statement about "replacing with a constant" is confusing because it is going too deep into the details of how it first figures out how to perform the query (which happens to get the min and max), then proceeds to execute what is left of the query (nothing is left).

More generally, the optimal index is usually one that starts with all the WHERE columns compared to constants with =. After that it gets complex, so let me give another tip:

A "covering" index is one that has all the columns mentioned in the SELECT (a and b in my example).

Sorry, I don't seem to be clearer than the manual.



回答2:

In their example they are not trying to find the min and max values of the column called key_col. They are trying to find the min and max of the second key in a multi-column key. The issue is further complicated since the query is limiting its search to the subset of the multi-column key set that is defined by having the key_part1 = 10.

10, 2, 3
10, 9, 5
10, 7, 7
10, 3, 1
10, 2, 5
11, 3, 0
11, 0, 3

So in this example the query selects only the rows where the first column is 10. Then it finds that among the rows with 10 in the first column the min for the second column is 2 and the max is 9. Notice that 0 in the second column is not chosen as the min because the first column does not match on 10. Hope that helps.