Between statement missing indexes in certain cases

2019-07-15 12:21发布

问题:

I have a large database of numbers that I am looking for a match in between:

For example:

1112203488

My table looks like this:

| sATON | eATON | info |

I have two index's on sATON and eATON (named s and e)

So my SQL is as follows:

SELECT * 
FROM  `data2` 
FORCE INDEX ( s, e ) 
WHERE 1112203488
BETWEEN  `sATON` 
AND  `eATON`;

So usually when the index is used the query will take nearly zero time (0.02). However, sometimes it looks like the table stats in MySQL is taking the decision to do a full table scan instead, despite me forcing the use of the indexes in the SQL. This is a huge performance hit as it takes the query from 0.02 to 120 s.

Here are some samples that run fast (using indexes) :

67372289
134744072

And the slow ones:

1112203488
1348203839

If it helps the indexes are using BTREE.

回答1:

If any such query will return maximum one row, this means that the (sATON, eATON) ranges are not overlapping.

Therefore, and only if the ranges are not overlapping, you can use this query instead:

SELECT * 
FROM data2 
WHERE sATON = 
      ( SELECT MAX(sATON)
        FROM data2 
        WHERE sATON <= 1112203488
      )
  AND eATON = 
      ( SELECT MIN(eATON)
        FROM data2 
        WHERE eATON >= 1112203488
      )

or even this (that will need to use just one index, the sATON one):

SELECT * 
FROM data2 
WHERE sATON = 
      ( SELECT MAX(sATON)
        FROM data2 
        WHERE sATON <= 1112203488
      )
  AND eATON >= 1112203488


回答2:

As pointed out by ypercube, only one index can be used for your current query. From your execution plan I can see that it uses the s index, which means that it scans all rows where the value is greater than sAton. For high values, this will be almost all values, making it as inefficient as a full table scan.

I would go with ypercubes suggested solution, it should be able to use both indexes efficiently.

Using bind variables serves several purposes:

  1. Eliminates the risk of SQL injection
  2. Allows the database to reuse compiled SQL statements
  3. Reduces the number of different questions, making better use of the datbase cache for recently used questions

I've mostly worked with Oracle, so I'm not sure how valid 2 and 3 are for mysql.

If you want to know how to use them, just google for: mysql bind variable example +your programming language