Mysql not using DATETIME index when table has othe

2020-07-23 05:33发布

问题:

I need some help figuring this out. I'm trying to get Mysql to use an index on a DATETIME field.

Mysql decides not to use the index if there's other (not used) fields in the table. Consider the two cases below:

A simple table with 2 fields works fine:

DROP TABLE IF EXISTS datetime_index_test;
CREATE TABLE  datetime_index_test (
id INT UNSIGNED NOT NULL AUTO_INCREMENT ,
created DATETIME NOT NULL ,
PRIMARY KEY (id) ,
INDEX (created)
) ENGINE = InnoDB ;

INSERT INTO datetime_index_test (created) VALUES
('2011-04-06 00:00:00'),
('2011-04-06 01:00:00'),
('2011-04-06 02:00:00'),
('2011-04-06 03:00:00'),
('2011-04-06 04:00:00'),
('2011-04-06 05:00:00'),
('2011-04-06 06:00:00'),
('2011-04-06 00:00:00');

EXPLAIN SELECT * FROM datetime_index_test
WHERE created <= '2011-04-06 04:00:00';

+----+-------------+---------------------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table               | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+----+-------------+---------------------+-------+---------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | datetime_index_test | range | created       | created | 4       | NULL |    4 | Using where; Using index |
+----+-------------+---------------------+-------+---------------+---------+---------+------+------+--------------------------+

A simple table with 3 fields, does not works fine:

DROP TABLE IF EXISTS datetime_index_test;
CREATE TABLE  datetime_index_test (
id INT UNSIGNED NOT NULL AUTO_INCREMENT ,
created DATETIME NOT NULL ,
user int(10) unsigned DEFAULT 0,
PRIMARY KEY (id) ,
INDEX (created)
) ENGINE = InnoDB ;

INSERT INTO datetime_index_test (created) VALUES
('2011-04-06 00:00:00'),
('2011-04-06 01:00:00'),
('2011-04-06 02:00:00'),
('2011-04-06 03:00:00'),
('2011-04-06 04:00:00'),
('2011-04-06 05:00:00'),
('2011-04-06 06:00:00'),
('2011-04-06 00:00:00');

EXPLAIN SELECT * FROM datetime_index_test
WHERE created <= '2011-04-06 04:00:00';

+----+-------------+---------------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table               | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+---------------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | datetime_index_test | ALL  | created       | NULL | NULL    | NULL |    8 | Using where |
+----+-------------+---------------------+------+---------------+------+---------+------+------+-------------+

Finally, my question; Can anyone explain to me why Mysql decides not to use the index?

回答1:

This is due to what I call the 5% rule based on key population (tuple cardinality).

If you index a table where lopsided cardinality exist, the MySQL Query Optimizer wiill always choose the path of least resistance.

EXAMPLE : If a table has a gender column, cardinality is two, M and F.

What is you index such a gender column ??? You essentailly get two giant linked lists.

If you load one million rows into a table with a gender column, you may get 50% M and 50% F.

An index is rendered useless during query optimization if the cardinality of a key combo (key population as I phrased it) is more than 5% of the total table count.

Now, with regard to your example, why the two different EXPLAIN plans ??? My guess is the MySQL Query Optimizer and InnoDB as a tag team.

In the first CREATE TABLE, the table and the indexes are about the same size though small, so it decided in favor of the index by doing an index scan not a full table scan. Keep in mind that non-unique indexes carry around each row's internal primary key (RowID) in its index entries, thus making the indexes almost the same size as the table itself.

In the second CREATE TABLE, because of the introduction of another column, user, you now make the Query Optimizer see a completely different scenario: The table is now bigger that the indexes. Hence, the Query Optimizer became more strict in its interpretation of how to use available indexes. It went to the 5% rule I mentioned before. That rule failed miserably and the Query Optimizer decided in favor of a full table scan.