MySQL not picking correct row count from index

I have a table:

CREATE TABLE `test_series_analysis_data` (
  `email` varchar(255) NOT NULL,
  `mappingId` int(11) NOT NULL,
  `packageId` varchar(255) NOT NULL,
  `sectionName` varchar(255) NOT NULL,
  `createdAt` datetime(3) DEFAULT NULL,
  `marksObtained` float NOT NULL,
  `updatedAt` datetime DEFAULT NULL,
  `testMetaData` longtext,
  PRIMARY KEY (`email`,`mappingId`,`packageId`,`sectionName`),
  KEY `rank_index` (`mappingId`,`packageId`,`sectionName`,`marksObtained`),
  KEY `mapping_package` (`mappingId`,`packageId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Here is the EXPLAIN output for the query:

explain select rank 
from (
   select email, @i:=@i+1 as rank 
   from test_series_analysis_data ta 
   join (select @i:=0) va 
   where mappingId = ?1 
   and packageId = ?2 
   and sectionName = ?3 
   order by marksObtained desc
) as inter 
where inter.email = ?4;

+----+-------------+------------+------------+--------+----------------------------+-------------+---------+-------+-------+----------+--------------------------+
| id | select_type | table      | partitions | type   | possible_keys              | key         | key_len | ref   | rows  | filtered | Extra                    |
+----+-------------+------------+------------+--------+----------------------------+-------------+---------+-------+-------+----------+--------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ref    | <auto_key0>                | <auto_key0> | 767     | const |    10 |   100.00 | NULL                     |
|  2 | DERIVED     | <derived3> | NULL       | system | NULL                       | NULL        | NULL    | NULL  |     1 |   100.00 | Using filesort           |
|  2 | DERIVED     | ta         | NULL       | ref    | rank_index,mapping_package | rank_index  | 4       | const | 20160 |     1.00 | Using where; Using index |
|  3 | DERIVED     | NULL       | NULL       | NULL   | NULL                       | NULL        | NULL    | NULL  |  NULL |     NULL | No tables used           |
+----+-------------+------------+------------+--------+----------------------------+-------------+---------+-------+-------+----------+--------------------------+

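The query optimizer could have used either index, but rank_index is a covering index, so it got picked. What surprised me was the output of the following query: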

explain select rank 
from ( 
  select email, @i:=@i+1 as rank 
  from test_series_analysis_data ta use index (mapping_package) 
  join (select @i:=0) va 
  where mappingId = ?1 
  and packageId = ?2 
  and sectionName = ?3 
  order by marksObtained desc
) as inter 
where inter.email = ?4;

+----+-------------+------------+------------+--------+-----------------+-----------------+---------+-------+-------+----------+-----------------------+
| id | select_type | table      | partitions | type   | possible_keys   | key             | key_len | ref   | rows  | filtered | Extra                 |
+----+-------------+------------+------------+--------+-----------------+-----------------+---------+-------+-------+----------+-----------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ref    | <auto_key0>     | <auto_key0>     | 767     | const |    10 |   100.00 | NULL                  |
|  2 | DERIVED     | <derived3> | NULL       | system | NULL            | NULL            | NULL    | NULL  |     1 |   100.00 | Using filesort        |
|  2 | DERIVED     | ta         | NULL       | ref    | mapping_package | mapping_package | 4       | const | 19434 |     1.00 | Using index condition |
|  3 | DERIVED     | NULL       | NULL       | NULL   | NULL            | NULL            | NULL    | NULL  |  NULL |     NULL | No tables used        |
+----+-------------+------------+------------+--------+-----------------+-----------------+---------+-------+-------+----------+-----------------------+

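Why is the rows estimate smaller (19434 < 20160) when the index used is mapping_package? rank_index can filter more precisely for what is required, so the row count should be smaller with rank_index.

So, does this mean that mapping_package is a better index than rank_index for the given query?

Does the fact that sectionName is a varchar have any effect, so that both indexes should give similar performance?

I am also assuming that Using index condition selects only a few rows from the index and then scans some more, whereas with Using where; Using index the optimizer only has to read the index, not the table, to get the rows, and then filters some data. Then why is Using where missing while using rank_index?

Also, why is the key_len of mapping_package 4 when there are only two columns in the index?

Help appreciated.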

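(19434 < 20160): both of those numbers are estimates. It is unusual for them to be that close. I'll bet that if you ran ANALYZE TABLE, both would change, possibly flipping the inequality.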
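A minimal sketch of that check: refresh the statistics, then re-run EXPLAIN against each index and compare the rows column. The literal values ('pkg-1', 'section-1', and mappingId 1) are made-up stand-ins for the ?1..?3 parameters, and only the inner SELECT is used to keep the output short.

```sql
-- Recompute index statistics. InnoDB samples index pages,
-- so the numbers afterwards are still estimates.
ANALYZE TABLE test_series_analysis_data;

-- Let the optimizer choose (it picked rank_index in the question).
EXPLAIN
SELECT email
FROM test_series_analysis_data
WHERE mappingId = 1
  AND packageId = 'pkg-1'
  AND sectionName = 'section-1'
ORDER BY marksObtained DESC;

-- Same query, restricted to the other index, for comparison.
EXPLAIN
SELECT email
FROM test_series_analysis_data USE INDEX (mapping_package)
WHERE mappingId = 1
  AND packageId = 'pkg-1'
  AND sectionName = 'section-1'
ORDER BY marksObtained DESC;
```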

Notice something else: Using where; Using index versus Using index condition.

But first, let me remind you that in InnoDB the PRIMARY KEY columns are tacked onto every secondary key. So, effectively, you have

KEY `rank_index`      (`mappingId`,`packageId`,`sectionName`,`marksObtained`,`email`)
KEY `mapping_package` (`mappingId`,`packageId`,`email`,`sectionName`)

Now let's decide what the optimal index should be for: ... WHERE mappingId = ?1 AND packageId = ?2 AND sectionName = ?3 ORDER BY marksObtained DESC

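First, the = parts of the WHERE: mappingId, packageId, sectionName, in any order;
Then the ORDER BY column(s): marksObtained
Bonus: finally, if email (the only other column mentioned anywhere in the SELECT) is in the key, the index will be "covering".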
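The recipe above can be written out as an explicit index definition. This is only an illustration: the name rank_covering is hypothetical, and on InnoDB the existing rank_index already behaves this way because email is appended from the PRIMARY KEY, so there is probably no reason to actually create it.

```sql
-- Hypothetical index spelling out the three steps of the recipe.
ALTER TABLE test_series_analysis_data
  ADD INDEX rank_covering (
    mappingId, packageId, sectionName,  -- 1. columns compared with "=" in WHERE
    marksObtained,                      -- 2. the ORDER BY column
    email                               -- 3. bonus: the remaining selected column
  );
```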

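This says that rank_index is "perfect", and the other index is not as good. Alas, EXPLAIN does not say that clearly.

You, too, could have figured this out; all you needed was to study my blog: http://mysql.rjweb.org/doc.php/index_cookbook_mysql (Sorry, it's getting late, and I am getting cheeky.)

Other tips: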

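Don't blindly use (255). When a tmp table is needed, this can make the tmp table bigger and hence less efficient. Change the limit to something reasonable. Or...
If this is a huge table, you really should "normalize" the strings, perhaps replacing them with a 2-byte SMALLINT UNSIGNED. This would improve performance in other ways, such as reducing costly I/O. (OK, 20K rows is pretty small, so this may not apply.)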
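A rough sketch of what that normalization could look like for sectionName. The table name section_names, the column name sectionId, and the VARCHAR(100) limit are all assumptions made for illustration; the real limit should come from the data.

```sql
-- Hypothetical lookup table: one row per distinct section name.
CREATE TABLE section_names (
  sectionId   SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- 2-byte surrogate id
  sectionName VARCHAR(100) NOT NULL,                      -- assumed limit, not (255)
  PRIMARY KEY (sectionId),
  UNIQUE KEY (sectionName)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- The big table would then store sectionId instead of the string,
-- and a query would first translate the name into its id:
SELECT sectionId
FROM section_names
WHERE sectionName = 'section-1';
```

The trade-off is an extra lookup (or join) in exchange for much narrower rows and index entries in the main table.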

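Why key_len 4? That implies that only one column was used, namely the 4-byte INT mappingId. I would have expected it to use the second column, too. So, I am stumped. EXPLAIN FORMAT=JSON SELECT ... may provide more clues.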
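For reference, the JSON form of the second EXPLAIN might be requested as below. The literals stand in for ?1..?4 and are invented for the example; `rank` is backticked only because RANK is a reserved word in MySQL 8.0. In the output, fields such as used_key_parts and key_length should show which index columns were actually used.

```sql
EXPLAIN FORMAT=JSON
SELECT `rank`
FROM (
   SELECT email, @i:=@i+1 AS `rank`
   FROM test_series_analysis_data ta USE INDEX (mapping_package)
   JOIN (SELECT @i:=0) va
   WHERE mappingId = 1                -- stand-in for ?1
     AND packageId = 'pkg-1'          -- stand-in for ?2
     AND sectionName = 'section-1'    -- stand-in for ?3
   ORDER BY marksObtained DESC
) AS inter
WHERE inter.email = 'user@example.com';  -- stand-in for ?4
```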