MySQL index for normal column and full text column

2019-08-09 03:59发布

I'm trying to speed up a query for the below:

My table has around 4 million records.

EXPLAIN SELECT  * FROM chrecords WHERE  company_number = 'test'  OR MATCH (company_name,registered_office_address_address_line_1,registered_office_address_address_line_2) AGAINST('test') LIMIT 0, 10;
+------+-------------+-----------+------+------------------+------+---------+------+---------+-------------+
| id   | select_type | table     | type | possible_keys    | key  | key_len | ref  | rows    | Extra       |
+------+-------------+-----------+------+------------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | chrecords | ALL  | i_company_number | NULL | NULL    | NULL | 2208348 | Using where |
+------+-------------+-----------+------+------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

I've created two indexes using the below:

ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_company_number` (`company_number`);

ALTER TABLE `chapp`.`chrecords`ADD FULLTEXT(
    `company_name`,
    `registered_office_address_address_line_1`,
    `registered_office_address_address_line_2`
);

How can "combine" the two indexes however? As the above query takes 15+ seconds to execute (only using one index).

The entire table definition:

CREATE TABLE `chapp`.`chrecords` (
  `id` INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  `company_name` VARCHAR(100) NULL,
  `company_number` VARCHAR(100) NULL,
  `registered_office_care_of` VARCHAR(100) NULL,
  `registered_office_po_box` VARCHAR(100) NULL,
  `registered_office_address_address_line_1` VARCHAR(100) NULL,
  `registered_office_address_address_line_2` VARCHAR(100) NULL,
  `registered_office_locality` VARCHAR(100) NULL,
  `registered_office_region` VARCHAR(100) NULL,
  `registered_office_country` VARCHAR(100) NULL,
  `registered_office_postal_code` VARCHAR(100) NULL
  );

ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_company_name` (`company_name`);
ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_company_number` (`company_number`);
ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_registered_office_address_address_line_1` (`registered_office_address_address_line_1`);
ALTER TABLE `chapp`.`chrecords` ADD INDEX `i_registered_office_address_address_line_2` (`registered_office_address_address_line_2`);

ALTER TABLE `chapp`.`chrecords`ADD FULLTEXT(
    `company_name`,
    `registered_office_address_address_line_1`,
    `registered_office_address_address_line_2`
);

2条回答
地球回转人心会变
2楼-- · 2019-08-09 04:32

Try using a UNION rather than OR.

  SELECT *
    FROM (
       SELECT  * 
        FROM chrecords 
        WHERE company_number = 'test'
    ) a
    UNION (
       SELECT * 
         FROM cbrecords
        WHERE MATCH (company_name, 
                     registered_office_address_address_line_1, 
                     registered_office_address_address_line_2)
              AGAINST('test') 
        LIMIT 0, 10
     ) b

If this helps, it's because MySQL struggles to use more than one index in a single subquery. This gives the query planner two queries.

You can run EXPLAIN on each of the subqueries separately to understand their performance. UNION just puts their results together and eliminates duplicates. If you want to keep the duplicates, do UNION ALL.

Please notice that lots of single-column indexes on MySQL tables are generally harmful to performance. You should refrain from creating indexes unless they're constructed to help specific queries.

查看更多
放我归山
3楼-- · 2019-08-09 04:33
    (
        SELECT  *
            FROM  chrecords
            WHERE  company_number = 'test' 
            ORDER BY something
            LIMIT 10
    )
    UNION DISTINCT
    (
        SELECT  *
            FROM  cbrecords
            WHERE  MATCH (company_name, registered_office_address_address_line_1,
                                        registered_office_address_address_line_2)
                   AGAINST('test')
            ORDER BY something
            LIMIT 10
    ) 
    ORDER BY something
    LIMIT 10

Notes:

  • No need for an outer SELECT
  • Explicitly say DISTINCT (the default) or ALL (which is faster) so that you will know that you thought about whether dedupping was needed, versus speed.
  • A LIMIT without an ORDER BY is not very meaningful
  • However, if you just want some rows to look at, you can remove the ORDER BYs.
  • Yes the ORDER BY and LIMIT need to be repeated outside so that you can get the ordering correct and limit to 10.

If you need an OFFSET, the the inside need a full count, say LIMIT 50 for 5 pages, the n the outside needs to skip to the 5th page: LIMIT 40,10.

查看更多
登录 后发表回答