Slow INSERT query on 200m table

Posted 2019-08-04 01:18

We have the following table with about 200 million records:

CREATE TABLE IF NOT EXISTS `history` (
  `airline` char(2) NOT NULL,
  `org` char(3) NOT NULL,
  `dst` char(3) NOT NULL,
  `departat` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `arriveat` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `validon` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `price` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8
/*!50500 PARTITION BY RANGE  COLUMNS(org)
(PARTITION p0 VALUES LESS THAN ('AHI') ENGINE = MyISAM,
 PARTITION p1 VALUES LESS THAN ('ARO') ENGINE = MyISAM,
 ...
 PARTITION p39 VALUES LESS THAN ('WMA') ENGINE = MyISAM,
 PARTITION p40 VALUES LESS THAN (MAXVALUE) ENGINE = MyISAM) */;

--
-- Indexes for table `history`
--
ALTER TABLE `history`
 ADD KEY `tail` (`org`,`dst`,`departat`);

We frequently do bulk inserts, usually up to 1000 records in a single plain INSERT ... VALUES statement, without any decoration such as ON DUPLICATE KEY UPDATE (the index is not unique anyway).
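A typical statement looks roughly like this (the airline codes and values here are invented for illustration):

INSERT INTO history (airline, org, dst, departat, arriveat, price) VALUES
  ('AA', 'JFK', 'LAX', '2019-08-10 08:00:00', '2019-08-10 11:05:00', 350),
  ('AA', 'JFK', 'SFO', '2019-08-10 09:30:00', '2019-08-10 13:00:00', 410);
  -- ... up to ~1000 row tuples per statement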

Sometimes when I go to the server status page in phpMyAdmin, I see a bunch of INSERT statements waiting for each other, sometimes for up to 300-400 seconds. Nothing else seems to be going on on the server at that particular time. We have 32 GB of RAM and otherwise excellent performance.

How can we troubleshoot this issue? Thanks for any help.

Tags: mysql insert
3 Answers
beautiful°
#2 · 2019-08-04 02:05

Probably the first step is to do a couple of test runs with profiling enabled.

Usually you'd do something like:

SET LOCAL PROFILING=ON;
-- run your INSERT, like:
INSERT INTO yourtable (id) VALUES (1),(2),(3);

SHOW PROFILES;
+----------+------------+------------------------------------------------+
| Query_ID | Duration   | Query                                          |
+----------+------------+------------------------------------------------+
|     1012 | 6.25220000 | INSERT INTO yourtable (id) VALUES (1),(2),(3); |
+----------+------------+------------------------------------------------+

This tells you very basic information, like duration of the query (6.25 sec in this case). To get the actual details you need to pull up the profile for said query:

SHOW PROFILE FOR QUERY 1012;
+------------------------------+----------+
| Status                       | Duration |
+------------------------------+----------+
| starting                     | 0.004356 |
| checking permissions         | 0.000015 |
| Opening tables               | 6.202999 |
| System lock                  | 0.000017 |
| init                         | 0.000342 |
| update                       | 0.023951 |
| Waiting for query cache lock | 0.000008 |
| update                       | 0.000007 |
| end                          | 0.000011 |
| query end                    | 0.019984 |
| closing tables               | 0.000019 |
| freeing items                | 0.000304 |
| logging slow query           | 0.000006 |
| cleaning up                  | 0.000181 |
+------------------------------+----------+

You may notice that 'Opening tables' took very long. In this example, execution was delayed because another session had locked the table with LOCK TABLES. Further information about the thread states is available in the manual.
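Since MyISAM takes table-level locks, it is also worth looking at the process list and the lock counters while the INSERTs are stalled; blocked statements show up with State 'Locked' (or 'Waiting for table level lock' on newer versions):

SHOW FULL PROCESSLIST;
SHOW GLOBAL STATUS LIKE 'Table_locks_%';

If Table_locks_waited is high relative to Table_locks_immediate, table-lock contention is likely your bottleneck.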

女痞
#3 · 2019-08-04 02:20

Set the defaults of the timestamp fields to 0 and try again, e.g.:

departat timestamp NOT NULL DEFAULT 0,
arriveat timestamp NOT NULL DEFAULT 0,

TIMESTAMP stores its value internally as an integer (seconds since the Unix epoch); it does not keep the value the way DATETIME does. In your case you have set the default in DATETIME format on a TIMESTAMP column.
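Applied to your table, the change would look something like this (note that an ALTER on a 200-million-row MyISAM table rebuilds it, so expect it to take a while; whether DEFAULT 0 is accepted also depends on your sql_mode, e.g. NO_ZERO_DATE):

ALTER TABLE history
  MODIFY departat TIMESTAMP NOT NULL DEFAULT 0,
  MODIFY arriveat TIMESTAMP NOT NULL DEFAULT 0;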

戒情不戒烟
#4 · 2019-08-04 02:25

There are several things you can do to optimize bulk inserts. One of them is disabling the following checks, if you are sure your data doesn't contain duplicates (don't forget to set them back to 1 after the upload is complete):

SET AUTOCOMMIT = 0;
SET FOREIGN_KEY_CHECKS = 0;
SET UNIQUE_CHECKS = 0;
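When the bulk load is done, the counterpart looks like this (commit first, since autocommit was off):

COMMIT;
SET UNIQUE_CHECKS = 1;
SET FOREIGN_KEY_CHECKS = 1;
SET AUTOCOMMIT = 1;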

Also check that no other users are accessing the table during the load. You could also try InnoDB, which is generally reported to handle bulk inserts into a table that already contains data better than MyISAM does.
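If you do try InnoDB, the conversion itself is a single statement, though rebuilding a partitioned 200-million-row table will take a long time and blocks writes while it runs:

ALTER TABLE history ENGINE=InnoDB;
-- with 32 GB of RAM, raising innodb_buffer_pool_size afterwards is usually worthwhile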

You can also check your tables for fragmentation; sometimes the overhead the OS incurs when allocating free space on a fragmented drive is the cause of the delay.
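One way to check: the Data_free column of SHOW TABLE STATUS shows reclaimable space, and OPTIMIZE TABLE defragments a MyISAM table (it locks the table while it runs, so schedule it in a quiet window):

SHOW TABLE STATUS LIKE 'history';
OPTIMIZE TABLE history;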
