What mysql settings affect the speed of LOAD DATA

2019-03-29 10:23发布

Let me set up the situation. We are trying to insert a modestly high number of rows (roughly 10-20M a day) into a MyISAM table that is modestly wide:

+--------------+--------------+------+-----+---------+-------+
| Field        | Type         | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| blah1        | varchar(255) | NO   | PRI |         |       | 
| blah2        | varchar(255) | NO   | PRI |         |       | 
| blah3        | varchar(5)   | NO   | PRI |         |       | 
| blah4        | varchar(5)   | NO   | PRI |         |       | 
| blah5        | varchar(2)   | NO   | PRI |         |       | 
| blah6        | varchar(2)   | NO   | PRI |         |       | 
| blah7        | date         | NO   | PRI |         |       | 
| blah8        | smallint(6)  | NO   | PRI |         |       | 
| blah9        | varchar(255) | NO   | PRI |         |       | 
| blah10       | bigint(20)   | YES  |     | NULL    |       | 
+--------------+--------------+------+-----+---------+-------+

The only index besides that whopping primary key is on the blah7, the date field. We are using LOAD DATA INFILE and seeing what strikes me as pretty awful performance, around 2 hours to load the data. I was led to believe that LOAD DATA INFILE was orders of magnitude faster than that.

Interestingly, we have some less fat tables (5-6 fields) that we also use LOAD DATA INFILE to batch data into and we see much better performance on those. The number of records is quite a bit smaller, which leads me to think that we are running up against a buffer size limit when we load the large table, and are having to go to disk (and really, what else but going to disk would explain such slow load times?).

...which brings me to my question. What my.cnf settings are most important when dealing with LOAD DATA INFILE commands?

3条回答
爱情/是我丢掉的垃圾
2楼-- · 2019-03-29 10:57

I don't know about settings, but my money is on that composite primary key as to why you are having such poor performance.

查看更多
Summer. ? 凉城
3楼-- · 2019-03-29 11:03

Inserting into indexes in general kills performance. You may be better off removing the index before inserting data and re-indexing after insertion.

From http://forum.percona.com/s/m/983/:

Normally MySQL is rather fast loading data in MyISAM table, but there is exception, which is when it can't rebuild indexes by sort but builds them row by row instead. It can be happening due to wrong configuration (ie too small myisam_max_sort_file_size or myisam_max_extra_sort_file_size) or it could be just lack of optimization, if you're having large (does not fit in memory) PRIMARY or UNIQUE indexes.

Also check out http://www.mysqlperformanceblog.com/2007/05/24/predicting-how-long-data-load-would-take/ and http://www.linuxtopia.org/online_books/database_guides/mysql_5.1_database_reference_guide/insert-speed.html.

查看更多
Bombasti
4楼-- · 2019-03-29 11:08

If your Table is MyISam and if data is being added to a non empty table, bulk_insert_buffer_size matters

MyISAM uses a special tree-like cache to make bulk inserts faster for LOAD DATA INFILE when adding data to nonempty tables. The BULK_INSERT_BUFFER_SIZE variable limits the size of the cache tree in bytes per thread. Setting it to 0 disables this optimization. The default value is 8MB. The maximum value is 4GB.

If data is being added to a non-empty table, tuning the bulk_insert_buffer_size variable can make data insertion faster. Normally, it shows the improvement when the data to be inserted is more than 10k rows. But it is hard to say what is the right value, so, trail and try with incremental buffer size values.

Also, performance improvement will be more obvious if the following variables are also set:

  • MYISAM_SORT_BUFFER_SIZE
  • KEY_BUFFER_SIZE

You should also consider disabling the indexes before loading the data using the following alter table command:

alter table t disable keys;
查看更多
登录 后发表回答