How to improve INSERT INTO … SELECT locking behavior

Posted 2019-01-17 09:47

Question:

In our production database, we run the following batch query (pseudo-code SQL) every hour:

INSERT INTO TemporaryTable
    (SELECT * FROM HighlyContentiousTableInInnoDb
     WHERE allKindsOfComplexConditions are true)

Now this query itself does not need to be fast, but I noticed it was locking up HighlyContentiousTableInInnoDb, even though it was only reading from it. This made some other very simple queries take ~25 seconds (which is how long the INSERT … SELECT itself takes).

Then I discovered that InnoDB tables in such a case are actually locked by a SELECT! http://www.mysqlperformanceblog.com/2006/07/12/insert-into-select-performance-with-innodb-tables/

But I don't really like the solution in the article of selecting into an OUTFILE; it seems like a hack (temporary files on the filesystem seem sucky). Any other ideas? Is there a way to make a full copy of an InnoDB table without locking it in this way during the copy? Then I could just copy the HighlyContentiousTable to another table and do the query there.

Answer 1:

The answer to this question is much easier now: use row-based replication and the READ COMMITTED isolation level.

The locking you were experiencing disappears.

Longer explanation: http://harrison-fisk.blogspot.com/2009/02/my-favorite-new-feature-of-mysql-51.html



Answer 2:

You can set the binlog format like this:

SET GLOBAL binlog_format = 'ROW';

Edit my.cnf if you want to make it permanent:

[mysqld]
binlog_format=ROW

Set the isolation level for the current session before you run your query:

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
INSERT INTO t1 SELECT ....;

If this doesn't help, you should try setting the isolation level server-wide and not only for the current session:

SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

Edit my.cnf if you want to make it permanent:

[mysqld]
transaction-isolation = READ-UNCOMMITTED

You can change READ-UNCOMMITTED to READ-COMMITTED, which is a better isolation level because it does not allow dirty reads.
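
To check that the settings above actually took effect, something like the following can be run in the session that will execute the batch query. This is only a sketch: the name of the isolation-level variable differs between MySQL versions (tx_isolation in older releases, transaction_isolation in newer ones), so the pattern match below is just a way to avoid depending on either name.

```sql
-- Binary log format in effect for the current session.
SHOW VARIABLES LIKE 'binlog_format';

-- Isolation level for the current session. The pattern matches both
-- tx_isolation (older releases) and transaction_isolation (newer ones).
SHOW VARIABLES LIKE '%isolation%';
```

Keep in mind that SET GLOBAL only changes the value picked up by sessions that connect afterwards, so a connection that was already open needs to reconnect (or use SET SESSION) before it sees the new value.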



Answer 3:

Everyone using InnoDB tables has probably got used to the fact that InnoDB performs non-locking reads, meaning that unless you use modifiers such as LOCK IN SHARE MODE or FOR UPDATE, SELECT statements do not lock any rows while running.

This is generally correct, but there is a notable exception: INSERT INTO table1 SELECT * FROM table2. This statement performs a locking read (shared locks) on table2. The same applies to similar statements with a WHERE clause and joins. What matters is that the table being read is InnoDB, even if the writes go to a MyISAM table.
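
A two-session sketch of the exception described above, assuming the default REPEATABLE READ isolation level with statement-based binary logging. The tables table1 and table2 come from the text; the id and val columns are hypothetical and only there for illustration.

```sql
-- Session A: the batch copy. Inside an explicit transaction, the shared
-- locks taken on the rows read from table2 are held until COMMIT.
START TRANSACTION;
INSERT INTO table1 SELECT * FROM table2;

-- Session B, meanwhile: a plain SELECT is a non-locking read and
-- returns immediately...
SELECT * FROM table2 WHERE id = 1;
-- ...but a write to a row that session A has read has to wait.
UPDATE table2 SET val = val + 1 WHERE id = 1;   -- blocks here

-- Session A: ending the transaction releases the locks, and
-- session B's UPDATE can proceed.
COMMIT;
```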

So why was this done, given that it is pretty bad for MySQL performance and concurrency?

The reason is replication. Before MySQL 5.1, replication is statement based, which means that statements replayed on the slave should have the same effect as they had on the master. If InnoDB did not lock the rows in the source table, another transaction could modify a row and commit before the transaction running the INSERT … SELECT statement. That transaction would then be applied on the slave before the INSERT … SELECT statement, possibly producing different data than on the master. Locking the rows in the source table while reading them protects against this: if another transaction modifies rows before INSERT … SELECT has had a chance to access them, the modification is applied in the same order on the slave. If a transaction tries to modify a row after it has been accessed, and so locked, by INSERT … SELECT, it has to wait until the statement completes, which ensures it is executed on the slave in the proper order. Pretty complicated? All you need to know is that this had to be done for replication to work correctly in MySQL before 5.1.

In MySQL 5.1 this, as well as a few other problems, should be solved by row-based replication. However, I have yet to give it real stress tests to see how well it performs :)

One more thing to take into account: INSERT … SELECT actually performs its read in locking mode, so it partially bypasses versioning and retrieves the latest committed row. So even if you're operating in REPEATABLE-READ mode, this read behaves as in READ-COMMITTED mode, potentially giving a different result than a plain SELECT would. This, by the way, applies to SELECT … LOCK IN SHARE MODE and SELECT … FOR UPDATE as well.
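
The difference between a consistent read and a locking read can be seen with a sketch like this one, run under REPEATABLE READ. The table t and its columns id and val are hypothetical and exist only for illustration.

```sql
-- Session A
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT val FROM t WHERE id = 1;   -- first read establishes the snapshot

-- Session B (autocommit): change the row; the change commits at once.
UPDATE t SET val = val + 1 WHERE id = 1;

-- Session A again
SELECT val FROM t WHERE id = 1;                      -- consistent read: still the snapshot value
SELECT val FROM t WHERE id = 1 LOCK IN SHARE MODE;   -- locking read: latest committed value
COMMIT;
```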

One may ask: what if I'm not using replication and have my binary log disabled? If replication is not used, you can enable the innodb_locks_unsafe_for_binlog option, which relaxes the locks InnoDB sets during statement execution and generally gives better concurrency. However, as the name says, it makes locks unsafe for replication and point-in-time recovery, so use the innodb_locks_unsafe_for_binlog option with caution.

Note that disabling the binary log is not enough to trigger relaxed locks. You have to set innodb_locks_unsafe_for_binlog=1 as well. This is done so that enabling the binary log does not cause unexpected changes in locking behavior and performance problems. You can also sometimes use this option with replication, if you really know what you're doing. I would not recommend it unless it is really needed, as you cannot know which other locks will be relaxed in future versions and how that would affect your replication.



Answer 4:

You could probably use the CREATE VIEW command (see CREATE VIEW Syntax). For example:

CREATE VIEW temp AS SELECT * FROM HighlyContentiousTableInInnoDb WHERE allKindsOfComplexConditions are true

After that, you could use your INSERT statement with this view. Something like this:

INSERT INTO TemporaryTable (SELECT * FROM temp)

This is only a suggestion.



Answer 5:

Disclaimer: I'm not very experienced with databases, and I'm not sure if this idea is workable. Please correct me if it's not.

How about setting up a secondary, equivalent table HighlyContentiousTableInInnoDb2, and creating AFTER INSERT (and similar) triggers on the first table that keep the new table updated with the same data? You should then be able to lock HighlyContentiousTableInInnoDb2 and only slow down the triggers of the primary table, instead of all queries.

Potential problems:

  • 2 x data stored
  • Additional work for all inserts, updates and deletes
  • Might not be transactionally sound
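
A rough sketch of the trigger idea, for illustration only. The columns id (assumed to be the primary key) and payload are hypothetical, as are the trigger names; a real table would need every column listed, and the shadow table would also have to be back-filled with the existing rows.

```sql
-- Shadow table with the same structure as the original.
CREATE TABLE HighlyContentiousTableInInnoDb2 LIKE HighlyContentiousTableInInnoDb;

-- Mirror inserts.
CREATE TRIGGER hct_after_insert AFTER INSERT ON HighlyContentiousTableInInnoDb
FOR EACH ROW
  INSERT INTO HighlyContentiousTableInInnoDb2 (id, payload)
  VALUES (NEW.id, NEW.payload);

-- Mirror updates, including a change of the key itself.
CREATE TRIGGER hct_after_update AFTER UPDATE ON HighlyContentiousTableInInnoDb
FOR EACH ROW
  UPDATE HighlyContentiousTableInInnoDb2
  SET id = NEW.id, payload = NEW.payload
  WHERE id = OLD.id;

-- Mirror deletes.
CREATE TRIGGER hct_after_delete AFTER DELETE ON HighlyContentiousTableInInnoDb
FOR EACH ROW
  DELETE FROM HighlyContentiousTableInInnoDb2 WHERE id = OLD.id;
```

Because a trigger runs as part of the statement that fired it, a write to the original table would still have to wait whenever the corresponding shadow rows are locked, which is worth weighing against the problems listed above.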


Answer 6:

If you can tolerate some anomalies, you can change the isolation level to the least strict one, READ UNCOMMITTED. During that time, however, others can still read from your destination table, unless you lock the destination table manually (I assume MySQL offers this functionality?).

Alternatively, you can use READ COMMITTED, which should not lock the source table either. It does, however, lock the inserted rows in the destination table until commit.

I would choose the second one.



Answer 7:

I'm not familiar with MySQL, but hopefully there is an equivalent to the Snapshot and Read Committed Snapshot transaction isolation levels in SQL Server. Using either of these should solve your problem.



Answer 8:

The reason for the lock (a read lock) is to keep your reading transaction from reading "dirty" data that a parallel transaction might currently be writing. Most DBMSs let users set and release read and write locks manually. This might be interesting for you if reading dirty data is not a problem in your case.

I think there is no safe way to read from a table without any locks in a DBMS with multiple concurrent transactions.

But here is some brainstorming: if space is no issue, you can think about running two instances of the same table: HighlyContentiousTableInInnoDb2 for your constant read/write transactions and HighlyContentiousTableInInnoDb2_shadow for your batched access. Maybe you can fill the shadow table automatically via triggers or routines inside your DBMS, which is faster and smarter than an additional write transaction everywhere.

Another question to consider: do all transactions need to access the whole table? If not, you could use views to lock only the necessary columns. If the continuous access and your batched access are disjoint with respect to columns, it might be possible that they don't lock each other!



Answer 9:

I was facing the same issue using CREATE TEMPORARY TABLE ... SELECT ..., which failed with SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction.

Based on your initial query, my problem was solved by locking the HighlyContentiousTableInInnoDb before starting the query.

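-- While LOCK TABLES is in effect, the session can only use tables it has
-- locked, so the destination table is locked for writing as well.
LOCK TABLES HighlyContentiousTableInInnoDb READ, TemporaryTable WRITE;
INSERT INTO TemporaryTable
    (SELECT * FROM HighlyContentiousTableInInnoDb
    WHERE allKindsOfComplexConditions are true);
UNLOCK TABLES;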