InnoDB deadlock with read-uncommited! - Java - Gla

It's been several days that I got deadlock issues on a Java application with Glassfish - EJB3 with Mysql InnoDB

Config: Mysql InnoDB: Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2

Application server: Glassfish v2.1

Persistence with EJB3 - JPA - Hibernate

To make it simple I have - a SOA system with servlet that handle subscriptions of users to services, logon, logoff, payments and registration etc... - a quartz job system (cron triggers) that handle the daily decrementation of these services, the generation of "low credit" warnings, the validation of payments etc...

My problem: I've got deadlocks everywhere during load-testings (100 000 users simulation - 30 requests / second)

Stack sample returned:

Message ID: 
Could not synchronize database state with session org.hibernate.exception.LockAcquisitionException

Complete Message:   
Could not execute JDBC batch update at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:105) at 
org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66) at
org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275) at 
org.hibernate.jdbc.AbstractBatcher.prepareStatement(AbstractBatcher.java:114) at 
org.hibernate.jdbc.AbstractBatcher.prepareStatement(AbstractBatcher.java:109) at 
org.hibernate.jdbc.AbstractBatcher.prepareBatchStatement(AbstractBatcher.java:244) at 
org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:2382) at 
org.hibernate.persister.entity.AbstractEntityPersister.updateOrInsert(AbstractEntityPersister.java:2335) at 
org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:2635) at 
org.hibernate.action.EntityUpdateAction.execute(EntityUpdateAction.java:115) at 
org.hibernate.engine.ActionQueue.execute(ActionQueue.java:279) at 
org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:263) at 
org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:168) at 
org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321) at 
org.hibernate.event.def.DefaultAutoFlushEventListener.onAutoFlush(DefaultAutoFlushEventListener.java:64) at
org.hibernate.impl.SessionImpl.autoFlushIfRequired(SessionImpl.java:996) at 
org.hibernate.impl.SessionImpl.list(SessionImpl.java:1141) at 
org.hibernate.impl.QueryImpl.list(QueryImpl.java:102) at 
org.hibernate.ejb.QueryImpl.getResultList(QueryImpl.java:67) at
net.xxx.server.dao.impl.PaymentDAOImpl.listPaymentsByStateAndCompany(PaymentDAOImpl.java:270)

Notice the end, it's code I've done: net.xxx.server.dao.impl.PaymentDAOImpl.listPaymentsByStateAndCompany(PaymentDAOImpl.java:270)

This function:

private static final String QUERY_FOR_PAYMENTS_BY_STATE_AND_COMPANY = " FROM " + Payment.class.getName()
        + " p WHERE p.serviceDefinition.company=:company"
        + " AND p.state = :state";

    @SuppressWarnings("unchecked")
    public List<Payment> listPaymentsByStateAndCompany(Company company,Constants.PaymentState state) {
        List<Payment> payments = this.getEntityManager()
        .createQuery(QUERY_FOR_PAYMENTS_BY_STATE_AND_COMPANY)
        .setParameter("state",state.ordinal())
        .setParameter("company",company)
        .getResultList();
        return payments;
    }

This function works perfectly well when it's not load testing, and that we have 1 request each 5 seconds for example.

During load testing, we have jobs running with high frequencies (every 5 seconds for example).

I don't get only this error, but some others on other jobs too (still deadlock)!

On MYSQL:

Example of deadlock:

------------------------
LATEST DETECTED DEADLOCK
------------------------
090428 12:21:11
*** (1) TRANSACTION:
TRANSACTION 0 14286818, ACTIVE 0 sec, process no 21872, OS thread id 802850 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 13 lock struct(s), heap size 1024, undo log entries 2
MySQL thread id 298, query id 11843357 localhost 127.0.0.1 root Updating
/*  */ update service set balance=40.0, company_id=2, last_on='2009-04-28 12:19:55', modified_by='server', modified_on='2009-04-28 12:21:11', service_definition_id=3, state=1, subscriber_id=13578, valid_until='2010-02-22 12:13:52' where service_id=693
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 62 n bits 176 index `PRIMARY` of table `xxx/service` trx id 0 14286818 lock_mode X locks rec but not gap waiting
Record lock, heap no 98 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 8; hex 80000000000002b5; asc         ;; 1: len 6; hex 000000d9faa0; asc       ;; 2: len 7; hex 0000000cc91e70; asc       p;; 3: len 4; hex 00001c42; asc    B;; 4: len 8; hex 80001245aad4e363; asc    E   c;; 5: len 6; hex 736572766572; asc server;; 6: len 8; hex 80001245aad4e3c9; asc    E    ;; 7: len 1; hex 81; asc  ;; 8: len 8; hex 80001247f200df08; asc    G    ;; 9: len 8; hex 8000000000000002; asc         ;; 10: len 8; hex 8000000000000003; asc         ;; 11: len 8; hex 800000000000350a; asc       5 ;;

*** (2) TRANSACTION:
TRANSACTION 0 14286798, ACTIVE 1 sec, process no 24963, OS thread id 393239 starting index read, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
17 lock struct(s), heap size 1024, undo log entries 16
MySQL thread id 253, query id 11843359 localhost 127.0.0.1 root Updating
/*  */ update payment set credit=1.0, currency='EUR', modified_by='9999900092', modified_on='2009-04-28 12:21:11', payment_definition_id=7, price=1.0, service_definition_id=3, state=0, subscriber_id=13578, transaction_id=11463 where payment_id=15914
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 62 n bits 176 index `PRIMARY` of table `xxx/service` trx id 0 14286798 lock mode S locks rec but not gap
Record lock, heap no 47 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 8; hex 8000000000000286; asc         ;; 1: len 6; hex 000000d9ffce; asc       ;; 2: len 7; hex 0000000cc90683; asc        ;; 3: len 4; hex 0000f841; asc    A;; 4: len 8; hex 80001245aad4e3b2; asc    E    ;; 5: len 6; hex 736572766572; asc server;; 6: len 8; hex 80001245aad4e3ff; asc    E    ;; 7: len 1; hex 81; asc  ;; 8: len 8; hex 80001245d450fed8; asc    E P  ;; 9: len 8; hex 8000000000000002; asc         ;; 10: len 8; hex 8000000000000003; asc         ;; 11: len 8; hex 80000000000034db; asc       4 ;;

Transaction Isolation

I read things on internet about transaction isolation.

On glassfish we can setup transaction isolation level, I put it read-uncommitted.

It didn't work, then I set in mysql the same level:

mysql> SELECT @@global.tx_isolation;
+-----------------------+
| @@global.tx_isolation |
+-----------------------+
| READ-UNCOMMITTED      | 
+-----------------------+
1 row in set (0.00 sec)

mysql> SELECT @@tx_isolation;
+------------------+
| @@tx_isolation   |
+------------------+
| READ-UNCOMMITTED | 
+------------------+
1 row in set (0.00 sec)

SVP can someone tell my what could be the problem? I really don't know!!!!

BTW I've seen on internet you can select your transaction isolation level for each request... Is it possible to set transaction isolation level for methods on JPA directly? Cause i think only jobs that do global data update (like decrementing 15000 services) should be read-uncommitted am I wrong?

回答1:

I don't have an exact answer for your problem, but this may help you narrowing it down.

Deadlocks can happen in any transaction isolation level because innodb will set locks on updates even on "read uncommitted".

You can test this with this simple scenario:

CREATE TABLE locktest (a INT(11), b INT(11), PRIMARY KEY (a)) ENGINE=INNODB;
INSERT INTO locktest VALUE (1, 1);
INSERT INTO locktest VALUE (2, 1);

Then, open 2 mysql consoles (C1 and C2) and run these commands in order:

C1> BEGIN;
C2> BEGIN;
C1> UPDATE locktest SET b = b + 1 WHERE a = 1;
C2> UPDATE locktest SET b = b + 1 WHERE a = 2;
C1> UPDATE locktest SET b = b + 1 WHERE a = 2;
C2> UPDATE locktest SET b = b + 1 WHERE a = 1;

You will see a deadlock on C2, and C1 completing successfully even on read uncommitted. If you check the engine log you will see a similar report.

If you remove the primary key on the table the commands choke even earlier and this is because innodb locking works better if there's an index covering the query that is setting the locks.

So, going back to your problem.

You should check all the queries involved in transactions that ended in deadlocks and make sure that the appropriate indexes exist. If MySQL has to do a full table scan it will end locking more than it needs.

These tips helped me solve some deadlocks in my app. A good way of preventing deadlocks is setting a write lock with "SELECT ... FOR UPDATE" to lock some parent row.

So for example, if you have multiple transactions trying to update some specific customer data you could issue a "SELECT id FROM customer WHERE id=123 FOR UPDATE", they will wait in order at that point instead of ending holding locks that each other needs.

回答2:

I was having a similar problem with an Java application using an Oracle database.

I found that the database was missing indexes on foreign keys, which caused the database to lock many more rows than required, and led to deadlocks in a highly concurrent test.

Here is a very good article for diagnosing this. Much of the article is oracle-specific, but some things apply to other relational databases too: http://www.oratechinfo.co.uk/deadlocks.html

回答3:

You should take for granted that deadlocks will happen in any transactional database no matter what you do. You just should handle them gracefully and repeat the failing transaction some fixed number of times (3 normally is ok) - there should be a parameter somewhere in Glassfish responsible for that.