I have a table v_ext in a MySQL with InnoDB engine:
- id: primary key
- code: pre-generated list of codes (say 1000 codes are generated randomly)
- user_id: initially NULL
When a user purchase an item, they receive a code. I need to update the table to populate the user_id column. I have two options:
START TRANSACTION;
SELECT id FROM v_ext WHERE user_id IS NULL LIMIT 1 FOR UPDATE; -- return id 54 for ex.
UPDATE v_ext SET user_id=xxx WHERE id=54;
COMMIT;
or
UPDATE v_ext SET user_id=xxx WHERE user_id IS NULL LIMIT 1;
Is the second option safe if I have thousands of users purchasing at the same time? If so, is it correct to assume this second option is better for performance since it needs only one query?
Since I didn't get an answer, I started doing benchmarking. My criteria are as follows:
- 20,000 pre-generated codes
- Use of Apache
ab
command with 20,000 requests, 100 concurrency: ab -n 20000 -c 100
- Servlet -> EJB (JPA 2.0 EclipseLink, JTA) to perform the update in DB (as it will be through a JSF action in real situation)
- 2 versions of the Servlet, one with option 1 (SELECT ... FOR UPDATE ), and one with option 2 (UPDATE ... LIMIT 1)
- Stopped Glassfish, hit the tested Servlet manually 5 times to warm it up, reset all to NULL to user_id
- Tests are run 3 times each and average is provided
Results:
SELECT ... FOR UPDATE; UPDATE ... :
Concurrency Level: 100
Time taken for tests: 758.116 seconds
Complete requests: 20000
Failed requests: 0
Write errors: 0
Row updated: 20000
UPDATE.... LIMIT 1:
Concurrency Level: 100
Time taken for tests: 773.659 seconds
Complete requests: 20000
Failed requests: 0
Write errors: 0
Row updated: 20000
So at least on my system, the option with 2 queries seems more efficient than the one query. I didn't expect that :)