I am wondering if there is any difference in regards to performance between the following
SELECT ... FROM ... WHERE someFIELD IN(1,2,3,4)
SELECT ... FROM ... WHERE someFIELD between 0 AND 5
SELECT ... FROM ... WHERE someFIELD = 1 OR someFIELD = 2 OR someFIELD = 3 ...
or will MySQL optimize the SQL in the same way compilers will optimize code ?
EDIT: Changed the AND
's to OR
's for the reason stated in the comments.
I'll bet they are the same, you can run a test by doing the following:
loop over the "in (1,2,3,4)" 500 times and see how long it takes. loop over the "=1 or =2 or=3..." version 500 times and seeing how long it runs.
you could also try a join way, if someField is an index and your table is big it could be faster...
I tried the join method above on my SQL Server and it is nearly the same as the in (1,2,3,4), and they both result in a clustered index seek. I'm not sure how MySQL will handle them.
Below are details of 6 queries using MySQL 5.6 @SQLFiddle
In summary the 6 queries cover independently indexed columns and 2 queries were used per data type. All queries resulted in use of an index regardless of IN() or ORs being used.
I really just wanted to debunk statements made that OR means no index can be used. This isn't true. Indexes can be used in queries using OR as the 6 queries in the following examples display.
Also it seems to me that many have ignored the fact that IN() is a syntax shortcut for a set of ORs. At small scale perfomance differences between using IN() -v- OR are extremely (infintessinally) marginal.
While at larger scale IN() is certainly more convenient, but it sill equates to a set of OR conditions logically. Circumstance change for each query so testing your query on your tables is always best.
Summary of the 6 explain plans, all "Using index condition" (scroll right)
SQL Fiddle
MySQL 5.6 Schema Setup:
.
Query 1:
Results:
Query 2:
Results:
Query 3:
Results:
Query 4:
Results:
Query 5:
Results:
Query 6:
Results:
I think one explanation to sunseeker's observation is MySQL actually sort the values in the IN statement if they are all static values and using binary search, which is more efficient than the plain OR alternative. I can't remember where I've read that, but sunseeker's result seems to be a proof.
Just when you thought it was safe...
What is your value of
eq_range_index_dive_limit
? In particular, do you have more or fewer items in theIN
clause?This will not include a Benchmark, but will peer into the inner workings a little. Let's use a tool to see what is going on -- Optimizer Trace.
The query:
SELECT * FROM canada WHERE id ...
With an
OR
of 3 values, part of the trace looks like:...
...
Note how ICP is being given
ORs
. This implies thatOR
is not turned intoIN
, and InnoDB will be performing a bunch of=
tests through ICP. (I do not feel it is worth considering MyISAM.)(This is Percona's 5.6.22-71.0-log;
id
is a secondary index.)Now for IN() with a few values
eq_range_index_dive_limit
= 10; there are 8 values....
...
Note that the
IN
does not seem to be turned intoOR
.A side note: Notice that the constant values were sorted. This can be beneficial in two ways:
Finally, IN() with a lots of values
...
...
Side note: I needed this due to the bulkiness of the trace:
I think the BETWEEN will be faster since it should be converted into:
It is my understanding that an IN will be converted to a bunch of OR statements anyway. The value of IN is the ease of use. (Saving on having to type each column name multiple times and also makes it easier to use with existing logic - you don't have to worry about AND/OR precedence because the IN is one statement. With a bunch of OR statements, you have to ensure you surround them with parentheses to make sure they are evaluated as one condition.)
The only real answer to your question is PROFILE YOUR QUERIES. Then you will know what works best in your particular situation.
From what I understand about the way that the compiler optimizes these types of queries, using the IN clause is more efficient than multiple OR clauses. If you have values where the BETWEEN clause can be used, that is more efficient still.