I have a set of consecutive rows I want to get based upon their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance between between:
SELECT * FROM `theTable` WHERE `id` IN (n, ... nk);
and:
SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk;
BETWEEN
should outperform IN
in this case (but do measure and check execution plans, too!), especially as n
grows and as statistics are still accurate. Let's assume:
m
is the size of your table
n
is the size of your range
Index can be used (n
is tiny compared to m
)
In theory, BETWEEN
can be implemented with a single "range scan" (Oracle speak) on the primary key index, and then traverse at most n
index leaf nodes. The complexity will be O(n + log m)
IN
is usually implemented as a series (loop) of n
"range scans" on the primary key index. With m
being the size of the table, the complexity will always be O(n * log m)
... which is always worse (neglibile for very small tables m
or very small ranges n
)
Index cannot be used (n
is a significant portion of m
)
In any case, you'll get a full table scan and evaluate the predicate on each row:
BETWEEN
needs to evaluate two predicates: One for the lower and one for the upper bound. The complexity is O(m)
IN
needs to evaluate at most n
predicates. The complexity is O(m * n)
... which is again always worse, or perhaps O(m)
if the database can optimise the IN
list to be a hashmap, rather than a list of predicates.
a between b and c
is a macro that expands to b <= a and a <= c
.
a in (b,c,d)
is a macro that expands to a=b or a=c or a=d
.
Assuming your n
and nk
are integer, both should end up meaning the same. The between
variant should be much faster because it's only two compares, versus nk - n
compares for the in
variant.
I have done research for this question.
I have 11M rows in my table. I have executed two queries on that:
Query 1:SELECT * FROM PLAYERS WHERE SCORE BETWEEN 10 TO 20
Query 2:SELECT * FROM PLAYERS WHERE SCORE IN (10,11,...,20)
While execution time, both queries are translated as Andomar said above.
Among both queries, Query 1 is running faster than Query 2.
To know more follow this link:
Performance of BETWEEN VS IN() in MySQL
Thank you.