I can't find a good answer to my problem. I have a MySQL query with an inner join, an order by rand(), and a limit X. When I remove the order by rand(), the query is 10 times faster. Is there a more efficient way to get a random subset of 500 rows? Here's a sample query:
select * from table1
inner join table2 on table1.`in` = table2.`in`
where table1.T = A
order by rand()
limit 500;
This should help:
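A sketch of that idea (the 0.001 fraction is an assumption: tune it so that your total matching row count times the fraction is roughly 1000; `in` is a reserved word in MySQL, hence the backticks):

select *
from table1
inner join table2 on table1.`in` = table2.`in`
where table1.T = A
  and rand() <= 0.001   -- pre-filter: keeps roughly 1000 of ~1,000,000 matching rows
order by rand()         -- now only ~1000 rows are sorted, not the whole join
limit 500;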
This limits the result set to about 1000 random rows before extracting the final random sample of 500. The point of keeping more rows than needed is to be sure the pre-filter leaves a large enough pool to sample from.
Here is an alternative strategy, building off the "create your own indexes" approach.
Create a temporary table using the following query:
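A sketch of that temporary table, using a MySQL user variable to attach the row number (the numbered and seqnum names are placeholders of mine):

create temporary table numbered as
select t1.*, (@rn := @rn + 1) as seqnum   -- sequential row number 1..N
from table1 t1
inner join table2 t2 on t1.`in` = t2.`in`
cross join (select @rn := 0) init         -- initializes the counter
where t1.T = A;

On MySQL 8.0+, row_number() over () would do the same job without the user variable.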
You now have a row number column, and you can return the number of rows with:
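Using the hypothetical numbered table from the sketch above:

select count(*) from numbered;   -- or equivalently: select max(seqnum) from numbered;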
Then you can generate the ids in your application.
I would be inclined to keep the processing in the database, using these two queries:
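One plausible shape for those two queries (the 1.1 oversampling factor is my own choice, there to make it unlikely the filter keeps fewer than 500 rows):

-- query 1: remember how many rows the temporary table holds
select count(*) into @cnt from numbered;

-- query 2: keep each row with probability ~500/@cnt, slightly oversampled,
-- then cut the result back to exactly 500
select *
from numbered
where rand() <= 500 * 1.1 / @cnt
limit 500;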
A good way is to do it at the application level, in 2 steps:

1. Get the total count of matching rows.
2. Generate a random offset in the application and pass it as an offset to your LIMIT.
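A sketch of those two queries against the question's tables (the literal 12345 stands in for the offset computed in step 2, since MySQL's LIMIT only accepts a literal outside of a prepared statement):

-- step 1: count the matching rows
select count(*)
from table1
inner join table2 on table1.`in` = table2.`in`
where table1.T = A;

-- step 2: the application picks a random offset in [0, count - 500]
-- and fetches 500 consecutive rows starting there
select *
from table1
inner join table2 on table1.`in` = table2.`in`
where table1.T = A
order by table1.`in`             -- a stable sort makes the offset meaningful
limit 500 offset 12345;          -- 12345 = the random offset from the application

Note that this returns 500 consecutive rows from a random starting point rather than 500 independently random rows, which may or may not matter for your use case.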
Try it and measure if performance is acceptable for you.