Optimizing MySQL order by rand()

I can't find a good answer to my problem.

I have a MySQL query with an inner join, an order by rand(), and a limit X. When I remove the order by rand(), the query is 10 times faster. Is there a more efficient way to get a random subset of 500 rows? Here's a sample query:

Select * from table1 
inner join table2 on table1.in = table2.in
where table1.T = A
order by rand()
limit 500;

Tags: mysql, limit

2 answers

家丑人穷心不美

#2 · 2019-08-31 07:07

This should help:

Select *
from table1 inner join
     table2
     on table1.in = table2.in
where table1.T = A and rand() < 1000.0/20000.0
order by rand()
limit 500

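This limits the result set to about 1,000 random rows before the sort extracts a random sample of 500, so order by rand() only has to sort roughly 1,000 rows instead of the whole join. The constant 20000.0 assumes the filtered join returns about 20,000 rows; replace it with your own row count. Fetching about twice as many rows as needed just makes sure the pre-filtered set is large enough to supply 500 rows.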
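The arithmetic behind the rand() < 1000.0/20000.0 filter can be checked with a small simulation. This is only a sketch: the function names and the oversample parameter are made up for illustration, and the 20,000-row total is the same assumption the query makes.

```python
import random

def prefilter_fraction(sample_size, total_rows, oversample=2.0):
    """Probability with which each row passes the rand() < p filter."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    # Cap at 1.0 so small tables simply keep every row.
    return min(1.0, oversample * sample_size / total_rows)

def simulate_prefilter(total_rows, fraction, rng):
    """Count the rows that survive an independent keep-with-probability test."""
    return sum(1 for _ in range(total_rows) if rng.random() < fraction)

# 500 wanted rows out of an assumed 20,000: the 1000.0/20000.0 in the query.
p = prefilter_fraction(500, 20000)

rng = random.Random(42)  # fixed seed so the run is repeatable
kept = [simulate_prefilter(20000, p, rng) for _ in range(20)]
print(p, min(kept), max(kept))
```

Each run keeps close to 1,000 rows, comfortably above the 500 that limit needs. A smaller oversample factor would mean less sorting but a higher chance of coming up short.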

Here is an alternative strategy, building off the "create your own indexes" approach.

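Create a temporary table using the following query. (Both tables have an in column, and create table ... select rejects duplicate column names, so in practice list the columns you need instead of select *.)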

create temporary table results as
(Select *, @rn := @rn + 1 as rn
from table1 inner join
     table2
     on table1.in = table2.in cross join
     (select @rn := 0) const
where table1.T = A
);

You now have a row-number column, rn, and you can read back the total number of rows with:

select @rn;

Then you can generate the random row numbers (values between 1 and @rn) in your application and fetch the matching rows from results by rn.
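One way the application side might look is sketched below. The helper names are hypothetical, and the %s placeholder style is an assumption that matches common Python MySQL drivers; adjust it to whatever your driver expects.

```python
import random

def pick_row_numbers(row_count, sample_size, rng=random):
    """Pick distinct row numbers in 1..row_count (the range of the rn column)."""
    k = min(sample_size, row_count)
    return sorted(rng.sample(range(1, row_count + 1), k))

def build_fetch_query(row_numbers):
    """Build a parameterized query that fetches the chosen rows from results."""
    if not row_numbers:
        raise ValueError("need at least one row number")
    placeholders = ", ".join(["%s"] * len(row_numbers))
    sql = f"select * from results where rn in ({placeholders})"
    return sql, tuple(row_numbers)

# row_count would come from "select @rn"; 20000 is an assumed value here.
ids = pick_row_numbers(20000, 500, random.Random(0))
sql, params = build_fetch_query(ids)
print(len(params), sql[:45])
```

The returned sql and params would then be passed to the driver's execute call. Sampling without replacement guarantees 500 distinct rows, which the probabilistic filter in the first query cannot promise.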

I would be inclined to keep the processing in the database, using these two queries:

create temporary table results as
(Select *, @rn := @rn + 1 as rn, rand() as therand
from table1 inner join
     table2
     on table1.in = table2.in cross join
     (select @rn := 0) const
where table1.T = A
);

select *
from results
where therand < 1000/@rn
order by therand
limit 500;
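The second query keeps only rows whose therand falls below 1000/@rn and then takes the 500 smallest values. As long as at least 500 rows pass the filter, that is the same set a full sort by therand would return, just with far fewer rows to sort. The sketch below mimics this in memory; the function names and the 20,000-row input are illustrative assumptions, not part of the original answer.

```python
import random

def tag_rows(rows, rng):
    """Attach a random value to each row, like the therand column."""
    return [(rng.random(), row) for row in rows]

def sample_full_sort(tagged, sample_size):
    """Sort every row by its random value and keep the first sample_size."""
    ordered = sorted(tagged, key=lambda t: t[0])
    return [row for _, row in ordered[:sample_size]]

def sample_prefiltered(tagged, sample_size, oversample=2.0):
    """Drop rows above the threshold first, then sort only the survivors."""
    if not tagged:
        return []
    # Plays the role of 1000/@rn when sample_size is 500.
    threshold = oversample * sample_size / len(tagged)
    survivors = [t for t in tagged if t[0] < threshold]
    survivors.sort(key=lambda t: t[0])
    return [row for _, row in survivors[:sample_size]]

rng = random.Random(7)  # fixed seed so the run is repeatable
tagged = tag_rows(range(1, 20001), rng)
full = sample_full_sort(tagged, 500)
fast = sample_prefiltered(tagged, 500)
print(len(fast), fast == full)
```

Because the threshold only discards rows that could never be among the 500 smallest, the pre-filter changes the cost of the sort and not the sample itself.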
