I remember reading somewhere that using order by rand() is bad and I just startpaged it and found an article that proves it. Order by rand() can be extremely slow with large databases and the suggested solution was to generate a random number in php and select based upon it. The problem is that I need to verify other fields in order to return my records. I may also have some old records deleted, that may also cause an issue. Can anyone provide a decent way to select a few random records from a table that match certain conditions ( for example field paid
must be equal to 1 ) ?
问题:
回答1:
The reason that ordering by RAND()
can be slow is that you're forcing the database to actually sort the whole table before returning anything. Just reducing the load to a single table scan is much faster (albeit still somewhat slow).
This means that you could get part of the way just by avoiding the ordering:
SELECT *
FROM my_table
WHERE RAND() < 0.1
ORDER BY RAND()
LIMIT 100
This will select approximately 1% of all the rows in the table, sort them and return the top 100. Just note that the main issue here (as well as with @cmd's answer) is that you can't be sure that the query returns anything at all.
The approach above should involve a whole table scan (to decide which rows to use) followed by a sort of approximately 1% of the rows. If you have a lot of rows, you can reduce the percentage accordingly.
回答2:
How random do you need them to be? if you dont need a super even distribution try this
select min(pk_id) from my_table where pk_id > %(random_number)s and paid=1
where %(random_number)s
is a bind variable containing a random number from 0
to max(pk_id)-1
regenerated each time you run the query