How to request a random row in SQL?

Posted 2018-12-31 01:44

How can I request a random row (or as close to truly random as is possible) in pure SQL?

Tags: sql random
28 answers
伤终究还是伤i
Reply #2 · 2018-12-31 02:01

For MySQL, to get a random record:

 SELECT name
  FROM random AS r1 JOIN
       (SELECT (RAND() *
                     (SELECT MAX(id)
                        FROM random)) AS id)
        AS r2
 WHERE r1.id >= r2.id
 ORDER BY r1.id ASC
 LIMIT 1

More detail: http://jan.kneschke.de/projects/mysql/order-by-rand/
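
The same idea can be tried outside MySQL. Below is a minimal sketch using Python's built-in sqlite3 module, assuming a table named random with an integer primary key id and a name column, as in the query above. The sample rows and the helper name pick_random_name are illustrative assumptions, and the threshold is drawn in Python instead of with RAND(). Note that a row sitting right after a gap in the ids is picked more often than its neighbours.

```python
import random
import sqlite3


def pick_random_name(conn, rng=random):
    """Return the name of the first row whose id is >= a random threshold."""
    (max_id,) = conn.execute("SELECT MAX(id) FROM random").fetchone()
    if max_id is None:  # empty table
        return None
    # Threshold lies in [0, max_id), so the row with id = max_id always qualifies.
    threshold = rng.random() * max_id
    row = conn.execute(
        "SELECT name FROM random WHERE id >= ? ORDER BY id ASC LIMIT 1",
        (threshold,),
    ).fetchone()
    return row[0]


# Illustrative data with gaps in the ids (an assumption, not from the thread).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE random (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO random (id, name) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (5, "c"), (9, "d")],
)
picked = pick_random_name(conn)
```

With this data, "d" (id 9) covers the thresholds between 5 and 9 and so comes up far more often than "b" (id 2), which is the gap bias in action.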

牵手、夕阳
Reply #3 · 2018-12-31 02:03

The best way is to put a random value in a new column just for that purpose, and use something like this (pseudocode + SQL):

randomNo = random()
execSql("SELECT TOP 1 * FROM MyTable WHERE MyTable.Randomness > $randomNo ORDER BY MyTable.Randomness")

This is the solution employed by the MediaWiki code. Of course, there is some bias against smaller values, but they found that it was sufficient to wrap the random value around to zero when no rows are fetched.
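
A minimal runnable sketch of this approach, using Python's built-in sqlite3 module rather than MSSQL (so LIMIT 1 replaces TOP 1). The table layout, the index name and the helper name pick_by_random_column are assumptions made for illustration; this is not the MediaWiki code itself.

```python
import random
import sqlite3


def pick_by_random_column(conn, rng=random):
    """Pick the row with the smallest Randomness above a fresh random number."""
    r = rng.random()
    row = conn.execute(
        "SELECT id, Randomness FROM MyTable "
        "WHERE Randomness > ? ORDER BY Randomness LIMIT 1",
        (r,),
    ).fetchone()
    if row is None:
        # Nothing above r: wrap around and take the smallest value,
        # as if the threshold had been zero.
        row = conn.execute(
            "SELECT id, Randomness FROM MyTable ORDER BY Randomness LIMIT 1"
        ).fetchone()
    return row


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER PRIMARY KEY, Randomness REAL)")
# Each row gets its random value once, at insert time.
conn.executemany(
    "INSERT INTO MyTable (id, Randomness) VALUES (?, ?)",
    [(i, random.random()) for i in range(1, 101)],
)
# An index on the column lets the lookup avoid scanning the whole table.
conn.execute("CREATE INDEX idx_mytable_randomness ON MyTable (Randomness)")
row = pick_by_random_column(conn)
```

The ORDER BY on the random column matters: without it, the database is free to return any row above the threshold, not the nearest one.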

The newid() solution may require a full table scan so that each row can be assigned a new GUID, which will be much less performant.

The rand() solution may not work at all (e.g. with MSSQL) because the function is evaluated just once, and every row is assigned the same "random" number.

萌妹纸的霸气范
Reply #4 · 2018-12-31 02:03

You may also try using the NEWID() function (SQL Server).

Just write your query and add ORDER BY NEWID(). It is quite random.

何处买醉
Reply #5 · 2018-12-31 02:05

The SQL random function can help. If you want just one row, add a LIMIT at the end. For example, in MySQL:

SELECT column FROM table
ORDER BY RAND()
LIMIT 1
十年一品温如言
Reply #6 · 2018-12-31 02:06

Most of the solutions here aim to avoid sorting, but they still need to make a sequential scan over a table.

There is also a way to avoid the sequential scan by switching to an index scan. If you know the index value of your random row, you can get the result almost instantly. The problem is how to guess an index value.

The following solution works on PostgreSQL 8.4:

explain analyze select * from cms_refs where rec_id in 
  (select (random()*(select last_value from cms_refs_rec_id_seq))::bigint 
   from generate_series(1,10))
  limit 1;

In the above solution you guess 10 random index values in the range 0 .. [last value of id].

The number 10 is arbitrary - you may use 100 or 1000 as it (amazingly) doesn't have a big impact on the response time.

There is also one problem: if you have sparse ids, you might miss. The solution is to have a backup plan :) In this case, a plain old ORDER BY random() query. When combined, it looks like this:

explain analyze select * from cms_refs where rec_id in 
    (select (random()*(select last_value from cms_refs_rec_id_seq))::bigint 
     from generate_series(1,10))
    union all (select * from cms_refs order by random() limit 1)
    limit 1;

Note the UNION ALL clause. In this case, if the first part returns any data, the second one is NEVER executed!
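
The guess-then-fall-back logic can also be written out step by step. The sketch below uses Python's built-in sqlite3 module instead of PostgreSQL, so several things are assumptions: MAX(rec_id) stands in for the sequence's last_value, ids are assumed to be positive, and the title column, the sample rows and the helper name pick_by_guessing_ids are illustrative. It also probes the guesses one at a time rather than with a single IN (...) query, so the first hit does not depend on the order in which the database returns matches.

```python
import random
import sqlite3


def pick_by_guessing_ids(conn, guesses=10, rng=random):
    """Probe random ids through the primary key; fall back to a full sort."""
    (max_id,) = conn.execute("SELECT MAX(rec_id) FROM cms_refs").fetchone()
    if max_id is None:  # empty table
        return None
    for _ in range(guesses):
        candidate = rng.randint(1, max_id)  # assumes ids start at 1
        row = conn.execute(
            "SELECT rec_id, title FROM cms_refs WHERE rec_id = ?",
            (candidate,),
        ).fetchone()
        if row is not None:
            return row  # cheap primary-key lookup hit an existing row
    # Every guess fell into a gap: use the slow query that always finds a row.
    return conn.execute(
        "SELECT rec_id, title FROM cms_refs ORDER BY random() LIMIT 1"
    ).fetchone()


# Deliberately sparse ids (illustrative data, not from the thread).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cms_refs (rec_id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO cms_refs (rec_id, title) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (50, "c"), (1000, "d")],
)
row = pick_by_guessing_ids(conn)
```

With ids this sparse most calls end up in the fallback, which is why the number of guesses is worth tuning against how dense the ids actually are.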

倾城一夜雪
Reply #7 · 2018-12-31 02:08

If possible, use prepared statements to avoid the inefficiency of both indexes on RAND() and creating a record number field.

PREPARE RandomRecord FROM "SELECT * FROM table LIMIT ?,1";
SET @n=FLOOR(RAND()*(SELECT COUNT(*) FROM table));
EXECUTE RandomRecord USING @n;
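
For comparison, here is a minimal sketch of the same random-offset idea using Python's built-in sqlite3 module. The table name items, its columns and the helper name pick_by_offset are hypothetical, and a bound parameter plays the role of the ? placeholder in the prepared statement above.

```python
import random
import sqlite3


def pick_by_offset(conn, rng=random):
    """Skip a random number of rows, then return the next one."""
    (count,) = conn.execute("SELECT COUNT(*) FROM items").fetchone()
    if count == 0:
        return None
    n = rng.randrange(count)  # 0 <= n < count, so the offset is always valid
    # No ORDER BY, as in the statement above: any stable row order will do.
    return conn.execute(
        "SELECT id, label FROM items LIMIT 1 OFFSET ?", (n,)
    ).fetchone()


# Illustrative data (an assumption, not from the thread).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO items (id, label) VALUES (?, ?)",
    [(3, "x"), (7, "y"), (8, "z")],
)
row = pick_by_offset(conn)
```

Every row is equally likely here regardless of gaps in the ids, but the database still has to step over the skipped rows, so large offsets cost more.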