How to request a random row in SQL?

2018-12-31 01:44发布

How can I request a random row (or as close to truly random as is possible) in pure SQL?

标签: sql random
28条回答
怪性笑人.
2楼-- · 2018-12-31 02:18

With SQL Server 2012+ you can use the OFFSET FETCH query to do this for a single random row

select  * from MyTable ORDER BY id OFFSET n ROW FETCH NEXT 1 ROWS ONLY

where id is an identity column, and n is the row you want - calculated as a random number between 0 and count()-1 of the table (offset 0 is the first row after all)

This works with holes in the table data, as long as you have an index to work with for the ORDER BY clause. Its also very good for the randomness - as you work that out yourself to pass in but the niggles in other methods are not present. In addition the performance is pretty good, on a smaller dataset it holds up well, though I've not tried serious performance tests against several million rows.

查看更多
爱死公子算了
3楼-- · 2018-12-31 02:19

I don't know how efficient this is, but I've used it before:

SELECT TOP 1 * FROM MyTable ORDER BY newid()

Because GUIDs are pretty random, the ordering means you get a random row.

查看更多
浪荡孟婆
4楼-- · 2018-12-31 02:19

Insted of using RAND(), as it is not encouraged, you may simply get max ID (=Max):

SELECT MAX(ID) FROM TABLE;

get a random between 1..Max (=My_Generated_Random)

My_Generated_Random = rand_in_your_programming_lang_function(1..Max);

and then run this SQL:

SELECT ID FROM TABLE WHERE ID >= My_Generated_Random ORDER BY ID LIMIT 1

Note that it will check for any rows which Ids are EQUAL or HIGHER than chosen value. It's also possible to hunt for the row down in the table, and get an equal or lower ID than the My_Generated_Random, then modify the query like this:

SELECT ID FROM TABLE WHERE ID <= My_Generated_Random ORDER BY ID DESC LIMIT 1
查看更多
其实,你不懂
5楼-- · 2018-12-31 02:21

Didn't quite see this variation in the answers yet. I had an additional constraint where I needed, given an initial seed, to select the same set of rows each time.

For MS SQL:

Minimum example:

select top 10 percent *
from table_name
order by rand(checksum(*))

Normalized execution time: 1.00

NewId() example:

select top 10 percent *
from table_name
order by newid()

Normalized execution time: 1.02

NewId() is insignificantly slower than rand(checksum(*)), so you may not want to use it against large record sets.

Selection with Initial Seed:

declare @seed int
set @seed = Year(getdate()) * month(getdate()) /* any other initial seed here */

select top 10 percent *
from table_name
order by rand(checksum(*) % seed) /* any other math function here */

If you need to select the same set given a seed, this seems to work.

查看更多
登录 后发表回答