How can I request a random row (or as close to truly random as is possible) in pure SQL?
相关问题
- SQL join to get the cartesian product of 2 columns
- sql execution latency when assign to a variable
- Difference between Types.INTEGER and Types.NULL in
- php PDO::FETCH_ASSOC doesnt detect select after ba
- Bulk update SQL Server C#
With SQL Server 2012+ you can use the OFFSET FETCH query to do this for a single random row
where id is an identity column, and n is the row you want - calculated as a random number between 0 and count()-1 of the table (offset 0 is the first row after all)
This works with holes in the table data, as long as you have an index to work with for the ORDER BY clause. Its also very good for the randomness - as you work that out yourself to pass in but the niggles in other methods are not present. In addition the performance is pretty good, on a smaller dataset it holds up well, though I've not tried serious performance tests against several million rows.
I don't know how efficient this is, but I've used it before:
Because GUIDs are pretty random, the ordering means you get a random row.
Insted of using RAND(), as it is not encouraged, you may simply get max ID (=Max):
get a random between 1..Max (=My_Generated_Random)
and then run this SQL:
Note that it will check for any rows which Ids are EQUAL or HIGHER than chosen value. It's also possible to hunt for the row down in the table, and get an equal or lower ID than the My_Generated_Random, then modify the query like this:
Didn't quite see this variation in the answers yet. I had an additional constraint where I needed, given an initial seed, to select the same set of rows each time.
For MS SQL:
Minimum example:
Normalized execution time: 1.00
NewId() example:
Normalized execution time: 1.02
NewId()
is insignificantly slower thanrand(checksum(*))
, so you may not want to use it against large record sets.Selection with Initial Seed:
If you need to select the same set given a seed, this seems to work.