How to randomly select rows in SQL?

2019-01-01 06:30发布

问题:

I am using MSSQL Server 2005. In my db, I have a table \"customerNames\" which has two columns \"Id\" and \"Name\" and approx. 1,000 results.

I am creating a functionality where I have to pick 5 customers randomly every time. Can anyone tell me how to create a query which will get random 5 rows (Id, and Name) every time when query is executed?

回答1:

SELECT TOP 5 Id, Name FROM customerNames
ORDER BY NEWID()

That said, everybody seems to come to this page for the more general answer to your question:

Selecting a random row in SQL

Select a random row with MySQL:

SELECT column FROM table
ORDER BY RAND()
LIMIT 1

Select a random row with PostgreSQL:

SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1

Select a random row with Microsoft SQL Server:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

Select a random row with IBM DB2

SELECT column, RAND() as IDX 
FROM table 
ORDER BY IDX FETCH FIRST 1 ROWS ONLY

Select a random record with Oracle:

SELECT column FROM
( SELECT column FROM table
ORDER BY dbms_random.value )
WHERE rownum = 1

Select a random row with sqlite:

SELECT column FROM table 
ORDER BY RANDOM() LIMIT 1


回答2:

SELECT TOP 5 Id, Name FROM customerNames ORDER BY NEWID()


回答3:

In case someone wants a PostgreSQL solution:

select id, name
from customer
order by random()
limit 5;


回答4:

Maybe this site will be of assistance.

For those who don\'t want to click through:

SELECT TOP 1 column FROM table
ORDER BY NEWID()


回答5:

There is a nice Microsoft SQL Server 2005 specific solution here. Deals with the problem where you are working with a large result set (not the question I know).

Selecting Rows Randomly from a Large Table http://msdn.microsoft.com/en-us/library/cc441928.aspx



回答6:

I have found this to work best for big data.

`SELECT TOP 1 Column_Name FROM dbo.Table TABLESAMPLE(1 PERCENT);`

TABLESAMPLE(n ROWS) or TABLESAMPLE(n PERCENT) is random but need to add the TOP n to get the correct sample size.

Using NEWID() is very slow on large tables.



回答7:

SELECT * FROM TABLENAME ORDER BY random() LIMIT 5;



回答8:

If you have a table with millions of rows and care about the performance, this could be a better answer:

SELECT * FROM Table1
WHERE (ABS(CAST(
  (BINARY_CHECKSUM
  (keycol1, NEWID())) as int))
  % 100) < 10

https://msdn.microsoft.com/en-us/library/cc441928.aspx



回答9:

This is an old question, but attempting to apply a new field (either NEWID() or ORDER BY rand()) to a table with a large number of rows would be prohibitively expensive. If you have incremental, unique IDs (and do not have any holes) it will be more efficient to calculate the X # of IDs to be selected instead of applying a GUID or similar to every single row and then taking the top X # of.

DECLARE @maxValue int = (select max(id) from [TABLE])
DECLARE @minValue int = (select min(id) from [TABLE])
DECLARE @randomId1 int, @randomId2 int, @randomId3 int, @randomId4 int, @randomId5 int
SET @randomId1 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId2 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId3 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId4 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId5 = ((@maxValue + 1) - @minValue) * Rand() + @minValue

--select @maxValue as MaxValue, @minValue as MinValue
--  , @randomId1 as SelectedId1
--  , @randomId2 as SelectedId2
--  , @randomId3 as SelectedId3
--  , @randomId4 as SelectedId4
--  , @randomId5 as SelectedId5


select *from[TABLE] el
    where el.id in (@randomId1, @randomId2, @randomId3, @randomId4, @randomId5)

If you wanted to select many more rows I would look into populating a #tempTable with an ID and a bunch of rand() values then using each rand() value to scale to the min-max values. That way you do not have to define all of the @randomId1...n parameters. I\'ve included an example below using a cte to populate the initial table.

DECLARE @NumItems int = 100;

DECLARE @maxValue int = (select max(id) from [TABLE])
DECLARE @minValue int = (select min(id) from [TABLE])

with cte (n) as (select 1 union all select n+1 from cte where n < @NumItems)
select cast( ((@maxValue + 1) - @minValue) * rand(cast(newid() as varbinary(100))) + @minValue as int) as tp into #Nt
from 
cte
select * from  #Nt ntt
inner join [TABLE] i 
        on i.id = ntt.tp
drop table #Nt