Select n random rows from SQL Server table

2019-01-01 01:44发布

I've got a SQL Server table with about 50,000 rows in it. I want to select about 5,000 of those rows at random. I've thought of a complicated way, creating a temp table with a "random number" column, copying my table into that, looping through the temp table and updating each row with RAND(), and then selecting from that table where the random number column < 0.1. I'm looking for a simpler way to do it, in a single statement if possible.

This article suggest using the NEWID() function. That looks promising, but I can't see how I could reliably select a certain percentage of rows.

Anybody ever do this before? Any ideas?

15条回答
泪湿衣
2楼-- · 2019-01-01 01:56

Try this:

SELECT TOP 10 Field1, ..., FieldN
FROM Table1
ORDER BY NEWID()
查看更多
余生请多指教
3楼-- · 2019-01-01 01:56

I was using it in subquery and it returned me same rows in subquery

 SELECT  ID ,
            ( SELECT TOP 1
                        ImageURL
              FROM      SubTable 
              ORDER BY  NEWID()
            ) AS ImageURL,
            GETUTCDATE() ,
            1
    FROM    Mytable

then i solved with including parent table variable in where

SELECT  ID ,
            ( SELECT TOP 1
                        ImageURL
              FROM      SubTable 
              Where Mytable.ID>0
              ORDER BY  NEWID()
            ) AS ImageURL,
            GETUTCDATE() ,
            1
    FROM    Mytable

Note the where condtition

查看更多
ら面具成の殇う
4楼-- · 2019-01-01 01:57

The server-side processing language in use (eg PHP, .net, etc) isn't specified, but if it's PHP, grab the required number (or all the records) and instead of randomising in the query use PHP's shuffle function. I don't know if .net has an equivalent function but if it does then use that if you're using .net

ORDER BY RAND() can have quite a performance penalty, depending on how many records are involved.

查看更多
只靠听说
5楼-- · 2019-01-01 02:07

Just order the table by a random number and obtain the first 5,000 rows using TOP.

SELECT TOP 5000 * FROM [Table] ORDER BY newid();

UPDATE

Just tried it and a newid() call is sufficent - no need for all the casts and all the math.

查看更多
看风景的人
6楼-- · 2019-01-01 02:11

In MySQL you can do this:

SELECT `PRIMARY_KEY`, rand() FROM table ORDER BY rand() LIMIT 5000;
查看更多
旧人旧事旧时光
7楼-- · 2019-01-01 02:12

Selecting Rows Randomly from a Large Table on MSDN has a simple, well-articulated solution that addresses the large-scale performance concerns.

  SELECT * FROM Table1
  WHERE (ABS(CAST(
  (BINARY_CHECKSUM(*) *
  RAND()) as int)) % 100) < 10
查看更多
登录 后发表回答