How can I insert random values into a SQL Server t

I'm trying to randomly insert values from a list of pre-defined values into a table for testing. I tried using the solution found on this StackOverflow question:

stackoverflow.com/.../update-sql-table-with-random-value-from-other-table

When I I tried this, all of my "random" values that are inserted are exactly the same for all 3000 records.

When I run the part of the query that actually selects the random row, it does select a random record every time I run it by hand, so I know the query works. My best guesses as to what is happening are:

SQL Server is optimizing the SELECT somehow, not allowing the subquery to be evaluated more than once
The random value's seed is the same on every record the query updates

I'm stuck on what my options are. Am I doing something wrong, or is there another way I should be doing this?

This is the code I'm using:

DECLARE @randomStuff TABLE ([id] INT, [val] VARCHAR(100))

INSERT INTO @randomStuff ([id], [val]) 
VALUES ( 1,  'Test Value 1' )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 2,  'Test Value 2' )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 3,  'Test Value 3' )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 4,  'Test Value 4' )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 5,  'Test Value 5' )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 6,  null )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 7,  null )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 8,  null )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 9,  null )
INSERT INTO @randomStuff ([id], [val])
VALUES ( 10, null )

UPDATE MyTable
SET MyColumn = (SELECT TOP 1 [val] FROM @randomStuff ORDER BY NEWID())

标签： sql sql-server tsql random

6条回答

看我几分像从前

2楼-- · 2019-01-19 00:26

There is no optimization going on here.

Your using a subquery that selects a single value, there is nothing to optimize.

You can also try putting a column from the table your updating in the select and see if that changes anything. That may trigger an evaluation for every row in MyTable

UPDATE MyTable
SET MyColumn = (SELECT TOP 1 [val] FROM @randomStuff ORDER BY NEWID()
    WHERE MyTable.MyColumn = MyTable.MyColumn )

0人赞添加讨论(0) 举报

在下西门庆

3楼-- · 2019-01-19 00:28

I don't have time to check this right now, but my gut tells me that if you were to create a function on the server to get the random value that it would not optimize it out.

then you would have

UPDATE MyTable
Set MyColumn = dbo.RANDOM_VALUE()

0人赞添加讨论(0) 举报

地球回转人心会变

4楼-- · 2019-01-19 00:36

When the query engine sees this...

(SELECT TOP 1 [val] FROM @randomStuff ORDER BY NEWID())

... it's all like, "ooooh, a cachable scalar subquery, I'm gonna cache that!"

You need to trick the query engine into thinking it's non-cachable. jfar's answer was close, but the query engine was smart enough to see the tautalogy of MyTable.MyColumn = MyTable.MyColumn, but it ain't smart enough to see through this.

UPDATE MyTable
   SET MyColumn = (SELECT TOP 1 val
                     FROM @randomStuff r
                          INNER JOIN MyTable _MT
                                  ON M.Id = _MT.Id
                    ORDER BY NEWID())
 FROM MyTable M

By bringing in the outer table (MT) into the subquery, the query engine assumes subquery will need to be re-evaluated. Anything will work really, but I went with the (assumed) primary key of MyTable.Id since it'd be indexed and would add very little overhead.

A cursor would probably be just as fast, but is most certainly not as fun.

0人赞添加讨论(0) 举报

霸刀☆藐视天下

5楼-- · 2019-01-19 00:36

use a cross join to generate random data

0人赞添加讨论(0) 举报

beautiful°

6楼-- · 2019-01-19 00:39

I came up with a solution which is a bit of a hack and very inefficient (10~ seconds to update 3000 records). Because this is being used to generate test data, I don't have to be concerned about speed however.

In this solution, I iterate over every row in the table and update the values one row at a time. It seems to work:

DECLARE @rows INT 
DECLARE @currentRow INT

SELECT @rows = COUNT(*) FROM dbo.MyTable
SET @currentRow = 1

WHILE @currentRow < @rows
BEGIN 

UPDATE MyTable
SET MyColumn = (SELECT TOP 1 [val] FROM @randomStuff ORDER BY NEWID())
WHERE MyPrimaryKey = (SELECT b.MyPrimaryKey
 FROM(SELECT a.MyPrimaryKey, ROW_NUMBER() OVER (ORDER BY MyPrimaryKey) AS rownumber
      FROM MyTable a) AS b
 WHERE @currentRow = b.rownumber
)

SET @currentRow = @currentRow + 1
END

0人赞添加讨论(0) 举报

再贱就再见

7楼-- · 2019-01-19 00:41

I've had a play with this, and found a rather hacky way to do it with the use of an intermediate table variable.

Once @randomStuff is set up, we do this (note in my case, @MyTable is a table variable, adjust accordingly for your normal table):

DECLARE @randomMappings TABLE (id INT, val VARCHAR(100), sorter UNIQUEIDENTIFIER)

INSERT INTO @randomMappings 
SELECT M.id, val, NEWID() AS sort 
FROM @MyTable AS M 
CROSS JOIN @randomstuff

so at this point, we have an intermediate table with every combination of (mytable id, random value), and a random sort value for each row specific to that combination. Then

DELETE others FROM @randomMappings AS others 
INNER JOIN @randomMappings AS lower 
ON (lower.id = others.id) AND (lower.sorter < others.sorter)

This is an old trick which deletes all rows for a given MyTable.id except for the one with the lower sort value -- join the table to itself where the value is smaller, and delete any where such a join succeeded. This just leaves behind the lowest value. So for each MyTable.id, we just have one (random) value left.. Then we just plug it back into the table:

UPDATE @MyTable
SET MyColumn = random.val
FROM @MyTable m, @randomMappings AS random
WHERE (random.id = m.id)

And you're done!

I said it was hacky...

0人赞添加讨论(0) 举报

How can I insert random values into a SQL Server t

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间