How to sample rows in MySQL using RAND(seed)?

2019-07-04 06:52发布

问题:

I need to fetch a repeatable random set of rows from a table using MySQL. I implemented this using the MySQL RAND function using the bigint primary key of the row as the seed. Interestingly this produces numbers that don't look random at all. Can anyone tell me whats going on here and how to get it to work properly?

select id from foo where rand(id) < 0.05 order by id desc limit 100

In one example out of 600 rows not a single one was returned. I change the select to include "id, rand(id)" and get rid of the rand clause in the where this is what I got:

| 163345 |  0.315191733944408 |
| 163343 |  0.814825518815616 |
| 163337 |  0.313726862253367 |
| 163334 |  0.563177533972242 |
| 163333 |  0.312994424545201 |
| 163329 |  0.312261986837035 |
| 163327 |  0.811895771708242 |
| 163322 |  0.560980224573035 |
| 163321 |  0.310797115145994 |
| 163319 |  0.810430896291911 |
| 163318 |  0.560247786864869 |
| 163317 |  0.310064677437828 |

Look how many 0.31xxx lines there are. Not at all random.

PS: I know this is slow but in my app the where clause limits the number of rows to a few 1000.

回答1:

Use the same seed for all the rows to do that, like:

select id from foo where rand(42) < 0.05 order by id desc limit 100

See the rand() docs for why it works that way. Change the seed if you want another set of values.



回答2:

Multiply the decimal number returned by id:

select id from foo where rand() * id < 5 order by id desc limit 100


标签: mysql random