set.seed with R 2.15.2

Posted 2019-04-27 17:53

My understanding is that using set.seed ensures reproducibility, but that does not seem to be the case with the following code in R 2.15.2. Am I missing something here?

set.seed(12345)
rnorm(5)
[1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875
rnorm(5)
[1] -1.8179560  0.6300986 -0.2761841 -0.2841597 -0.9193220

3 answers
可以哭但决不认输i
#2 · 2019-04-27 18:51

Any call that uses the random number generator advances its internal state (stored in .Random.seed), even if you set that state manually with set.seed.

set.seed(1)
x <- .Random.seed # get the current seed
runif(10) # uses random number generator, so changes current seed
y <- .Random.seed
identical(x, y) # FALSE

As @StephanKolassa demonstrates, you would have to reset the seed before each use of the random number generator to guarantee that it starts from the same state each time.
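The same state change can be seen from the other direction: putting a saved copy of .Random.seed back makes the generator repeat itself. This is only a minimal sketch; the variable names are made up for illustration, and assigning into the global environment is one way to restore the state, not the only one.

```r
set.seed(1)
saved_state <- .Random.seed   # snapshot of the generator state
first <- runif(3)             # advances the state

# Restore the snapshot; .Random.seed lives in the global environment.
assign(".Random.seed", saved_state, envir = globalenv())
second <- runif(3)            # starts again from the saved state

identical(first, second)      # TRUE
```

In everyday use, calling set.seed again is simpler than handling .Random.seed directly.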

倾城 Initia
#3 · 2019-04-27 18:53

set.seed() reinitializes the random number generator, so calling it again with the same value restarts the same sequence.

set.seed(12345)
rnorm(5)
[1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875

set.seed(12345)
rnorm(5)
[1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875

set.seed(12345)
rnorm(5)
[1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875
我只想做你的唯一
#4 · 2019-04-27 18:58

It's worth underlining that the sequence of numbers is still reproducible each time you set the seed, because of this reinitialisation.

So although each subsequent call to, e.g., rnorm returns different values, you still get the same sequence of numbers from the point at which the seed was set.

E.g., per the original question:

set.seed(12345)
rnorm(5)
[1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875
rnorm(5)
[1] -1.8179560  0.6300986 -0.2761841 -0.2841597 -0.9193220

Produces the same sequence of 10 numbers as:

set.seed(12345)
rnorm(10)
 [1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875 -1.8179560
 [7]  0.6300986 -0.2761841 -0.2841597 -0.9193220

Or

set.seed(12345)
rnorm(7)
[1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875 -1.8179560  0.6300986
rnorm(3)
[1] -0.2761841 -0.2841597 -0.9193220

Or any other series of calls to rnorm that draws 10 numbers in total.
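The equivalence can also be checked programmatically rather than by eye. A small sketch, assuming R's default generator settings; the variable names are just for illustration:

```r
set.seed(12345)
one_call <- rnorm(10)

set.seed(12345)
two_calls <- c(rnorm(5), rnorm(5))

set.seed(12345)
uneven_calls <- c(rnorm(7), rnorm(3))

identical(one_call, two_calls)     # TRUE
identical(one_call, uneven_calls)  # TRUE
```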

The point is that if you set the seed once at the start of a script, you get the same random numbers every time you run the whole script, even though each individual call to a random number function within it returns different values. This is because every run follows the same sequence from the seed set at the start. That is usually what you want: for a reproducible script, setting the seed once at the beginning is enough.
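As a rough illustration, the "script" can be stood in for by a function that sets the seed once and then draws twice. The function run_script and its contents are hypothetical, not taken from the question:

```r
# Hypothetical stand-in for a whole script: seed once, then draw twice.
run_script <- function() {
  set.seed(12345)        # set once, at the top
  sample_a <- rnorm(5)
  sample_b <- rnorm(5)   # continues the sequence, so it differs from sample_a
  list(a = sample_a, b = sample_b)
}

run1 <- run_script()
run2 <- run_script()

identical(run1, run2)      # TRUE: the whole run is reproducible
identical(run1$a, run1$b)  # FALSE: calls within a run still differ
```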
