Is there a scope for (numpy) random seeds?

Posted 2020-07-11 08:20

Question:

My question is related to "What is the scope of a random seed in Python?". That question clarifies that the random module holds a (hidden) global Random() instance.

1) Does setting the random seed in one module also make it the random seed in other modules, and is there anything to be aware of?

For instance, given moduleA.py and moduleB.py:

moduleA.py:

import random 
import moduleB
random.seed(my_seed)
moduleB.randomfct()

moduleB.py:

import random 
def randomfct():
    # do things using random
    pass

Does moduleB also use my_seed, or do I have to pass the seed to moduleB.py and set it again?

2) Does the order of setting the random seed / importing play any role?

For example in moduleA.py:

import random 
random.seed(my_seed)
import moduleB

3) Is this also the case for setting numpy random seeds, e.g. np.random.seed(42)?

Answer 1:

The CPython random.py implementation is very readable. I recommend having a look: https://github.com/python/cpython/blob/3.6/Lib/random.py

Anyway, that version of Python creates a global random.Random() object inside the random module and exposes its methods as module-level functions. When you call random.seed(a), you are calling the seed(a) method of that hidden object. Thus the seed state is shared across your entire program.
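The mechanism described above can be checked from the interpreter. This is a minimal sketch, assuming CPython, where the module-level functions are bound methods of the hidden instance; looking at `__self__` relies on that implementation detail and is meant for illustration only.

```python
import random

# In CPython, random.seed is a bound method; __self__ is the hidden instance.
hidden = random.seed.__self__

print(isinstance(hidden, random.Random))   # True
print(random.randint.__self__ is hidden)   # True: same object behind both names

# Seeding through the module is the same as seeding that instance directly.
random.seed(1234)
via_module = [random.randint(0, 100) for _ in range(3)]

hidden.seed(1234)
via_instance = [hidden.randint(0, 100) for _ in range(3)]

print(via_module == via_instance)          # True
```

Because every `import random` returns the same module object, every module in the program ends up talking to this one instance.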

1) Yes. moduleA and moduleB use the same seed. The first import of random creates the global random.Random() object. Importing it again in moduleB just gives you the same module, with the originally created random.Random() object.

2) Not in the example you gave, but in general it can matter. If moduleB draws random numbers before moduleA sets the seed, those draws are not governed by your seed.

3) Hard to tell, since NumPy has a much more complicated code base. That said, I would expect it to work the same way. The authors of NumPy would really have had to go out of their way to make it behave differently from the Python implementation.
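For point 3, a quick experiment suggests that np.random.seed() does seed one process-wide generator, much like random.seed(). The sketch below assumes NumPy's legacy interface (np.random.seed, np.random.random_sample, np.random.RandomState); the two draw functions are hypothetical stand-ins for code living in two different modules.

```python
import numpy as np

def draw_in_module_a():
    # Stand-in for a function defined in moduleA.py
    return np.random.random_sample(3)

def draw_in_module_b():
    # Stand-in for a function defined in moduleB.py
    return np.random.random_sample(3)

# Seed once, then let both "modules" draw from the shared global state.
np.random.seed(42)
from_a = draw_in_module_a()
from_b = draw_in_module_b()

# A private generator with the same seed reproduces the combined stream,
# which shows that both functions consumed one and the same generator.
private = np.random.RandomState(42)
expected_a = private.random_sample(3)
expected_b = private.random_sample(3)

print(np.array_equal(from_a, expected_a))  # True
print(np.array_equal(from_b, expected_b))  # True
```

Newer NumPy versions also offer np.random.default_rng(), which returns an independent generator object instead of touching global state.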

In general, if you are worried about seed state, I recommend creating your own random objects and passing them around for generating random numbers.
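A minimal sketch of that recommendation, using only the standard library. The function name sample_scores and the seed values are made up for illustration.

```python
import random

def sample_scores(rng, n):
    """Draw n integers from the generator that the caller passes in."""
    return [rng.randint(0, 100) for _ in range(n)]

# Two private generators with the same seed, independent of the global one.
rng_a = random.Random(42)
rng_b = random.Random(42)

first = sample_scores(rng_a, 5)

# Disturb the global state; the private generators are unaffected.
random.seed(0)
random.random()

second = sample_scores(rng_b, 5)

print(first == second)  # True
```

With this design, a function's randomness depends only on the generator it receives, not on what other modules did to the global seed.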



Answer 2:

To answer your first question:

Does moduleB also use my_seed, or do I have to pass the seed to moduleB.py and set it again?

Yes, it does. For example, I ran the following:

ModuleA:

import moduleb
import random
random.seed(0)
moduleb.my_random()

ModuleB:

import random
def my_random():
    print(random.randint(0,5))

This prints 3 on every run, because the seed is set. As a general rule, the main Python module (the one you run) should call random.seed(); the seeded state is then shared by all imported modules. It only changes if random.seed() is explicitly called again, from any module.

For question 2:

Does the order of setting the random seed / importing play any role?

No, it doesn't, unless you call a random function before setting the seed.
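The exception matters most when an imported module draws random numbers at import time. The sketch below simulates that case without separate files: MODULE_B_SOURCE and fake_import are hypothetical helpers that execute the source in a fresh module object, roughly as an import would.

```python
import random
import types

# Hypothetical stand-in for moduleB: it draws a number while being imported.
MODULE_B_SOURCE = """
import random
IMPORT_TIME_VALUE = random.random()   # runs during import

def randomfct():
    return random.random()            # runs only when called
"""

def fake_import():
    """Execute the source in a fresh module object, roughly like an import."""
    module = types.ModuleType("moduleB")
    exec(MODULE_B_SOURCE, module.__dict__)
    return module

expected_first_draw = random.Random(7).random()

# Seed first, then import: the import-time draw follows the seed.
random.seed(7)
seeded_before = fake_import().IMPORT_TIME_VALUE

# Import first, then seed: the import-time draw happened before the seed,
# but calls made after seeding are still governed by it.
late = fake_import()
random.seed(7)
called_after_seed = late.randomfct()

print(seeded_before == expected_first_draw)           # True
print(called_after_seed == expected_first_draw)       # True
print(late.IMPORT_TIME_VALUE == expected_first_draw)  # False
```

So the import order itself is harmless; what matters is whether any draw happens before the seed is set.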

For question 3:

Is this also the case for setting numpy random seeds, e.g. np.random.seed(42)?

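np.random.seed() also seeds a single global generator that is shared by every module importing numpy, so in that respect it behaves like random.seed(). The issue is that relying on this shared global state is not thread safe: when several threads draw from it, the order of the draws, and therefore the results, are no longer reproducible. More details can be found in this Stack Overflow answer.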



Answer 3:

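In a Jupyter notebook, np.random.seed can seem to have cell scope. For instance, np.random.seed(1) has to be called in both of two consecutive cells to get the same result with the following code. The generator state is in fact shared by the whole kernel; the second cell needs its own call only because the draw in the first cell has already advanced that state.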

Cell 1:

np.random.seed(1)
np.random.random_sample(4)

Cell 2:

np.random.seed(1)
np.random.random(4)
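The two cells can be replayed in a plain script to see what is going on. This is a sketch assuming NumPy's legacy global generator; the comments mark where the notebook cells would begin.

```python
import numpy as np

# "Cell 1"
np.random.seed(1)
cell_1 = np.random.random_sample(4)

# "Cell 2" without reseeding: it continues the stream where cell 1 stopped.
cell_2_no_reseed = np.random.random(4)

# "Cell 2" with reseeding: the stream starts over.
np.random.seed(1)
cell_2_reseeded = np.random.random(4)

print(np.array_equal(cell_1, cell_2_reseeded))   # True
print(np.array_equal(cell_1, cell_2_no_reseed))  # False

# One draw of 8 values after a single seed call equals both cells back to back,
# which shows that the state carries over from one cell to the next.
np.random.seed(1)
both = np.random.random_sample(8)
print(np.array_equal(both, np.concatenate([cell_1, cell_2_no_reseed])))  # True
```

In other words, the seed is not scoped to a cell; each draw simply moves the shared generator forward.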