Question:
I have some code which delivers things based on weighted randomness: things with more weight are more likely to be randomly chosen. Now, being a good Rubyist, I of course want to cover all this code with tests, and I want to test that things are getting fetched according to the correct probabilities.
So how do I test this? Creating tests for something that should be random makes it very hard to compare actual vs. expected results. A few ideas I have, and why they won't work well:
Stub Kernel.rand in my tests to return fixed values. This is cool, but rand() gets called multiple times and I'm not sure I can rig this with enough control to test what I need to.
Fetch a random item a HUGE number of times and compare the actual ratio vs the expected ratio. But unless I can run it an infinite number of times, this will never be perfect and could intermittently fail if I get some bad luck in the RNG.
Use a consistent random seed. This makes the RNG repeatable but it still doesn't give me any verification that item A will happen 80% of the time (for example).
So what kind of approach can I use to write test coverage for random probabilities?
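
For concreteness, the kind of method under test might look something like this sketch (the name pick_weighted and the hash-of-weights format are my assumptions, not from the question):

def pick_weighted(weights)
  # weights is e.g. { a: 8, b: 2 }; heavier items are picked proportionally more often
  total = weights.values.sum
  target = Kernel.rand(total)       # explicit receiver so tests can stub Kernel.rand
  weights.each do |item, weight|
    return item if (target -= weight) < 0
  end
end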
Answer 1:
I think you should separate your goals. One is to stub Kernel.rand, as you mention. With RSpec, for example, you can do something like this:
test_values = [1, 2, 3]
Kernel.stub!(:rand).and_return( *test_values )
Note that this stub won't work unless you call rand with Kernel as the receiver. If you just call "rand" then the current "self" will receive the message, and you'll actually get a random number instead of the test_values.
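On current RSpec versions the stub! API is gone; the rspec-mocks 3 equivalent (still assuming the code calls rand with Kernel as the receiver) would be:

allow(Kernel).to receive(:rand).and_return(*test_values)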
The second goal is to do something like a field test where you actually generate random numbers. You'd then use some kind of tolerance to ensure you get close to the desired percentage. This is never going to be perfect, though, and will probably need a human to evaluate the results. But it's still useful to do, because you might realize that another random number generator would be better, such as reading from /dev/random. It's also good to have this kind of test because, say, you migrate to a new platform whose system libraries aren't as good at generating randomness, or a certain version has a bug; the test could be a warning sign.
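A field test along those lines might look like this sketch, using the hypothetical pick_weighted from the question section (the sample size and the 2% tolerance are arbitrary choices of mine):

counts = Hash.new(0)
trials = 100_000
trials.times { counts[pick_weighted({ a: 8, b: 2 })] += 1 }
expect(counts[:a].to_f / trials).to be_within(0.02).of(0.8)  # loose tolerance; can still fail on bad luck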
It really depends on your goals. Do you only want to test your weighting algorithm, or also the randomness?
Answer 2:
It's best to stub Kernel.rand to return fixed values.
Kernel.rand is not your code. You should assume it works, rather than writing tests that exercise it instead of your code. And using a fixed set of values that you've chosen and explicitly coded in is better than adding a dependency on what rand produces for a specific seed.
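Concretely, you can choose boundary values whose outcomes you can work out by hand. Against the hypothetical pick_weighted sketched earlier (rand(10) over weights { a: 8, b: 2 }), values 0 through 7 must yield :a and 8 through 9 must yield :b:

allow(Kernel).to receive(:rand).and_return(0, 7, 8, 9)
expect(pick_weighted({ a: 8, b: 2 })).to eq(:a)  # rand -> 0
expect(pick_weighted({ a: 8, b: 2 })).to eq(:a)  # rand -> 7, still :a
expect(pick_weighted({ a: 8, b: 2 })).to eq(:b)  # rand -> 8, first :b value
expect(pick_weighted({ a: 8, b: 2 })).to eq(:b)  # rand -> 9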
Answer 3:
If you want to go down the consistent-seed route, look at Kernel#srand: http://www.ruby-doc.org/core/classes/Kernel.html#M001387
To quote the docs (emphasis added):

Seeds the pseudorandom number generator to the value of number. If number is omitted or zero, seeds the generator using a combination of the time, the process id, and a sequence number. (This is also the behavior if Kernel::rand is called without previously calling srand, but without the sequence.) By setting the seed to a known value, scripts can be made deterministic during testing. The previous seed value is returned. Also see Kernel::rand.
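
A quick illustration of that determinism:

srand(1234)
first  = Array.new(5) { rand }
srand(1234)                        # reseed with the same value
second = Array.new(5) { rand }
first == second                    # => true: identical sequences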
Answer 4:
For testing, stub Kernel.rand with the following simple but perfectly reasonable LCPRNG:
# 32-bit linear congruential generator using the classic glibc constants.
def r
  @q ||= 0                                           # lazily initialized state
  @q = (1_103_515_245 * @q + 12_345) & 0xffff_ffff   # advance the state
  (@q >> 2) / 0x3fff_ffff.to_f                       # scale top 30 bits into 0.0..1.0
end
You might want to skip the division and use the integer result directly if your code is compatible, as all bits of the result would then be repeatable instead of just "most of them". This isolates your test from "improvements" to Kernel.rand and should allow you to test your distribution curve.
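To route calls through it, a stub along these lines should work (the rspec-mocks syntax is my assumption; the answer doesn't name a framework):

allow(Kernel).to receive(:rand) { r }   # each Kernel.rand call advances the LCG

Because @q starts from zero on each fresh test object, every run sees the identical sequence.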
Answer 5:
My suggestion: Combine #2 and #3. Set a random seed, then run your tests a very large number of times.
I do not like #1, because it means your test is super-tightly coupled to your implementation. If you change how you are using the output of rand(), the test will break, even if the result is correct. The point of a unit test is that you can refactor the method and rely on the test to verify that it still works.
Option #3, by itself, has the same problem as #1. If you change how you use rand(), you will get different results.
Option #2 is the only way to have a true black box solution that does not rely on knowing your internals. If you run it a sufficiently high number of times, the chance of random failure is negligible. (You can dig up a stats teacher to help you calculate "sufficiently high," or you can just pick a really big number.)
But if you're hyper-picky and "negligible" isn't good enough, a combination of #2 and #3 will ensure that once the test starts passing, it will keep passing. Even that negligible risk of failure only crops up when you touch the code under test; as long as you leave the code alone, you are guaranteed that the test will always work correctly.
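Put together, with the hypothetical pick_weighted from earlier, the combined approach might look like:

srand(20_090_527)                   # fixed seed makes the whole run repeatable
counts = Hash.new(0)
100_000.times { counts[pick_weighted({ a: 8, b: 2 })] += 1 }
expect(counts[:a].to_f / 100_000).to be_within(0.02).of(0.8)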
Answer 6:
Pretty often, when I need predictable results from something derived from a random number, I want control of the RNG, which means the easiest approach is to make it injectable. Although overriding/stubbing rand can be done, Ruby provides a clean way to pass your code an RNG seeded with some value:
def compute_random_based_value(input_value, random: Random.new)
# ....
end
and then inject a Random object I make on the spot in the test, with a known seed:
rng = Random.new(782199) # Scientific dice roll
compute_random_based_value(your_input, random: rng)
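
Fleshed out as a weighted-pick variant, the pattern might look like this (the method body is my sketch, not the answer's):

def pick_weighted(weights, random: Random.new)
  total = weights.values.sum
  target = random.rand(total)                  # draws only from the injected RNG
  weights.each { |item, w| return item if (target -= w) < 0 }
end

rng = Random.new(782199)                       # scientific dice roll
pick_weighted({ a: 8, b: 2 }, random: rng)     # identical result on every run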