Can someone provide an explanation as to how modern programming languages (Java, C#, Python, JavaScript) cope with the limitations of randomness, and where those limitations (time-based seeds, for example) originate? I.e., are they imposed by the underlying operating systems and Intel-based hardware?
Basically I'd like to understand why there is no such thing as a truly random number without the appropriate hardware.
I'm going to answer the second part of your question first:
Basically I'd like to understand why there is no such thing as a truly random number without the appropriate hardware.
You can't generate truly random numbers on a computer without special hardware because computers are deterministic machines. What this means is that, given some initial state and an operation to perform, you can predict exactly how the machine will evolve. For instance, if you know that, on some hypothetical architecture, register `%d0` contains `24` and register `%d1` contains `42`, and you know that the next instruction in the instruction stream is `add %d0 %d1 %d2`, you then know that, after that instruction is executed, `%d2` will contain `66`. In a higher-level language, you know that writing `x = 1; y = 2; z = x + y` will result in `z` being `3` with certainty.
This makes sense; we don't want to wonder what an addition will do, we want it to add. However, this is incompatible with generating truly random numbers. For a number to be truly random, there needs to be absolutely no way to predict it, no matter what you know. Certain quantum-mechanical processes have this behavior precisely, and other natural processes are close enough to random that, for all practical purposes, they are (for instance, if they look random and predicting them would require knowing the state of every molecule in the atmosphere). However, computers cannot do this, because the whole point of having a computer is to have a machine which deterministically executes code. You need to be able to predict what will happen when you run programs, else what's the point?
In a comment to Milan Ramaiya's answer, you said
I agree with [yo]u but still missing the most important thing - why cant computers produce a random number with pre-determined input?
The answer falls out directly from the definition of a truly random number. Since a truly random number needs to be completely unpredictable, it can never depend on deterministic input. If you have an algorithm which takes pre-determined input and uses it to produce a pseudo-random number, you can duplicate this process at will just as long as you know the input and algorithm.
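For instance, here's a minimal sketch in Python (the same idea applies in Java, C#, or JavaScript): if you know the seed (the pre-determined input) and the algorithm (here, Python's built-in Mersenne Twister), you can duplicate the entire "random" sequence at will.

```python
import random

# Two independent generators: same algorithm (Mersenne Twister),
# same pre-determined input (the seed 1234).
a = random.Random(1234)
b = random.Random(1234)

print([a.randint(1, 100) for _ in range(5)])
print([b.randint(1, 100) for _ in range(5)])
# Both lines print the identical sequence: knowing the seed and the
# algorithm is enough to predict every "random" number in advance.
```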
You also asked
Can someone provide an explanation as to how modern programming languages … cope with the limitations of randomness and where those limitations … originate.
Well, as mentioned above, the limitations are inherent to the deterministic design of our languages and machines, which are there for good reasons (so that said languages and machines are usable :-) ). Assuming you aren't calling out to something which does have access to truly random numbers (such as `/dev/random` on systems where it exists), the approach taken is to use a pseudo-random number generator. These algorithms are designed to produce a statistically random output sequence: one which, in a formal sense, looks unpredictable. I don't know enough statistics to explain or understand the details of this, but I believe the idea is that there are certain numeric tests you can run to tell how well your data predicts itself (in some loose sense) and things like that. However, the important point is that, while the sequence is deterministic, it "looks random". For many purposes, this is enough! And sometimes it has advantages: if you want to test code, for instance, it can be nice to be able to specify a seed and always have it receive the same sequence of pseudo-random numbers.
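As an aside on that escape hatch: most languages can reach the operating system's entropy source directly. A minimal sketch in Python (`os.urandom` is backed by `/dev/urandom` on Linux and the system CSPRNG on Windows):

```python
import os
import secrets

# Ask the OS for entropy directly, bypassing the seeded PRNG.
raw = os.urandom(16)  # 16 hard-to-predict bytes from the OS pool
print(raw.hex())

# The secrets module wraps the same source with a friendlier API,
# intended for security-sensitive values like tokens and keys.
print(secrets.token_hex(16))
```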
In summary, the overall answer to your question is this: Because we want to be able to predict what computers do, they can't generate unpredictable numbers (without special hardware). Programming languages aren't generally too impacted by this, because pseudo-random number generators are sufficient for most cases.
Software is deterministic by design. So the way random numbers are typically generated is by using a formula that spits out numbers in a statistically random-looking order. This way, any program that needs a uniform distribution of numbers can set a seed based on some physical data (e.g., a timestamp) and get what looks like a random set of numbers. However, given a specific set of inputs, software will always behave in the same manner.
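A sketch of that timestamp idea in Python (note that `random.seed()` with no arguments already prefers OS entropy, falling back to the clock; seeding from the time explicitly looks like this):

```python
import random
import time

# Seed the PRNG from physical data: the current timestamp.
# The same seed always produces the same "random" numbers.
random.seed(time.time())
print([random.random() for _ in range(3)])
```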
To have true randomness, there needs to be input which is nondeterministic.
Quoting Wikipedia,
To generate truly random numbers requires precise, accurate, and repeatable system measurements of absolutely non-deterministic processes. The open source operating system Linux uses, for example, various system timings (like user keystrokes, I/O, or least-significant digit voltage measurements) to produce a pool of random numbers. It attempts to constantly replenish the pool, depending on the level of importance, and so will issue a random number. This system is an example, and similar to those of dedicated hardware random number generators.
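On a Linux machine you can read from that pool directly; a minimal, Linux-specific sketch (note that `/dev/random` may block while the kernel gathers more entropy):

```python
# Read 8 bytes straight from the kernel's entropy pool (Linux-only).
# /dev/random may block when entropy is low; /dev/urandom never blocks.
with open("/dev/random", "rb") as f:
    data = f.read(8)

print(int.from_bytes(data, "big"))  # an unpredictable 64-bit number
```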
Systems are designed to be predictable and deterministic; nobody wants chaotic computers, because people need to be able to program them.
Predictable systems can't produce truly random numbers, only predictable numbers.
Computers generate random numbers by taking them from a long, pre-determined sequence of values. Using a seed value helps to create different results every time the program is run, but isn't a fix-all, because the sequence itself is fixed: the seed only changes the start position within that sequence. Computers are, obviously, very rigid in how they do things, and they can't do something truly random due to the limitations of how they are made. Sites like random.org create random numbers from external sources like atmospheric radio noise. Maybe computers should take the noise from the power supply and use that as a truly random base? :-P
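To make the "fixed sequence" point concrete, here is a toy linear congruential generator in Python (parameters deliberately tiny and purely illustrative; real generators use far larger state): every seed just enters the same fixed cycle at a different point.

```python
def lcg(seed, m=16, a=5, c=1):
    """Toy linear congruential generator: x -> (a*x + c) % m.
    The tiny modulus makes the full cycle visible."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(seed=3)
print([next(gen) for _ in range(16)])
# Prints all 16 possible values once, then the cycle repeats forever.
# A different seed yields the same cycle, entered at a different point.
```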
Software random number generation has two basic steps:
- generate a pseudo-random number
- map that pseudo-random number into a more useful range (0 to 1, 1 to 100, etc.)

A common problem with software random number generators is that their output always loops. The loop is a fixed cycle of numbers (the algorithm can't generate anything outside it). If the algorithm is good, that cycle is a very, very big set of numbers; but if the algorithm is bad, the set may be too small.
The generated numbers are then mapped into a range such as 1 to 100 or 0 to 1 (for example) so that they are useful to your program. Because the generator's raw range usually isn't an exact multiple of the target range, the mapping makes some numbers come up more often than others, as the sketch below shows.
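A minimal sketch of that bias, using the naive `value % 100` mapping (good libraries avoid this with rejection sampling or similar tricks):

```python
from collections import Counter

# Pretend our generator produces one byte: values 0..255, uniformly.
# Map them naively onto 1..100 with the modulo operator.
counts = Counter((v % 100) + 1 for v in range(256))

# 256 is not a multiple of 100, so outcomes 1..56 each occur 3 times
# while outcomes 57..100 each occur only 2 times: a built-in bias.
print(counts[1], counts[60])  # -> 3 2
```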