I'd like to prove that a GUID is not unique in a simple test program. I expected the following code to run for hours, but it's not working. How can I make it work?
BigInteger begin = new BigInteger((long)0);
BigInteger end = new BigInteger("340282366920938463463374607431768211456",10); //2^128
for(begin; begin<end; begin++)
Console.WriteLine(System.Guid.NewGuid().ToString());
I'm using C#.
I don't understand why no one has mentioned upgrading your graphics card... Surely if you got a high-end NVIDIA Quadro FX 4800 or something (192 CUDA cores) this would go faster...
Of course if you could afford a few NVIDIA Qadro Plex 2200 S4s (at 960 CUDA cores each), this calculation would really scream. Perhaps NVIDIA would be willing to loan you a few for a "Technology Demonstration" as a PR stunt?
Surely they'd want to be part of this historic calculation...
The program, albeit its errors, shows proof that a GUID is not unique. Those that try to prove the contrary are missing the point. This statement just proves the weak implementation of some of the GUID variations.
A GUID is not necessary unique by definition, it is highly unique by definition. You just refined the meaning of highly. Depending on the version, the implementator (MS or others), use of VM's, etc your definition of highly changes. (see link in earlier post)
You can shorten your 128 bit table to prove your point. The best solution is to use a hash formula to shorten your table with duplicates, and then use the full value once the hash collides and based on that re-generate a GUID. If running from different locations, you would be storing your hash/full key pairs in a central location.
Ps: If the goal is just to generate x number of different values, create a hash table of this width and just check on the hash value.
This will run for a lot more than hours. Assuming it loops at 1 GHz (which it won't - it will be a lot slower than that), it will run for 10790283070806014188970 years. Which is about 83 billion times longer than the age of the universe.
Assuming Moores law holds, it would be a lot quicker to not run this program, wait several hundred years and run it on a computer that is billions of times faster. In fact, any program that takes longer to run than it takes CPU speeds to double (about 18 months) will complete sooner if you wait until the CPU speeds have increased and buy a new CPU before running it (unless you write it so that it can be suspended and resumed on new hardware).
You can show that in O(1) time with a variant of the quantum bogosort algorithm.
Presumably you have reason to be believe that the algorithm for producing Guids is not producing truly random numbers, but is in fact cycling with a period << 2^128.
e.g. RFC4122 method used to derive GUIDs which fixes the values of some bits.
Proof of cycling is going to depend upon the possible size of the period.
For small periods, hash table of hash(GUID) -> GUID with replacement on collision if GUIDs do not match (terminate if they do) might be an approach. Consider also only doing the replacement a random fraction of the time.
Ultimately if the maximum period between collisions is large enough (and isn't known in advance) any method is only going to yield a probability that the collision would be found if it existed.
Note that if the method of generating Guids is clock based (see the RFC), then it may not be possible to determine if collisions exist because either (a) you won't be able to wait long enough for the clock to wrap round, or (b) you can't request enough Guids within a clock tick to force a collision.
Alternatively you might be able to show a statistical relationship between the bits in the Guid, or a correlation of bits between Guids. Such a relationship might make it highly probable that the algorithm is flawed without necessarily being able to find an actual collision.
Of course, if you just want to prove that Guids can collide, then a mathematical proof, not a program, is the answer.
If the number of UUID being generated follows Moore's law, the impression of never running out of GUID in the foreseeable future is false.
With 2 ^ 128 UUIDs, it will only take 18 months * Log2(2^128) ~= 192 years, before we run out of all UUIDs.
And I believe (with no statistical proof what-so-ever) in the past few years since mass adoption of UUID, the speed we are generating UUID is increasing way faster than Moore's law dictates. In other words, we probably have less than 192 years until we have to deal with UUID crisis, that's a lot sooner than end of universe.
But since we definitely won't be running them out by the end of 2012, we'll leave it to other species to worry about the problem.