Simple proof that GUID is not unique [closed]

2018-12-31 14:38发布

I'd like to prove that a GUID is not unique in a simple test program. I expected the following code to run for hours, but it's not working. How can I make it work?

BigInteger begin = new BigInteger((long)0);
BigInteger end = new BigInteger("340282366920938463463374607431768211456",10);  //2^128
for(begin; begin<end; begin++)
  Console.WriteLine(System.Guid.NewGuid().ToString());

I'm using C#.

标签: c# guid
30条回答
呛了眼睛熬了心
2楼-- · 2018-12-31 15:15
for(begin; begin<end; begin)
    Console.WriteLine(System.Guid.NewGuid().ToString());

You aren't incrementing begin so the condition begin < end is always true.

查看更多
裙下三千臣
3楼-- · 2018-12-31 15:15

GUIDs are 124 bits because 4 bits hold the version number.

查看更多
骚的不知所云
4楼-- · 2018-12-31 15:18

You could hash the GUIDs. That way, you should get a result much faster.

Oh, of course, running multiple threads at the same time is also a good idea, that way you'll increase the chance of a race condition generating the same GUID twice on different threads.

查看更多
春风洒进眼中
5楼-- · 2018-12-31 15:19

[Update:] As the comments below point out, newer MS GUIDs are V4 and do not use the MAC address as part of the GUID generation (I haven't seen any indication of a V5 implementation from MS though, so if anyone has a link confirming that let me know). WIth V4 though, time is still a factor though, and the odds against duplication of GUIDs remains so small as to be irrelevant for any practical usage. You certainly would not be likely to ever generate a duplicate GUID from just a single system test such as the OP was trying to do.

Most of these answers are missing one vital point about Microsoft's GUID implementation. The first part of the GUID is based on a timestamp and another part is based on the MAC address of the network card (or a random number if no NIC is installed).

If I understand this correctly, it means that the only reliable way to duplicate a GUID would be to run simultainous GUID generations on multiple machines where the MAC addresses were the same AND where the clocks on both systems were at the same exact time when the generation occured (the timestamp is based on milliseconds if I understand it correctly).... even then there are a lot of other bits in the number that are random, so the odds are still vanishingly small.

For all practical purposes the GUIDs are universally unique.

There is a pretty good description of the MS GUID over at "The Old New Thing" blog

查看更多
不再属于我。
6楼-- · 2018-12-31 15:19
  1. Go to the cryogenics lab in the New York City.
  2. Freeze yourself for (roughly) 1990 years.
  3. Get a job at Planet Express.
  4. Buy a brand-new CPU. Build a computer, run the program, and place it in the safe place with an pseudo-perpetual motion machine like the doomsday machine.
  5. Wait until the time machine is invented.
  6. Jump to the future using the time machine. If you bought 1YHz 128bit CPU, go to 3,938,453,320 days 20 hours 15 minutes 38 seconds 463 ms 463 μs 374 ns 607 ps after when you started to run the program.
  7. ...?
  8. PROFIT!!!

... It takes at least 10,783,127 years even if you had 1YHz CPU which is 1,000,000,000,000,000 (or 1,125,899,906,842,624 if you prefer to use binary prefix) times faster than 1GHz CPU.

So rather than waiting for the compute finished, it would be better to feed pigeons which lost their home because other n pigeons took their home. :(

Or, you can wait until 128-bit quantum computer is invented. Then you may prove that GUID is not unique, by using your program in reasonable time(maybe).

查看更多
旧时光的记忆
7楼-- · 2018-12-31 15:21

Of course GUIDs can collide. Since GUIDs are 128-bits, just generate 2^128 + 1 of them and by the pigeonhole principle there must be a collision.

But when we say that a GUID is a unique, what we really mean is that the key space is so large that it is practically impossible to accidentally generate the same GUID twice (assuming that we are generating GUIDs randomly).

If you generate a sequence of n GUIDs randomly, then the probability of at least one collision is approximately p(n) = 1 - exp(-n^2 / 2 * 2^128) (this is the birthday problem with the number of possible birthdays being 2^128).

   n     p(n)
2^30 1.69e-21
2^40 1.77e-15
2^50 1.86e-10
2^60 1.95e-03

To make these numbers concrete, 2^60 = 1.15e+18. So, if you generate one billion GUIDs per second, it will take you 36 years to generate 2^60 random GUIDs and even then the probability that you have a collision is still 1.95e-03. You're more likely to be murdered at some point in your life (4.76e-03) than you are to find a collision over the next 36 years. Good luck.

查看更多
登录 后发表回答