How unique is UUID?

2018-12-31 12:29发布

How safe is it to use UUID to uniquely identify something (I'm using it for files uploaded to the server)? As I understand it, it is based off random numbers. However, it seems to me that given enough time, it would eventually repeat it self, just by pure chance. Is there a better system or a pattern of some type to alleviate this issue?

10条回答
妖精总统
2楼-- · 2018-12-31 13:27

I concur with the other answers. UUIDs are safe enough for nearly all practical purposes1, and certainly for yours.

But suppose (hypothetically) that they aren't.

Is there a better system or a pattern of some type to alleviate this issue?

Here are a couple of approaches:

  1. Use a bigger UUID. For instance, instead of a 128 random bits, use 256 or 512 or ... Each bit you add to a type-4 style UUID will reduce the probability of a collision by a half, assuming that you have a reliable source of entropy2.

  2. Build a centralized or distributed service that generates UUIDs and records each and every one it ever issues. Each time it generates a new one, it checks that the UUID has never been issued before. Such a service would be technically straight-forward to implement (I think) if we assumed that the people running the service were absolutely trustworthy, incorruptible, etcetera. Unfortunately, they aren't ... especially when there is the possibility of governments interfering. So, this approach is probably impractical, and may be3 impossible in the real world.


1 - If uniqueness of UUIDs determined whether nuclear missiles got launched at your country's capital city, a lot of your fellow citizens would not be convinced by "the probability is extremely low". Hence my "nearly all" qualification.

2 - And here's a philosophical question for you. Is anything ever truly random? How would we know if it wasn't? Is the universe as we know it a simulation? Is there a God who might conceivably "tweak" the laws of physics to alter an outcome?

3 - If anyone knows of any research papers on this problem, please comment.

查看更多
倾城一夜雪
3楼-- · 2018-12-31 13:28

The answer to this may depend largely on the UUID version.

Many UUID generators use a version 4 random number. However, many of these use Pseudo a Random Number Generator to generate them.

If a poorly seeded PRNG with a small period is used to generate the UUID I would say it's not very safe at all.

Therefore, it's only as safe as the algorithms used to generate it.

On the flip side, if you know the answer to these questions then I think a version 4 uuid should be very safe to use. In fact I'm using it to identify blocks on a network block file system and so far have not had a clash.

In my case, the PRNG I'm using is a mersenne twister and I'm being careful with the way it's seeded which is from multiple sources including /dev/urandom. Mersenne twister has a period of 2^19937 − 1. It's going to be a very very long time before I see a repeat uuid.

查看更多
步步皆殇っ
4楼-- · 2018-12-31 13:29

I don't know if this matters to you, but keep in mind that GUIDs are globally unique, but substrings of GUIDs aren't.

查看更多
永恒的永恒
5楼-- · 2018-12-31 13:34

If by "given enough time" you mean 100 years and you're creating them at a rate of a billion a second, then yes, you have a 50% chance of having a collision after 100 years.

查看更多
登录 后发表回答