Unique key generation

Posted 2019-01-11 14:38

Question:

I am looking for a way, specifically in PHP, that is guaranteed to always give me a unique key.

I have done the following:

strtolower(substr(crypt(time()), 0, 7));

But I have found that once in a while I end up with a duplicate key (rarely, but often enough).

I have also thought of doing:

strtolower(substr(crypt(uniqid(rand(), true)), 0, 7));

But according to the PHP website, uniqid() could generate the same key if it is called twice in the same microsecond. I'm thinking that with the addition of rand() it rarely would, but it is still possible.

After the lines mentioned above I also remove characters such as L and O so the key is less confusing for the user. This may be part of the cause of the duplicates, but it is still necessary.

One option I have thought of is creating a website that will generate the key and store it in a database, ensuring it's completely unique.

Any other thoughts? Are there any websites out there that already do this and have some kind of API or just return the key? I found http://userident.com but I'm not sure if the keys will be completely unique.

This needs to run in the background without any user input.

Answer 1:

There are only 3 ways to generate unique values, whether they be passwords, user IDs, etc.:

  1. Use an effective GUID generator - these are long and cannot be shrunk. If you only use part you FAIL.
  2. At least part of the number is sequentially generated off of a single sequence. You can add fluff or encoding to make it look less sequential. The advantage is that they start short; the disadvantage is that they require a single source. The workaround for the single-source limitation is to have numbered sources, so you include the [source #] + [seq #] and then each source can generate its own sequence.
  3. Generate them via some other means and then check them against the single history of previously generated values.
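
As a rough illustration of the [source #] + [seq #] idea in method 2, here is a minimal sketch. The function name sourcedKey(), the 8-bit source / 32-bit sequence split, and the use of base_convert() for the final encoding are all assumptions made for the example, and it assumes 64-bit PHP. In practice each source would take its sequence number from its own persistent counter.

```php
<?php
// Hypothetical helper: pack a source number and that source's sequence
// number into one 40-bit integer, then print it as 8 base-32 characters.
function sourcedKey(int $sourceId, int $sequence): string
{
    if ($sourceId < 0 || $sourceId > 0xFF) {
        throw new InvalidArgumentException('source id must fit in 8 bits');
    }
    if ($sequence < 0 || $sequence > 0xFFFFFFFF) {
        throw new InvalidArgumentException('sequence must fit in 32 bits');
    }

    // High 8 bits identify the source, low 32 bits are its own counter.
    $packed = ($sourceId << 32) | $sequence;

    // base_convert() uses the digits 0-9 and a-v for base 32.
    return str_pad(base_convert((string) $packed, 10, 32), 8, '0', STR_PAD_LEFT);
}

echo sourcedKey(1, 1), PHP_EOL; // source 1, first key
echo sourcedKey(2, 1), PHP_EOL; // source 2, first key (differs from the line above)
```

Because the source number occupies its own bits, two sources can never produce the same key even when their sequence numbers match.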

Any other method is not guaranteed. Keep in mind, fundamentally you are generating a binary number (it is a computer), but then you can encode it in Hexadecimal, Decimal, Base64, or a word list. Pick an encoding that fits your usage. Usually for user entered data you want some variation of Base32 (which you hinted at).

Note about GUIDs: They gain their strength of uniqueness from their length and the method used to generate them. Anything less than 128 bits is not secure. Beyond random number generation there are characteristics that go into a GUID to make it more unique. Keep in mind they are only practically unique, not completely unique. A duplicate is possible in theory, although practically impossible.

Updated note about GUIDs: Since writing this I learned that many GUID generators use a cryptographically secure random number generator (difficult or impossible to predict the next number generated, and not likely to repeat). There are actually 5 different UUID algorithms. Algorithm 4 is what Microsoft currently uses for the Windows GUID generation API. A GUID is Microsoft's implementation of the UUID standard.
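
For reference, a version 4 (random) UUID can be produced in PHP roughly as follows. This is a minimal sketch, assuming PHP 7 or later for random_bytes(); the function name uuidV4() is made up for the example and is not part of PHP.

```php
<?php
// Hypothetical helper: build a version 4 UUID from 16 random bytes.
function uuidV4(): string
{
    $bytes = random_bytes(16);                        // 128 bits from the system CSPRNG
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);  // version field = 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);  // variant bits = 10xxxxxx
    $hex = bin2hex($bytes);                           // 32 hex characters

    // Group as 8-4-4-4-12, the usual textual form.
    return vsprintf('%s-%s-%s-%s-%s', [
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12),
    ]);
}

echo uuidV4(), PHP_EOL;
```

Six of the 128 bits are fixed by the version and variant fields, so 122 bits are random. As noted above, truncating the result gives up the uniqueness that the length provides.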

Update: If you want 7 to 16 characters then you need to use either method 2 or 3.

Bottom line: Frankly there is no such thing as completely unique. Even if you went with a sequential generator you would eventually run out of storage using all the atoms in the universe, thus looping back on yourself and repeating. Your only hope would be the heat death of the universe before reaching that point.

Even the best random number generator has a chance of repeating that is set by the size of the random number you are generating. Take a quarter for example. It is a completely random bit generator, and its odds of repeating are 1 in 2.

So it all comes down to your threshold of uniqueness. You can have 100% uniqueness in 8 digits for 1,099,511,627,776 numbers by using a sequence and then base32 encoding it. Any other method that does not involve checking against a list of past numbers only has odds equal to n/1,099,511,627,776 (where n=number of previous numbers generated) of not being unique.
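
To make the 8-character figure concrete: 32^8 = 2^40 = 1,099,511,627,776, so every value of a 40-bit counter maps to exactly one 8-character Base32 string. Below is a minimal sketch, assuming 64-bit PHP. The function name encodeSequence() is hypothetical, and the alphabet is Crockford's Base32 (it has no I, L, O or U), which is one possible choice rather than anything the answer prescribes. The counter itself would have to come from a single source such as a database auto-increment column.

```php
<?php
// Hypothetical helper: encode a sequence number as fixed-length Base32.
function encodeSequence(int $n, int $length = 8): string
{
    $alphabet = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'; // 32 symbols, no I, L, O, U

    if ($n < 0 || $n >= 32 ** $length) {
        throw new InvalidArgumentException('sequence value does not fit in the key length');
    }

    $key = '';
    for ($i = 0; $i < $length; $i++) {
        $key = $alphabet[$n & 31] . $key; // lowest 5 bits select one symbol
        $n >>= 5;                         // move on to the next 5 bits
    }
    return $key;
}

echo encodeSequence(0), PHP_EOL;             // 00000000
echo encodeSequence(32), PHP_EOL;            // 00000010
echo encodeSequence(1099511627775), PHP_EOL; // ZZZZZZZZ
```

The mapping is one-to-one, so the keys are unique for as long as the counter never repeats. They are also predictable, which is where the "fluff or encoding" mentioned in method 2 would come in.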



Answer 2:

Any algorithm will result in duplicates.

Therefore, might I suggest that you use your existing algorithm* and simply check for duplicates?

*Slight addition: If uniqid() can be non-unique based on time, also include a global counter that you increment after every invocation. That way something is different even in the same microsecond.
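
A generate-then-check loop along these lines might look like the sketch below. The names randomKey() and uniqueKey() are hypothetical, the alphabet is an assumption (lower case without i, l, o, and without the digits 0 and 1), and the $issued array is only an in-memory stand-in for whatever store holds the history of keys. It assumes PHP 7 or later for random_int().

```php
<?php
// Hypothetical helper: draw a random key from an unambiguous alphabet.
function randomKey(int $length = 7): string
{
    $alphabet = 'abcdefghjkmnpqrstuvwxyz23456789';
    $max = strlen($alphabet) - 1;
    $key = '';
    for ($i = 0; $i < $length; $i++) {
        $key .= $alphabet[random_int(0, $max)];
    }
    return $key;
}

// Hypothetical helper: retry until the candidate is not in the history.
function uniqueKey(array &$issued, int $length = 7, int $maxAttempts = 100): string
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $key = randomKey($length);
        if (!isset($issued[$key])) {
            $issued[$key] = true; // record the key so it is never handed out again
            return $key;
        }
    }
    throw new RuntimeException('could not find an unused key');
}

$issued = [];
echo uniqueKey($issued), PHP_EOL;
echo uniqueKey($issued), PHP_EOL;
```

With several processes generating keys at once, the check and the insert have to happen atomically. A unique constraint on the database column, plus a retry when the insert fails, is one common way to get that.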



Answer 3:

Without writing the code, my logic would be:

Generate a random string from whatever acceptable characters you like.
Then add half the date stamp (partial seconds and all) to the front and the other half to the end (or somewhere in the middle if you prefer).
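
One way that logic could be sketched in PHP, assuming a microsecond timestamp written as a digit string. The function name timestampWrappedKey() and the alphabet are made up for the example. Note that this only lowers the chance of a duplicate: two calls in the same microsecond still depend on the random part differing.

```php
<?php
// Hypothetical helper: random string wrapped in the two halves of a timestamp.
function timestampWrappedKey(int $randomLength = 8): string
{
    // microtime() returns "0.uuuuuuuu ssssssssss": fraction first, then seconds.
    [$usec, $sec] = explode(' ', microtime());
    $stamp = $sec . substr($usec, 2, 6); // seconds followed by 6 fractional digits
    $half = intdiv(strlen($stamp), 2);

    $alphabet = 'abcdefghjkmnpqrstuvwxyz'; // no i, l or o
    $max = strlen($alphabet) - 1;
    $random = '';
    for ($i = 0; $i < $randomLength; $i++) {
        $random .= $alphabet[random_int(0, $max)];
    }

    // First half of the stamp in front, second half at the end.
    return substr($stamp, 0, $half) . $random . substr($stamp, $half);
}

echo timestampWrappedKey(), PHP_EOL;
```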

Stay JOLLY!
H



Answer 4:

If you use your original method, but add the username or email address in front of the password, it will always be unique as long as each user can only have 1 password.



Answer 5:

You may be interested in this article which deals with the same issue: GUIDs are globally unique, but substrings of GUIDs aren't.

The goal of this algorithm is to use the combination of time and location ("space-time coordinates" for the relativity geeks out there) as the uniqueness key. However, timekeeping is not perfect, so there's a possibility that, for example, two GUIDs are generated in rapid succession from the same machine, so close to each other in time that the timestamp would be the same. That's where the uniquifier comes in.



Answer 6:

I usually do it like this:

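$this->password = '';

// Ten passes append 12 characters in a fixed pattern of character classes.
for ($i = 0; $i < 10; $i++)
{
    if ($i % 2 == 0)
        $this->password .= chr(rand(65, 90));   // upper-case letter A-Z
    if ($i % 3 == 0)
        $this->password .= chr(rand(97, 122));  // lower-case letter a-z
    if ($i % 4 == 0)
        $this->password .= chr(rand(48, 57));   // digit 0-9
}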

I suppose there are some theoretical holes but I've never had an issue with duplication. I usually use it for temporary passwords (like after a password reset) and it works well enough for that.



Answer 7:

As Frank Kreuger commented, go with a GUID generator.

Like this one



Answer 8:

I'm still not seeing why the passwords have to be unique? What's the downside if 2 of your users have the same password?

This is assuming we're talking about passwords that are tied to userids, and not just unique identifiers. If that's what you're looking for, why not use GUIDs?



Answer 9:

You might be interested in Steve Gibson's over-the-top-secure implementation of a password generator (no source, but he has a detailed description of how it works) at https://www.grc.com/passwords.htm.

The site creates huge 64-character passwords but, since they're completely random, you could easily take the first 8 (or however many) characters for a less secure but "as random as possible" password.

EDIT: from your later answers I see you need something more like a GUID than a password, so this probably isn't what you want...



Answer 10:

I do believe that part of your issue is that you are trying to use a single function for two separate uses: passwords and transaction_id.

These really are two different problem areas, and it is best not to try to address them together.



Answer 11:

I recently wanted a quick and simple random unique key so I did the following:

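// microtime(true) returns a float, so the addition is numeric; note that crypt() requires an explicit salt as of PHP 8.0.
$ukey = dechex(time()) . crypt(time() . md5(microtime(true) + mt_rand(0, 100000)));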

So, basically, I get the unix time in seconds and add a random md5 string generated from time + random number. It's not the best, but for low frequency requests it is pretty good. It's fast and works.

I did a test where I generated thousands of keys and then looked for repeats, and at about 800 keys per second there were no repetitions, so not bad. I guess it totally depends on mt_rand().

I use it for a survey tracker where we get a submission rate of about 1000 surveys per minute... so for now (crosses fingers) there are no duplicates. Of course, the rate is not constant (we get the submissions at certain times of the day) so this is not fail proof nor the best solution... the tip is using an incremental value as part of the key (in my case, I used time(), but could be better).



Answer 12:

Ignoring the crypt() part, which does not have much to do with creating a unique value, I usually use this one:

function GetUniqueValue()
{
   static $counter = 0; // initialized only the first time the function is called
   return strtr(microtime(), array('.' => '', ' ' => '')) . $counter++;
}

When called in the same process, $counter is incremented, so the value is always unique within that process.

When called in different processes, you must be really unlucky to get 2 microtime() calls with the same value; microtime() calls usually return different values even when called in the same script.
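
If the cross-process case is a concern, one possible variation (an assumption on my part, not something tested under load) is to mix in the process ID, since two processes running on the same host at the same moment have different PIDs. The function name processUniqueValue() is hypothetical.

```php
<?php
// Hypothetical variant: timestamp + process ID + per-process counter.
function processUniqueValue(): string
{
    static $counter = 0; // per-process call counter, kept between calls

    // microtime() returns "0.uuuuuuuu ssssssssss": fraction first, then seconds.
    [$usec, $sec] = explode(' ', microtime());

    // Seconds and microseconds, then the PID, then the counter.
    return sprintf('%s%s-%d-%d', $sec, substr($usec, 2, 6), getmypid(), $counter++);
}

echo processUniqueValue(), PHP_EOL;
echo processUniqueValue(), PHP_EOL;
```

This still says nothing about processes on different machines; for that a per-machine source number would be needed as well.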



Answer 13:

I usually take a random substring (randomizing how many chars, between 8 and 32, or fewer for user convenience) of the MD5 of some value I have gotten in, or the time, or some combination. For more randomness I take the MD5 of some value (say last name), concatenate that with the time, MD5 it again, then take the random substring. Yes, you could get equal passwords, but it's not very likely at all.