Generate unique random alphanumeric characters tha

2019-01-17 13:11发布

问题:

It need not be meaningful words - more like random password generation, but the catch is - they should be unique. I will be using this for some kind of package / product code. Which is the best method available? :)

回答1:

It is generally not possible to generate sequences with both unique and random elements: obviously to be unique the algorithm has to take into account the previously generated elements in the sequence, so the next ones will not really be random.

Therefore your best bet would be to detect collisions and just retry (which could be very expensive in your particular case).

If you are constrained to just 7 chars, there's not much you can do above:

$allowed_chars = 'abcdefghijklmnopqrstuvwxz';
$allowed_count = strlen($allowed_chars);
$password = null;
$password_length = 7;

while($password === null || already_exists($password)) {
    $password = '';
    for($i = 0; $i < $password_length; ++$i) {
        $password .= $allowed_chars{mt_rand(0, $allowed_count - 1)};
    }
}

This should eventually give you a new password.

However, in similar cases I have encountered I usually pick a larger password size which also happens to be the size of the hex representation of a popular hash function (e.g. md5). Then you can make it easier on yourself and less error prone:

$password = time(); // even better if you have some other "random" input to use here

do {
    $password = md5(time().$password);
}
while (already_exists($password));

This also has the added advantage that the sequence space is larger, hence there will be less collisions. You can pick the size of the hash function according to the expected numbers of passwords you will generate in the future to "guarantee" a low collision probability and thus less calls to the possibly expensive already_exists function.



回答2:

Here's a way you could do it without hashes or loops:

$password = sprintf(
    "%04s%03s",
    base_convert(mt_rand(0, pow(36, 4) - 1), 10, 36),
    base_convert(mt_rand(0, pow(36, 3) - 1), 10, 36)
);

As a few others have mentioned, ensuring uniqueness is more complicated, and should be unneeded. The simplest way you could do it would be to add extra characters at the end, incrementing with each password generated.



回答3:

Here is something that looks random and should be unique and have 7 chars for the times to come:

echo base_convert(intval(microtime(true) * 10000), 10, 36);

Or for a little more randomness and less uniqueness (between 1000 and 10000 per second):

echo base_convert(mt_rand(1, 9) . intval(microtime(true) * 1000), 10, 36);

Or (uniqueness between 100 and 10000 per second) - this is probably the best option:

echo base_convert(mt_rand(10, 99) . intval(microtime(true) * 100), 10, 36);

Or (uniqueness between 10 and 10000 per second):

echo base_convert(mt_rand(100, 999) . intval(microtime(true) * 10), 10, 36);

You get the idea.



回答4:

A random alphanumeric (base 36 = 0..9 + a..z) value that has 7 chars has to have a base 10 representation between 2176782336 and 78364164095, the following snippet proves it:

var_dump(base_convert('1000000', 36, 10));                   //  2176782336
var_dump(base_convert('zzzzzzz', 36, 10));                   // 78364164095

In order for it to be unique we have to rely on a non-repeating factor, the obvious choice is time():

var_dump(time());                                            //  1273508728
var_dump(microtime(true));                                   //  1273508728.2883

If we only wanted to ensure a minimum uniqueness factor of 1 unique code per second we could do:

var_dump(base_convert(time() * 2, 10, 36));                  // 164ff8w
var_dump(base_convert(time() * 2 + 1, 10, 36));              // 164ff8x
var_dump(base_convert(time() * 2 + 2, 10, 36));              // 164ff8y
var_dump(base_convert(time() * 2 + 3, 10, 36));              // 164ff8z

You'll notice that these codes aren't random, you'll also notice that time() (1273508728) is less than 2176782336 (the minimum base 10 representation of a 7 char code), that's why I do time() * 2.

Now lets do some date math in order to add randomness and increase the uniqueness factor while complying with the integer limitations of older versions of PHP (< 5.0?):

var_dump(1 * 60 * 60);                                       //       3600
var_dump(1 * 60 * 60 * 24);                                  //      86400
var_dump(1 * 60 * 60 * 24 * 366);                            //   31622400
var_dump(1 * 60 * 60 * 24 * 366 * 10);                       //  316224000
var_dump(1 * 60 * 60 * 24 * 366 * 20);                       //  632448000
var_dump(1 * 60 * 60 * 24 * 366 * 30);                       //  948672000
var_dump(1 * 60 * 60 * 24 * 366 * 31);                       //  980294400
var_dump(PHP_INT_MAX);                                       // 2147483647

Regarding PHP_INT_MAX I'm not sure what exactly changed in recent versions of PHP because the following clearly works in PHP 5.3.1, maybe someone could shed some light into this:

var_dump(base_convert(PHP_INT_MAX, 10, 36));                 // zik0zj
var_dump(base_convert(PHP_INT_MAX + 1, 10, 36));             // zik0zk
var_dump(base_convert(PHP_INT_MAX + 2, 10, 36));             // zik0zl
var_dump(base_convert(PHP_INT_MAX * 2, 10, 36));             // 1z141z2
var_dump(base_convert(PHP_INT_MAX * 2 + 1, 10, 36));         // 1z141z3
var_dump(base_convert(PHP_INT_MAX * 2 + 2, 10, 36));         // 1z141z4

I got kinda lost with my rationalization here and I'm bored so I'll just finish really quick. We can use pretty much the whole base 36 charset and safely generate sequential codes with a minimum guaranteed uniqueness factor of 1 unique code per second for 3.16887646 years using this:

base_convert(mt_rand(22, 782) . substr(time(), 2), 10, 36);

I just realized that the above can sometimes return duplicated values due to the first argument of mt_rand(), in order to produce unique results we need to limit a our base 36 charset a little bit:

base_convert(mt_rand(122, 782) . substr(time(), 2), 10, 36);

Remember that the above values are still sequential, in order to make them look random we can use microtime() but we can only ensure a uniqueness factor of 10 codes per second for 3.8 months:

base_convert(mt_rand(122, 782) . substr(number_format(microtime(true), 1, '', ''), 3), 10, 36);

This proved to be more difficult than I originally antecipated since there are lot of constrains:

  • use the whole base 36 charset
  • generate random-looking codes
  • trade-offs between uniqueness factor per second and durability of uniqueness
  • PHP integer limitations

If we can ignore any of the above it would be a lot easier and I'm sure this can be further optimized but like I said: this is boring me. Maybe someone would like to pick this up where I left. =) I'm hungry! =S



回答5:

Given that you mention passwords here, I'll assume you need a secure method (ie: someone shouldn't be able to guess someone else's password based on knowing any other password). You could use the following:

  1. Decide on a master password, for example "MasterPassword"
  2. For each password generated, append to that either a random or sequential nonce, for example "MasterPassword1", "MasterPassword2".
  3. Perform a cryptographic hash on that (SHA, MD5, etc) and covert the hash to a hexadecimal representation, eg "ce7f181a44a4a5b7e43fe2b9a0b1f0c1".
  4. Truncate that to as many characters as you need - perhaps seven as you indicated: "ce7f181".
  5. Check if that has been assigned before. If not, return that as your password. Otherwise, repeat from 2.

If security is not an issue, steps 1 and 2 by themselves would be sufficient. If security is an issue, it is vital that no one but yourself knows the value of "MasterPassword".



回答6:

Here's how I would solve this problem:

Consider that the 7 characters can be one of 26 letters (abc..z), or 10 numbers (01...9). This makes 36 possible characters.

Each time your application generates a new code, have it increment a global variable. You can turn this unique number into a unique string by using a "Hexatridecimal" converter, and by adding filler characters to make up the rest of the string.

Take a look at this link. I think this guy had the same problem as you: http://www.codemaxima.com/2010/04/the-hexatridecimal-numbering-system/



回答7:

$random = substr(hash('md5',openssl_random_pseudo_bytes(32)),0,7);


回答8:

+1 to @Michael Haren's comment. If the passwords on your site should not have a constraint to be unique.

If I try to use a given password and I get an error that I can't use it because it's already in use, then I know some user on the system has that password. If there are 1000 users I only need to try a max of 1000 other accounts before I find the one who has that password.

Not really answering your question, but more than a comment. So I'm marking this CW.



回答9:

md5( microtime() );


回答10:

this is my favorite way to do it.

$pretrimmedrandom = md5(uniqid(mt_rand(),true));
$trimmed =  substr($pretrimmedrandom ,0,7);

uniqid uses the current time to generate a very unique random string. results look like "3f456yg".



回答11:

Galen's answer allows only one use of each character in the password. Not much information in that string. A simple change though:

$chars = 'abcdefghijklmnopqrstuvwxyz0123456789';
$passwordlength = 7;
for ($x = 1; $x <= $passwordlength; $x++) {
  $charlist .= $chars;
}
$temp_pw = substr( str_shuffle( $charlist ), 0, $passwordlength );


回答12:

substr(str_shuffle(md5(microtime())),rand(0,21),7);


回答13:

try this

echo $unique_key = substr(md5(rand(0, 1000000)), 0, 5);

it will give string with length 5.



回答14:

Use Kohana text,

http://docs.kohanaphp.com/helpers/text

For example,

   $prod_id = text::random('alpha', 7);

If you don't want use the framework, you can simply copy the code. You will find lots of goodies there.



回答15:

Heres a very simple way

$chars = 'abcdefghijklmnopqrstuvwxyz0123456789';
$temp_pw = substr( str_shuffle( $chars ), 0, 7 );
if ( check_unique( $temp_pw ) ) {
    $pw = $temp_pw;
}

You'll have to implement your own check_unique function. That part should be easy.