str_shuffle and randomness

Posted 2020-01-28 05:58

Question:

A while back I wrote a random string generator that builds a string using the mt_rand()th character in a string until the desired length is reached.

public function getPassword ()
{
    if ($this -> password == '')
    {
        $pw             = '';
        $charListEnd    = strlen (static::CHARLIST) - 1;
        for ($loops = mt_rand ($this -> min, $this -> max); $loops > 0; $loops--)
        {
            $pw .= substr (static::CHARLIST, mt_rand (0, $charListEnd), 1);
        }
        $this -> password   = $pw;
    }
    return $this -> password;
}

(CHARLIST is a class constant containing a pool of characters for the password. $min and $max are length constraints.)

Today, while researching something else entirely, I stumbled upon the following code:

function generateRandomString ($length = 10) {    
    return substr(str_shuffle ("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"), 0, $length);
}

This accomplishes pretty much the same effect as my looping mt_rand() based code in one line. I really like it for that simple reason, fewer lines of code is always a good thing. :)

But when I looked up str_shuffle in PHP's manual the documentation on it was pretty light. One thing I was really keen to learn was what algorithm does it use for randomness? The manual doesn't mention what kind of randomization is done to get the shuffled string. If it uses rand() instead of mt_rand() then sticking to my current solution may be better after all.

So basically I'd like to know how str_shuffle randomizes the string. Is it using rand() or mt_rand()? I'm using my random string function to generate passwords, so the quality of the randomness matters.

UPDATE: As has been pointed out, the str_shuffle method is not equivalent to the code I'm already using and will be less random due to the string's characters remaining the same as the input, only with their order changed. However I'm still curious as to how the str_shuffle function randomizes its input string.
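
The difference described in the update can be checked directly. The following is a minimal sketch that reuses the one-liner from above with type hints added; it shows that the result never repeats a character and can never be longer than the 62-character pool:

```php
<?php
// The one-liner from the question, with the pool pulled into a variable.
function generateRandomString(int $length = 10): string
{
    $pool = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    return substr(str_shuffle($pool), 0, $length);
}

$s = generateRandomString(10);

// count_chars(..., 3) returns each distinct byte once, so equal lengths
// mean no character occurs twice in $s.
var_dump(strlen(count_chars($s, 3)) === strlen($s)); // bool(true)

// Asking for more characters than the pool holds silently returns 62.
var_dump(strlen(generateRandomString(100)));         // int(62)
```

Because each character can be used at most once, a 10-character result is drawn from the ordered selections of 10 distinct characters out of 62, which is fewer possibilities than the 62^10 strings the looping version can produce.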

Answer 1:

A better solution would be mt_rand(), which uses the Mersenne Twister and is considerably better than rand().

From the question: "As has been pointed out, the str_shuffle method is not equivalent to the code I'm already using and will be less random due to the string's characters remaining the same as the input, only with their order changed. However I'm still curious as to how the str_shuffle function randomizes its input string."

To compare like with like, let's restrict each function to the values 0 and 1 and look at a visual representation of its output.

Simple Test Code

header("Content-type: image/png");
$im = imagecreatetruecolor(512, 512) or die("Cannot Initialize new GD image stream");
$white = imagecolorallocate($im, 255, 255, 255);
for($y = 0; $y < 512; $y ++) {
    for($x = 0; $x < 512; $x ++) {
        if (testMTRand()) { // swap in testRand() or testShuffle() to compare
            imagesetpixel($im, $x, $y, $white);
        }
    }
}
imagepng($im);
imagedestroy($im);

function testMTRand() {
    return mt_rand(0, 1);
}

function testRand() {
    return rand(0, 1);
}

function testShuffle() {
    return substr(str_shuffle("01"), 0, 1);
}

Output testRand()

Output testShuffle()

Output testMTRand()

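From the question: "So basically I'd like to know how str_shuffle randomizes the string. Is it using rand() or mt_rand()? I'm using my random string function to generate passwords, so the quality of the randomness matters."

You can clearly see that str_shuffle() produces almost the same output as rand(). Note that this applies to PHP versions before 7.1: since 7.1, rand() is an alias of mt_rand() and str_shuffle() uses the Mersenne Twister internally.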
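
One way to check which generator str_shuffle() draws from on a given installation is to seed mt_rand()'s generator and shuffle twice. This is a minimal sketch, assuming PHP 7.1 or later (where the manual's changelog says str_shuffle() switched to the Mersenne Twister); the pool string and the seed are arbitrary choices for illustration:

```php
<?php
// Assumption: PHP 7.1+, where str_shuffle() uses the same generator
// that mt_srand() seeds.
$pool = "0123456789abcdefghijklmnopqrstuvwxyz";

mt_srand(12345);              // seed the Mersenne Twister
$first = str_shuffle($pool);

mt_srand(12345);              // reseed with the same value
$second = str_shuffle($pool);

// If str_shuffle() draws from the seeded generator, both shuffles match.
var_dump($first === $second);
```

If the two shuffles are identical, the output of str_shuffle() is fully determined by the mt_rand() seed, which is another reason not to rely on it for passwords.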



Answer 2:

Please be aware that this method should not be used if your application is really focused on security. The Mersenne Twister is NOT cryptographically secure. A PRNG can yield values that appear statistically random and are still easy to break.
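
Where security does matter, PHP 7.0 and later provide random_int(), which draws from the operating system's cryptographically secure source. The following is a minimal sketch of the question's looping approach rewritten on top of it; the function name securePassword and its parameters are hypothetical, chosen only for this example:

```php
<?php
// Hypothetical helper: builds a password by picking each character
// independently with random_int() (available since PHP 7.0).
function securePassword(int $length, string $pool): string
{
    if ($length < 1 || $pool === '') {
        throw new InvalidArgumentException('Need a positive length and a non-empty pool.');
    }
    $max = strlen($pool) - 1;
    $pw  = '';
    for ($i = 0; $i < $length; $i++) {
        // Each position is chosen uniformly, so characters may repeat.
        $pw .= $pool[random_int(0, $max)];
    }
    return $pw;
}

$pool = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
echo securePassword(12, $pool), PHP_EOL;
```

Unlike the str_shuffle() variants, this allows repeated characters and is not limited by the size of the pool.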



Answer 3:

Still not cryptographically secure, but here is a way to use str_shuffle() while allowing character repetition, thereby improving complexity...

function generate_password($length = 8, $strength = 3) {
    // Clamp the length to the range 6..32
    if ($length < 6) $length = 6;
    if ($length > 32) $length = 32;
    // Excludes 0, O, o, 1, I, i, L, l on purpose for readability
    $lower = 'abcdefghjkmnpqrstuvwxyz';
    $chars = $lower;
    if ($strength >= 2) $chars .= '23456789';
    if ($strength >= 3) $chars .= strtoupper($lower);
    if ($strength >= 4) $chars .= '!@#$%&?';
    // Repeat the pool so a character can appear more than once in the result
    return substr(str_shuffle(str_repeat($chars, $length)), 0, $length);
}

$chars is repeated $length times before the string is shuffled, so any character can appear more than once in the result. That makes this a little better than shuffling a single occurrence of each character.

We only use this in systems that do not store sensitive information ;)