A while back I wrote a random string generator that builds a string using the mt_rand()th character in a string until the desired length is reached.
public function getPassword ()
{
    if ($this -> password == '')
    {
        $pw = '';
        $charListEnd = strlen (static::CHARLIST) - 1;
        for ($loops = mt_rand ($this -> min, $this -> max); $loops > 0; $loops--)
        {
            $pw .= substr (static::CHARLIST, mt_rand (0, $charListEnd), 1);
        }
        $this -> password = $pw;
    }
    return $this -> password;
}
(CHARLIST is a class constant containing a pool of characters for the password. $min and $max are length constraints.)
Today, when researching something else entirely I stumbled upon the following code:
function generateRandomString ($length = 10) {
    return substr(str_shuffle ("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"), 0, $length);
}
This accomplishes pretty much the same effect as my looping mt_rand()-based code in one line. I really like it for that simple reason: fewer lines of code is always a good thing. :)
But when I looked up str_shuffle in PHP's manual the documentation on it was pretty light. One thing I was really keen to learn was what algorithm does it use for randomness? The manual doesn't mention what kind of randomization is done to get the shuffled string. If it uses rand() instead of mt_rand() then sticking to my current solution may be better after all.
So basically I'd like to know how str_shuffle randomizes the string. Is it using rand() or mt_rand()? I'm using my random string function to generate passwords, so the quality of the randomness matters.
UPDATE: As has been pointed out, the str_shuffle method is not equivalent to the code I'm already using and will be less random due to the string's characters remaining the same as the input, only with their order changed. However I'm still curious as to how the str_shuffle function randomizes its input string.
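To make the difference described in the update concrete, here is a minimal sketch comparing the two approaches. The helper names (shuffleString, loopString, hasRepeat) and the POOL constant are made up for illustration; the pool itself is the one from the one-liner above. It assumes PHP 7+ for the type declarations.

```php
<?php
// Hypothetical names for illustration; the pool is the one from the one-liner.
const POOL = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Shuffle-based: a permutation of the pool, so each character appears at most once
// and the result can never be longer than the pool.
function shuffleString(int $length): string
{
    return substr(str_shuffle(POOL), 0, $length);
}

// Loop-based: every position is drawn independently, so repeats are possible.
function loopString(int $length): string
{
    $out = '';
    $last = strlen(POOL) - 1;
    for ($i = 0; $i < $length; $i++) {
        $out .= POOL[mt_rand(0, $last)];
    }
    return $out;
}

// Returns true if any character occurs more than once in $s.
function hasRepeat(string $s): bool
{
    return count(array_unique(str_split($s))) < strlen($s);
}

var_dump(hasRepeat(shuffleString(10))); // always bool(false)
var_dump(hasRepeat(loopString(10)));    // may be true or false
```

With a 62-character pool, the shuffle version can only produce strings without repeated characters, which is a smaller set than the 62^10 strings the loop version can reach.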
A better solution would be mt_rand, which uses the Mersenne Twister and is much better.
As has been pointed out, the str_shuffle method is not equivalent to the code I'm already using and will be less random due to the string's characters remaining the same as the input, only with their order changed. However I'm still curious as to how the str_shuffle function randomizes its input string.
To make the outputs comparable, let's just use 0 and 1 and look at a visual representation of each of the functions.
Simple Test Code
header("Content-type: image/png");
$im = imagecreatetruecolor(512, 512) or die("Cannot Initialize new GD image stream");
$white = imagecolorallocate($im, 255, 255, 255);
for ($y = 0; $y < 512; $y++) {
    for ($x = 0; $x < 512; $x++) {
        if (testMTRand()) { // change each function here
            imagesetpixel($im, $x, $y, $white);
        }
    }
}
imagepng($im);
imagedestroy($im);

function testMTRand() {
    return mt_rand(0, 1);
}

function testRand() {
    return rand(0, 1);
}

function testShuffle() {
    return substr(str_shuffle("01"), 0, 1);
}
Output testRand()
Output testShuffle()
Output testMTRand()
So basically I'd like to know how str_shuffle randomizes the string. Is it using rand() or mt_rand()? I'm using my random string function to generate passwords, so the quality of the randomness matters.
You can clearly see that str_shuffle produces almost the same output as rand ...
Please be aware that this method should not be used if your application is really focused on security. The Mersenne Twister is NOT cryptographically secure. A PRNG can yield values which statistically appear to be random but still are easy to break.
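If security does matter, one option is to keep the looping approach but draw the indexes from a cryptographically secure source. The following is only a sketch under assumptions: the function name secureRandomString and the example pool are made up for illustration, it needs PHP 7.0 or later for random_int(), and it assumes a pool of single-byte characters.

```php
<?php
// Hypothetical helper, not part of the code discussed above.
// Requires PHP 7.0+ for random_int(), which draws from a CSPRNG.
function secureRandomString(int $length, string $pool): string
{
    $last = strlen($pool) - 1;
    if ($length < 1 || $last < 0) {
        throw new InvalidArgumentException('Length must be positive and the pool non-empty.');
    }

    $out = '';
    for ($i = 0; $i < $length; $i++) {
        // Each position is chosen independently, so repeats are allowed.
        $out .= $pool[random_int(0, $last)];
    }
    return $out;
}

// Example pool chosen only for this demonstration.
echo secureRandomString(12, 'abcdefghjkmnpqrstuvwxyz23456789'), PHP_EOL;
```

The structure is the same as the mt_rand() loop from the question; only the source of the random index changes.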
Still not cryptographically secure, but here is a way to use str_shuffle() while allowing character repetition, thereby improving complexity:
function generate_password($length = 8, $strength = 3) {
    // Clamp the length to the range 6..32
    if ($length < 6) $length = 6;
    if ($length > 32) $length = 32;
    // Excludes 0, O, o, 1, I, i, L, l on purpose for readability
    $lower = 'abcdefghjkmnpqrstuvwxyz';
    $chars = $lower;
    if ($strength >= 2) $chars .= '23456789';
    if ($strength >= 3) $chars .= strtoupper($lower);
    if ($strength >= 4) $chars .= '!@#$%&?';
    return substr(str_shuffle(str_repeat($chars, $length)), 0, $length);
}
$chars is repeated $length times before the string is shuffled, which makes this a little better than shuffling only a single occurrence of each character.
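A quick way to see the effect of the repetition is to run the same expression with a deliberately tiny pool. This is just a sketch: the three-character pool and the length of 8 are made-up values chosen so that repeated characters are guaranteed.

```php
<?php
// Deliberately tiny, hypothetical pool so that repeats are guaranteed.
$chars = 'abc';
$length = 8;

// Same expression as the return line above: repeat the pool, shuffle, cut.
$password = substr(str_shuffle(str_repeat($chars, $length)), 0, $length);

// Count how often each character made it into the result.
$counts = array_count_values(str_split($password));

echo $password, PHP_EOL;
print_r($counts);
```

Because every character is present $length times before the shuffle, any string of that length over the pool is reachable. The characters are still drawn without replacement from the repeated pool, though, so the positions are not fully independent the way they are in a loop that picks each character separately.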
We only use this in systems that do not store sensitive information ;)