Get unique random lines from a file and write them to

Posted 2019-07-16 02:46

Question:

I have a file consisting of 10,000 different lines. I need to take 100 random unique lines from this file and write them to another file. What is the easiest way to do this in PHP?

Answer 1:

A naive way:

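// Read every line of the file into an array (one element per line).
$lines = file('somefile.txt');
// Randomize the order, then keep the first 100 entries.
shuffle($lines);
$random_lines = array_slice($lines, 0, 100);

Note: This reads the whole file into memory, so it completely disregards system resource considerations.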
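If memory is a concern, one alternative is reservoir sampling, which reads the file once and never holds more than the requested number of lines. The sketch below is not from either answer: the function name `sample_lines` and both file paths are hypothetical. It also writes the result to a second file, as the question asks. The picks are distinct line positions, so they are unique as long as the input lines are all different.

```php
<?php
// Hypothetical helper (not part of the original answers): reservoir sampling.
// Reads the input once and keeps at most $k lines in memory.
function sample_lines(string $inPath, string $outPath, int $k = 100): int
{
    $in = fopen($inPath, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $inPath");
    }

    $reservoir = [];
    $seen = 0; // number of lines read before the current one

    while (($line = fgets($in)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($seen < $k) {
            // The first $k lines fill the reservoir.
            $reservoir[] = $line;
        } else {
            // Keep the new line with probability $k / ($seen + 1),
            // overwriting a uniformly chosen slot.
            $j = mt_rand(0, $seen);
            if ($j < $k) {
                $reservoir[$j] = $line;
            }
        }
        $seen++;
    }
    fclose($in);

    // Write one sampled line per row to the output file.
    $body = $reservoir ? implode(PHP_EOL, $reservoir) . PHP_EOL : '';
    file_put_contents($outPath, $body);

    return count($reservoir);
}
```

A call such as `sample_lines('somefile.txt', 'sample.txt', 100);` would write the sample to `sample.txt` (a placeholder name) and return the number of lines written, which is less than 100 only when the input is shorter than that.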



Answer 2:

A faster solution for larger files:

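function m1($file) {
    $fp = fopen($file, "r");
    $size = filesize($file);
    $list = array();
    // Caveats: the first line of the file can never be picked, lines that
    // follow long lines are picked more often, and the loop never ends if
    // the file has fewer than 101 non-empty lines.
    while (count($list) < 100) {
        // Jump to a random byte offset and discard the (probably partial) line there.
        fseek($fp, mt_rand(0, $size));
        fgets($fp);
        // The start offset of the next line doubles as a uniqueness key.
        $pos = ftell($fp);
        if (isset($list[$pos])) {
            continue; // this line was already picked
        }
        $line = fgets($fp);
        if ($line === false) {
            continue; // landed in the last line, nothing left to read
        }
        $line = trim($line);
        if ($line !== '') {
            $list[$pos] = $line;
        }
    }
    fclose($fp);
    return $list;
}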



function m2($file) {
    $lines = file($file);
    shuffle($lines);
    $list = array_slice($lines, 0, 100);
    return $list;
}

Simple Benchmark with Accepted Solution

10,000 Lines

Array
(
    [m1] => 0.013591051101685 <------ M1 Faster
    [m2] => 0.033689975738525
)

100,000 Lines

Array
(
    [m1] => 0.014040946960449 <------ M1 Faster
    [m2] => 0.094476938247681
)
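Timings like these could be gathered with a small harness along the following lines. This is a sketch rather than the answer's own benchmark code: the `bench` helper, the temporary test file, and the `$shuffleSlice` stand-in (which mirrors `m2`) are all assumptions made for illustration.

```php
<?php
// Hypothetical timing helper; not the benchmark code used in the answer.
// Returns the best wall-clock time in seconds over $runs calls.
function bench(callable $fn, string $file, int $runs = 5): float
{
    $best = INF;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        $fn($file);
        $best = min($best, microtime(true) - $start);
    }
    return $best;
}

// Build a throwaway test file with 10,000 distinct lines.
$file = tempnam(sys_get_temp_dir(), 'bench');
file_put_contents($file, implode("\n", range(1, 10000)) . "\n");

// Stand-in that mirrors m2; swap in any function that takes a file path.
$shuffleSlice = function (string $f): array {
    $lines = file($f);
    shuffle($lines);
    return array_slice($lines, 0, 100);
};

printf("shuffle+slice: %.6f s\n", bench($shuffleSlice, $file));
unlink($file);
```

Taking the best of several runs reduces noise from disk caching; absolute numbers will differ from the ones quoted above depending on hardware and PHP version.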



Tags: php random lines