Efficiently counting the number of lines of a text

Posted 2019-01-03 12:51

I have just found out that my script gives me a fatal error:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 440 bytes) in C:\process_txt.php on line 109

That line is this:

$lines = count(file($path)) - 1;

So I think it is having difficulty loading the file into memory and counting the number of lines. Is there a more efficient way to do this without running into memory issues?

The text files that I need to count the number of lines for range from 2MB to 500MB. Maybe a Gig sometimes.

Thanks all for any help.

16 answers
聊天终结者
#2 · 2019-01-03 13:15

If you're running this on a Linux/Unix host, the easiest solution would be to use exec() or similar to run the command wc -l $path. Just make sure you've sanitized $path first to be sure that it isn't something like "/path/to/file ; rm -rf /".
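
A minimal sketch of that approach, assuming a Unix-like host where wc is on the PATH. The function name countLinesWithWc is made up for illustration. It relies on escapeshellarg() to quote the path, and feeds the file to wc through stdin so that only the number is printed. Note that wc -l counts newline characters, so a final line without a trailing newline is not counted.

```php
<?php
// Hypothetical helper: counts newline characters in a file using `wc -l`.
function countLinesWithWc(string $path): int
{
    // Refuse anything that is not an existing, readable regular file.
    if (!is_file($path) || !is_readable($path)) {
        throw new InvalidArgumentException("Not a readable file: $path");
    }

    // escapeshellarg() quotes the path so shell metacharacters
    // such as ";" are passed as data instead of being interpreted.
    $command = 'wc -l < ' . escapeshellarg($path);

    $output = [];
    $status = 0;
    exec($command, $output, $status);

    if ($status !== 0 || !isset($output[0])) {
        throw new RuntimeException("wc failed for: $path");
    }

    // Some wc implementations pad the number with spaces.
    return (int) trim($output[0]);
}
```

Called as countLinesWithWc($path), this keeps the whole file out of PHP's memory, since the counting happens in the wc process.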

smile是对你的礼貌
#3 · 2019-01-03 13:17
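private static function lineCount($file) {
    $linecount = 0;
    $handle = fopen($file, "r");
    if ($handle === false) {
        return false; // the file could not be opened
    }
    while (!feof($handle)) {
        // fgets() returns false at end of file, so the final
        // empty read is not counted as a line
        if (fgets($handle) !== false) {
            $linecount++;
        }
    }
    fclose($handle);
    return $linecount;
}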

I wanted to add a little fix to the function above.

In one specific example, where I had a file containing the word 'testing', the function returned 2 as a result. So I needed to add a check for whether fgets returned false or not :)

Have fun :)

劫难
#4 · 2019-01-03 13:23

You have several options. The first is to increase the memory limit, which is probably not the best approach given that you say the files can get very large. The other is to use fgets to read the file line by line and increment a counter, which should not cause memory issues because only the current line is in memory at any one time.
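
A variant of the same streaming idea, offered as a sketch and not as the method described above: instead of fgets, it reads fixed-size chunks with fread and counts newline characters, so memory stays bounded even if a single line is very long. The function name countLinesChunked and the 1 MB default chunk size are assumptions made for illustration. Like the fgets loop, it counts a final line that has no trailing newline.

```php
<?php
// Hypothetical helper: counts lines by scanning the file in fixed-size chunks.
function countLinesChunked(string $path, int $chunkSize = 1048576): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file: $path");
    }

    $lines = 0;
    // Start as if the previous byte were a newline, so an empty file yields 0.
    $lastByte = "\n";

    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        // Only one chunk is held in memory at a time.
        $lines += substr_count($chunk, "\n");
        $lastByte = $chunk[strlen($chunk) - 1];
    }
    fclose($handle);

    // A final line without a trailing newline still counts as a line.
    if ($lastByte !== "\n") {
        $lines++;
    }
    return $lines;
}
```

The chunk size trades memory for the number of read calls; any value from a few kilobytes upward should behave the same way.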

ら.Afraid
#5 · 2019-01-03 13:26

The number of lines can be counted with the following code:

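<?php
$fp = fopen("myfile.txt", "r");
$count = 0;
// fgets() reads one line at a time. It replaces fgetss(), which also
// stripped HTML tags and was removed in PHP 8.0. Comparing against false
// keeps a last line such as "0" from ending the loop early.
while (fgets($fp) !== false) {
    $count++;
}
echo "Total number of lines: " . $count;
fclose($fp);
?>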
霸刀☆藐视天下
#6 · 2019-01-03 13:29

If you're on Linux, you can simply do:

$number_of_lines = intval(trim(shell_exec("wc -l " . escapeshellarg($file_name) . " | awk '{print $1}'")));

You just have to find the right command if you're using another OS.

Regards

欢心
#7 · 2019-01-03 13:29

There is another answer that I thought might be a good addition to this list.

If you have perl installed and are able to run things from the shell in PHP:

$lines = exec('perl -pe \'s/\r\n|\n|\r/\n/g\' ' . escapeshellarg('largetextfile.txt') . ' | wc -l');

This should handle most line breaks whether from Unix or Windows created files.

TWO downsides (at least):

1) It is not a great idea to have your script so dependent upon the system it is running on (it may not be safe to assume Perl and wc are available)

2) Just a small mistake in escaping and you have handed over access to a shell on your machine.
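
One way to sidestep both downsides is to do the same line-break normalization in plain PHP, without a shell. The following is only a sketch under that assumption, not the approach from the article: the function name countLineBreaks and the chunk size are invented for illustration. It counts "\r\n", "\n" and "\r" each as one line break, which corresponds to what wc -l reports after the Perl substitution, and it handles a "\r\n" pair that happens to be split across two chunks.

```php
<?php
// Hypothetical helper: counts line breaks of any style (\r\n, \n or \r).
function countLineBreaks(string $path, int $chunkSize = 65536): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file: $path");
    }

    $count = 0;
    $prevEndedWithCr = false;

    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        // A "\r\n" pair split across two chunks was already counted
        // through its "\r", so drop the leading "\n" here.
        if ($prevEndedWithCr && $chunk[0] === "\n") {
            $chunk = (string) substr($chunk, 1);
        }
        $prevEndedWithCr = ($chunk !== '' && $chunk[strlen($chunk) - 1] === "\r");

        // Every "\r" and every "\n" is a break, except that each
        // "\r\n" pair would otherwise be counted twice.
        $count += substr_count($chunk, "\r")
                + substr_count($chunk, "\n")
                - substr_count($chunk, "\r\n");
    }
    fclose($handle);

    return $count;
}
```

Because nothing is passed to a shell, there is no escaping to get wrong, and the only dependency is PHP itself.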

As with most things I know (or think I know) about coding, I got this info from somewhere else:

John Reeve Article
