PHP: get remote file size with strlen? (html)

Posted 2019-02-19 11:09

Question:

I was looking at PHP docs for fsockopen and whatnot and they say you can't use filesize() on a remote file without doing some crazy things with ftell or something (not sure what they said exactly), but I had a good thought about how to do it:

$file = file_get_contents("http://www.google.com");
$filesize = mb_strlen($file) / 1000; //KBs, mb_* in case file contains unicode

Would this be a good method? It seemed so simple and good to use at the time, just want to get any thoughts if this could run into problems or not be the true file size.

I only wish to use this on text (websites) by the way not binary.

Answer 1:

This answer requires PHP 5 and cURL. It first checks the headers. If Content-Length isn't specified, it uses cURL to download the file and check its size (the file is not saved anywhere, just held temporarily in memory).

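<?php
echo get_remote_size("http://www.google.com/");

// Returns the size of the remote resource in bytes, or false on failure.
function get_remote_size($url) {
    // Try the Content-Length header first. Header names are case-insensitive,
    // so normalize the keys instead of testing individual spellings.
    $headers = get_headers($url, 1);
    if ($headers !== false) {
        $headers = array_change_key_case($headers, CASE_LOWER);
        if (isset($headers['content-length'])) {
            // When redirects are followed, the value is an array with one entry
            // per response that sent the header; use the most recent one.
            $length = $headers['content-length'];
            return (int) (is_array($length) ? end($length) : $length);
        }
    }

    // No Content-Length: download the body into memory and ask cURL for its size.
    $c = curl_init();
    curl_setopt_array($c, array(
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT => 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3',
    ));
    if (curl_exec($c) === false) {
        return false;
    }
    return (int) curl_getinfo($c, CURLINFO_SIZE_DOWNLOAD);
}
?>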


Answer 2:

You should look at the get_headers() function. It returns an array of the HTTP headers from the server's response. The Content-Length header, if it's present, may be a better judge of the size of the actual content.

That being said, you really should use either cURL or streams to do a HEAD request instead of a GET. Content-Length is normally present in the response, which saves you the transfer, though a server may omit it (for example when it uses chunked transfer encoding), so check for that case. It will be both faster and more accurate.
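A minimal sketch of the streams variant, assuming PHP 7.1 or later (where get_headers() accepts a stream context). The function names head_content_length() and content_length_from_headers() are made up for illustration. The parser resets at each status line so that a Content-Length from a redirect response is not mistaken for the final one.

```php
<?php
// Hypothetical helper: extract Content-Length from raw header lines, as
// returned by get_headers() in its default (non-associative) format.
// Returns an int, or null if the final response did not send the header.
function content_length_from_headers(array $lines) {
    $length = null;
    foreach ($lines as $line) {
        // A new status line means a new response (e.g. after a redirect),
        // so forget anything collected from the previous one.
        if (strncasecmp($line, 'HTTP/', 5) === 0) {
            $length = null;
            continue;
        }
        if (preg_match('/^Content-Length:\s*(\d+)\s*$/i', $line, $m)) {
            $length = (int) $m[1];
        }
    }
    return $length;
}

// Hypothetical helper: issue a HEAD request so no body is transferred.
function head_content_length($url) {
    $context = stream_context_create(array('http' => array('method' => 'HEAD')));
    $lines = @get_headers($url, false, $context);
    return $lines === false ? null : content_length_from_headers($lines);
}
```

Calling head_content_length("http://www.google.com/") would return the advertised size in bytes, or null when the request fails or the server does not report a length; in the null case you would have to fall back to downloading the body.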



Answer 3:

This will fetch the whole file and then calculate the file size (or rather, the string length) from the retrieved data. filesize() can usually read the size directly from the filesystem without reading the whole file first.

So this will be rather slow, because it has to fetch the whole file every time before it can determine the size (string length).
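If the body has been downloaded anyway, note that a file size is a byte count, which is what strlen() returns; mb_strlen() counts characters and will under-report for multi-byte text. A small sketch, where body_size_bytes() and format_kib() are hypothetical helper names and the sample string stands in for a fetched page:

```php
<?php
// Hypothetical helper: size in bytes of an already-downloaded body.
// PHP strings are byte strings, so strlen() is the byte count.
function body_size_bytes($body) {
    return strlen($body);
}

// Hypothetical helper: format a byte count in KiB (1024 bytes).
function format_kib($bytes) {
    return round($bytes / 1024, 2) . ' KiB';
}

// "héllo" encoded as UTF-8: five characters, but "é" takes two bytes.
$sample = "h\xC3\xA9llo";
$bytes = body_size_bytes($sample); // 6, whereas mb_strlen($sample, 'UTF-8') would give 5
```

For a real page, $sample would be the result of file_get_contents($url), and format_kib(body_size_bytes($body)) would give a printable size.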