Why PHP's gzuncompress() function can go wrong

Posted 2019-02-25 10:40

Question:

PHP has its own function to work with gzip archives. I wrote the following code:

error_reporting(E_ALL);
$f = file_get_contents('http://spiderbites.nytimes.com/sitemaps/www.nytimes.com/sitemap.xml.gz');
echo $f;
$f = gzuncompress($f);
echo "<hr>";
echo $f;

The first echo outputs the compressed file with a proper header (at least the first two bytes are correct). If I download this file with my browser, I can unzip it easily.

However, gzuncompress() throws a warning: Warning: gzuncompress(): data error in /home/path/to/script.php on line 5

Can anyone point me to the right direction to solve this problem?

EDIT:

The part of phpinfo() output

Answer 1:

Or you could just use the right decompression function, gzdecode().
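
To illustrate why the function has to match the format, here is a minimal sketch. It assumes the downloaded bytes are an ordinary gzip stream, which the .gz extension and the correct first two bytes suggest. The sample string and variable names are made up for illustration and use no network access.

```php
<?php
// Sketch: each encoder in PHP's zlib extension has a matching decoder.
$text = 'hello sitemap';          // hypothetical sample payload

$gzip = gzencode($text);          // gzip format (RFC 1952), what a .gz file contains
$zlib = gzcompress($text);        // zlib format (RFC 1950), what gzuncompress() expects

// A gzip stream starts with the magic bytes 1f 8b.
$isGzip = strncmp($gzip, "\x1f\x8b", 2) === 0;

$fromGzip = gzdecode($gzip);      // matching decoder for gzip data
$fromZlib = gzuncompress($zlib);  // matching decoder for zlib data

// Feeding gzip data to gzuncompress() fails with "data error";
// the @ operator only hides the warning, the call still returns false.
$mismatch = @gzuncompress($gzip);
```

Applied to the question's code, this would mean replacing the gzuncompress($f) call with gzdecode($f).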



Answer 2:

Note that gzuncompress() may not decompress some compressed strings and return a Data Error.

The problem could be that the externally compressed string has a CRC32 checksum at the end instead of the Adler-32 checksum that gzuncompress() expects.

(http://php.net/manual/en/function.gzuncompress.php#79042)

That could be one reason why it does not work.
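
The checksum difference can be seen directly by looking at the last bytes of each format. This is only a sketch with a made-up sample string; it compares the trailer written by gzcompress() and gzencode() against checksums computed separately.

```php
<?php
// Sketch: compare the trailers of the zlib and gzip containers.
$text = 'hello sitemap';                 // hypothetical sample payload

$zlib = gzcompress($text);               // zlib container (RFC 1950)
$gzip = gzencode($text);                 // gzip container (RFC 1952)

// zlib ends with the Adler-32 of the uncompressed data, big-endian.
$zlibTrailer = substr($zlib, -4);
$adler32     = hash('adler32', $text, true);

// gzip ends with the CRC32 of the uncompressed data followed by its
// length modulo 2^32, both little-endian.
$gzipTrailer = substr($gzip, -8);
$crcAndSize  = pack('V', crc32($text)) . pack('V', strlen($text));

$zlibMatches = ($zlibTrailer === $adler32);
$gzipMatches = ($gzipTrailer === $crcAndSize);
```

The two containers also differ in their headers, so a decoder written for one of them rejects the other before it ever reaches the checksum.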

Try the code from that comment:

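function gzuncompress_crc32($data) {
     // Write the data to a temp file behind the first 8 bytes of a gzip header
     // (magic 1f 8b, method 08 = deflate, no flags, zero mtime).
     $f = tempnam('/tmp', 'gz_fix');
     file_put_contents($f, "\x1f\x8b\x08\x00\x00\x00\x00\x00" . $data);
     // Let the compress.zlib:// stream wrapper decode the file.
     $out = file_get_contents('compress.zlib://' . $f);
     unlink($f); // remove the temporary file
     return $out;
}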

Modify your code like this:

error_reporting(E_ALL);
$f = file_get_contents('http://spiderbites.nytimes.com/sitemaps/www.nytimes.com/sitemap.xml.gz');
echo $f;
$f = gzuncompress_crc32($f);
echo "<hr>";
echo $f;

In my local testing, it no longer gives the error.
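
If the format of the incoming data is not known in advance, one option is to inspect the first bytes and choose the decoder accordingly. The helper below is a hypothetical sketch, not part of either answer. The name decode_by_header is invented, and the zlib header check is a heuristic that raw DEFLATE data could occasionally satisfy by chance.

```php
<?php
// Hypothetical helper: pick a decoder from the leading bytes of $data.
function decode_by_header(string $data)
{
    // gzip streams start with the magic bytes 1f 8b.
    if (strncmp($data, "\x1f\x8b", 2) === 0) {
        return gzdecode($data);            // gzip (RFC 1952)
    }
    if (strlen($data) >= 2) {
        $cmf = ord($data[0]);
        $flg = ord($data[1]);
        // zlib header: method 8 (deflate) and a 16-bit value divisible by 31.
        if (($cmf & 0x0f) === 8 && (($cmf << 8) | $flg) % 31 === 0) {
            return gzuncompress($data);    // zlib (RFC 1950)
        }
    }
    // Last resort: treat the input as raw DEFLATE (RFC 1951).
    return @gzinflate($data);
}
```

A call such as decode_by_header($f) would then route the downloaded .gz file to gzdecode(), and the function returns false when none of the decoders accepts the input.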



Tags: php gzip unzip