PHP has its own function to work with gzip archives. I wrote the following code:
error_reporting(E_ALL);
$f = file_get_contents('http://spiderbites.nytimes.com/sitemaps/www.nytimes.com/sitemap.xml.gz');
echo $f;
$f = gzuncompress($f);
echo "<hr>";
echo $f;
The first echo outputs the compressed file with what looks like a proper header (at least the first two bytes are correct). If I download this file with my browser, I can unzip it without any problems.
However, gzuncompress() throws Warning: gzuncompress(): data error in /home/path/to/script.php on line 5
Can anyone point me in the right direction to solve this problem?
EDIT:
The part of phpinfo() output
Or you could just use the right decompression function, gzdecode().
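A minimal sketch of the difference, assuming the zlib extension is loaded. To stay self-contained it builds a small gzip payload locally with gzencode() instead of downloading the sitemap; the sample XML string is made up for illustration.

```php
<?php
// Hypothetical sample payload standing in for the downloaded .gz file.
$original = '<urlset><url><loc>http://example.com/</loc></url></urlset>';
$gz = gzencode($original); // gzip (RFC 1952) framing, like a .gz file

// Every gzip stream starts with the magic bytes 1f 8b.
$isGzip = strncmp($gz, "\x1f\x8b", 2) === 0;

// gzdecode() understands the gzip header and the CRC32 trailer.
$decoded = gzdecode($gz);

// gzuncompress() expects zlib (RFC 1950) framing and rejects gzip data.
// The @ only hides the "data error" warning so the return value can be checked.
$failed = @gzuncompress($gz);

var_dump($isGzip, $decoded === $original, $failed);
```

In the original script this would mean replacing the gzuncompress($f) call with gzdecode($f) and checking the result for false before echoing it.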
Note that gzuncompress() may not decompress some compressed strings and return a Data Error.
The problem could be that the outside compressed string has a CRC32 checksum at the end of the file instead of Adler-32, like PHP expects.
(http://php.net/manual/en/function.gzuncompress.php#79042)
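That could be why it does not work: a .gz file is in gzip format, which ends with a CRC32 checksum, while gzuncompress() expects zlib-format data, which ends with an Adler-32 checksum.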
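To make the checksum difference concrete, here is a small sketch that compresses the same string with gzcompress() (zlib format) and gzencode() (gzip format) and inspects the header and trailer bytes. The sample string and variable names are only illustrative, and the integer comparisons assume a 64-bit PHP build.

```php
<?php
$data = 'hello hello hello';

$zlib = gzcompress($data); // zlib format: 2-byte header, Adler-32 trailer
$gzip = gzencode($data);   // gzip format: 10-byte header, CRC32 + length trailer

// Leading bytes identify the container format.
$zlibHeader = bin2hex(substr($zlib, 0, 2)); // first byte is 78 for deflate
$gzipMagic  = bin2hex(substr($gzip, 0, 2)); // gzip magic number

// zlib stores Adler-32 big-endian in its last 4 bytes.
$adler = unpack('N', substr($zlib, -4))[1];
$adlerMatches = $adler === hexdec(hash('adler32', $data));

// gzip stores CRC32 little-endian in the 4 bytes before the length field.
$crc = unpack('V', substr($gzip, -8, 4))[1];
$crcMatches = $crc === crc32($data);

echo "zlib header: $zlibHeader\n";
echo "gzip magic:  $gzipMagic\n";
var_dump($adlerMatches, $crcMatches);
```

Because the two formats differ in both header and trailer, each decoder only accepts its own: gzuncompress() pairs with gzcompress(), and gzdecode() pairs with gzencode().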
Try the function from that comment:
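function gzuncompress_crc32($data) {
    // Write the data to a temp file behind a minimal gzip header
    // so it can be read back through the compress.zlib:// wrapper.
    $f = tempnam('/tmp', 'gz_fix');
    file_put_contents($f, "\x1f\x8b\x08\x00\x00\x00\x00\x00" . $data);
    $result = file_get_contents('compress.zlib://' . $f);
    unlink($f); // remove the temp file so it does not pile up in /tmp
    return $result;
}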
Change your code to this:
error_reporting(E_ALL);
$f = file_get_contents('http://spiderbites.nytimes.com/sitemaps/www.nytimes.com/sitemap.xml.gz');
echo $f;
$f = gzuncompress_crc32($f);
echo "<hr>";
echo $f;
As far as I have tested locally, it does not give the error anymore.