可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I have a gzip file and I am trying to read it via Python as below:

import zlib

do = zlib.decompressobj(16+zlib.MAX_WBITS)
fh = open('abc.gz', 'rb')
cdata = fh.read()
fh.close()
data = do.decompress(cdata)

it throws this error:

zlib.error: Error -3 while decompressing: incorrect header check

How can I overcome it?

回答1:

Update: dnozay's answer explains the problem and should be the accepted answer.

Try the gzip module, code below is straight from the python docs.

import gzip
f = gzip.open('/home/joe/file.txt.gz', 'rb')
file_content = f.read()
f.close()

回答2:

You have this error:

zlib.error: Error -3 while decompressing: incorrect header check

Which is most likely because you are trying to check headers that are not there, e.g. your data follows RFC 1951 (deflate compressed format) rather than RFC 1950 (zlib compressed format) or RFC 1952 (gzip compressed format).

choosing windowBits

But zlib can decompress all those formats:

to (de-)compress deflate format, use wbits = -zlib.MAX_WBITS
to (de-)compress zlib format, use wbits = zlib.MAX_WBITS
to (de-)compress gzip format, use wbits = zlib.MAX_WBITS | 16

See documentation in http://www.zlib.net/manual.html#Advanced (section inflateInit2)

examples

test data:

>>> deflate_compress = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
>>> zlib_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS)
>>> gzip_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
>>> 
>>> text = '''test'''
>>> deflate_data = deflate_compress.compress(text) + deflate_compress.flush()
>>> zlib_data = zlib_compress.compress(text) + zlib_compress.flush()
>>> gzip_data = gzip_compress.compress(text) + gzip_compress.flush()
>>>

obvious test for zlib:

>>> zlib.decompress(zlib_data)
'test'

test for deflate:

>>> zlib.decompress(deflate_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(deflate_data, -zlib.MAX_WBITS)
'test'

test for gzip:

>>> zlib.decompress(gzip_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|16)
'test'

the data is also compatible with gzip module:

>>> import gzip
>>> import StringIO
>>> fio = StringIO.StringIO(gzip_data)
>>> f = gzip.GzipFile(fileobj=fio)
>>> f.read()
'test'
>>> f.close()

automatic header detection (zlib or gzip)

adding 32 to windowBits will trigger header detection

>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|32)
'test'
>>> zlib.decompress(zlib_data, zlib.MAX_WBITS|32)
'test'

using `gzip` instead

or you can ignore zlib and use gzip module directly; but please remember that under the hood, gzip uses zlib.

fh = gzip.open('abc.gz', 'rb')
cdata = fh.read()
fh.close()

回答3:

I just solved the "incorrect header check" problem when uncompressing gzipped data.

You need to set -WindowBits => WANT_GZIP in your call to inflateInit2 (use the 2 version)

Yes, this can be very frustrating. A typically shallow reading of the documentation presents Zlib as an API to Gzip compression, but by default (not using the gz* methods) it does not create or uncompress the Gzip format. You have to send this non-very-prominently documented flag.

回答4:

Funnily enough, I had that error when trying to work with the Stack Overflow API using Python.

I managed to get it working with the GzipFile object from the gzip directory, roughly like this:

import gzip

gzip_file = gzip.GzipFile(fileobj=open('abc.gz', 'rb'))

file_contents = gzip_file.read()

回答5:

My case was do decompress email messages that are stored in Bullhorn database. The snippet is the following:

import pyodbc
import zlib

cn = pyodbc.connect('connection string')
cursor = cn.cursor()
cursor.execute('SELECT TOP(1) userMessageID, commentsCompressed FROM BULLHORN1.BH_UserMessage WHERE DATALENGTH(commentsCompressed) > 0 ')



 for msg in cursor.fetchall():
    #magic in the second parameter, use negative value for deflate format
    decompressedMessageBody = zlib.decompress(bytes(msg.commentsCompressed), -zlib.MAX_WBITS)

回答6:

Just add headers 'Accept-Encoding': 'identity'

import requests

requests.get('http://gett.bike/', headers={'Accept-Encoding': 'identity'})

https://github.com/requests/requests/issues/3849

zlib.error: Error -3 while decompressing: incorrec

问题:

回答1:

回答2:

choosing windowBits

examples

automatic header detection (zlib or gzip)

using gzip instead

回答3:

回答4:

回答5:

回答6:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

using `gzip` instead