I use this code (only part of it is shown) to download a *.gz archive:
    with requests.session() as s:
        s.post(login_to_site_URL, payload)
        load = s.get(scene, stream=True)
        with open(path_to_file, "wb") as save_command:
            for chunk in load.iter_content(chunk_size=1024, decode_unicode=False):
                if chunk:
                    save_command.write(chunk)
                    save_command.flush()
After downloading, the file is about twice as large as when I download it by clicking "Save as" in the browser, and the file is corrupted.
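To see what actually ended up on disk, a quick check like the sketch below could help. It only looks at the first two bytes for the gzip magic number (0x1f 0x8b). `looks_like_gzip` is a helper name made up for this illustration, and the demo uses throwaway files rather than the real archive.

```python
import gzip
import os
import tempfile


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


# Small self-contained demo with throwaway files (not the real archive).
payload = b"example data " * 1000
tmp_dir = tempfile.mkdtemp()

# A file that really is gzip-compressed.
compressed_path = os.path.join(tmp_dir, "sample.gz")
with open(compressed_path, "wb") as f:
    f.write(gzip.compress(payload))

# The same data written without compression.
plain_path = os.path.join(tmp_dir, "sample.bin")
with open(plain_path, "wb") as f:
    f.write(payload)

print(looks_like_gzip(compressed_path))  # True
print(looks_like_gzip(plain_path))       # False
```

If the downloaded file fails this check, it was probably written already decompressed, which would also be consistent with the larger size.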
The link to the file is: http://www.zsrcpod.aviales.ru/modistlm/archive/tlm/geo/00000/28325/terra_77835_20140806_060059.geo.hdf.gz
The file requires a login and password, so I have added a screenshot of what I see when I follow the link: http://i.stack.imgur.com/DGqtS.jpg
It looks like some server option is set that declares this archive as text.
The response headers are:
{'content-length': '58277138',
'content-encoding': 'gzip',
'set-cookie': 'cidaviales=53616c7465645f5fc8f0abdb26f7b0536784ae4e8b302410a288f1f67ccc0afd13ce067d97ba237dc27749d9957f30457f1a1d9763b03637; path=/, avialestime=1407386483; path=/; expires=Wed, 05-Nov-2014 04:41:23 GMT, ciddaviales=53616c7465645f5fc8f0abdb26f7b0536784ae4e8b302410a288f1f67ccc0afd13ce067d97ba237dc27749d9957f30457f1a1d9763b03637; domain=aviales.ru; path=/',
'accept-ranges': 'bytes',
'server': 'Apache/1.3.37 (Unix) mod_perl/1.30',
'last-modified': 'Wed, 06 Aug 2014 06:17:14 GMT',
'etag': '"21d4e63-3793d12-53e1c86a"',
'date': 'Thu, 07 Aug 2014 04:41:23 GMT',
'content-type': 'text/plain; charset=windows-1251'}
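Given the 'content-encoding': 'gzip' header, my guess is that requests transparently decompresses the body inside iter_content, so the saved file is the unpacked data rather than the .gz archive itself. Below is a minimal sketch of what might avoid that, assuming that copying from `response.raw` leaves the bytes undecoded. `download_raw` and its parameter names are hypothetical, and this has not been tested against the actual server.

```python
import shutil


def download_raw(session, login_url, payload, file_url, dest_path):
    """Log in, then save the response body as it was sent on the wire.

    `session` is expected to behave like requests.Session. The function
    and parameter names are illustrative only.
    """
    session.post(login_url, data=payload)
    response = session.get(file_url, stream=True)
    response.raise_for_status()
    with open(dest_path, "wb") as out_file:
        # response.raw is the underlying urllib3 stream; reading it directly
        # should skip the gzip decoding that iter_content applies.
        shutil.copyfileobj(response.raw, out_file)
    return dest_path


# Intended usage (needs the third-party requests package and valid credentials):
#
#     import requests
#     with requests.Session() as s:
#         download_raw(s, login_to_site_URL, payload, scene, path_to_file)
```

The idea is only to bypass the content decoding step; the login request and the streamed GET stay the same as in my code above.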
How can I properly download this file using the Python requests library?