There is a link with a gif image, but urllib2 can't download it.
import urllib.request as urllib2

uri = 'http://ums.adtechjp.com/mapuser?providerid=1074;userid=AapfqIzytwl7ks8AA_qiU_BNUs8AAAFYqnZh4Q'

try:
    req = urllib2.Request(uri, headers={'User-Agent': 'Mozilla/5.0'})
    file = urllib2.urlopen(req)
except urllib2.HTTPError as err:
    print('HTTP error!!!')
    file = err
    print(err.code)
except urllib2.URLError as err:
    print('URL error!!!')
    print(err.reason)
    raise SystemExit  # a bare `return` is only valid inside a function

data = file.read(1024)
print(data)
After the script finishes, data remains empty. Why does this happen? There is no HTTPError, and in the browser console I can see that there is a valid GIF and the status of the HTTP response is 200 OK. Thank you.
You should check all the headers which the browser sends to the server. This page needs two headers: User-Agent and Cookie.

If you use DevTools in Chrome or Firefox, you will see that a browser which has no cookie yet first receives a response with a cookie and status 302 Moved Temporarily, which redirects to the same URL; the second request, now carrying the cookie, receives the image.

You can try my cookie below and it may still return the image, but normally you have to make two requests - the first to get the cookie and the second (with the cookie) to get the image.
import urllib.request as urllib2

uri = 'http://ums.adtechjp.com/mapuser?providerid=1074;userid=AapfqIzytwl7ks8AA_qiU_BNUs8AAAFYqnZh4Q'

headers = {
    'User-Agent': 'Mozilla/5.0',
    'Cookie': 'JEB2=583077046E650E2495131DE8FD2F1371',
}

try:
    req = urllib2.Request(uri, headers=headers)
    f = urllib2.urlopen(req)
except urllib2.HTTPError as err:
    print('HTTP error!!!')
    f = err
    print(err.code)
except urllib2.URLError as err:
    print('URL error!!!')
    print(err.reason)
    raise SystemExit  # there is no response object to read in this case

data = f.read(1024)
print(data)
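The two-request flow mentioned above can be sketched as follows. A tiny local stand-in server mimics the described behaviour (the server, the cookie value and the image bytes here are assumptions for demonstration only); `http.client` is used because, unlike `urlopen`, it does not follow redirects on its own, so the Set-Cookie header of the 302 response can be read directly:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the ad server described above (an assumption for demo):
# no cookie -> 302 redirect with Set-Cookie; cookie present -> the image.
class FakeAdServer(BaseHTTPRequestHandler):
    def do_GET(self):
        if 'JEB2=' in self.headers.get('Cookie', ''):
            body = b'GIF89a fake image bytes'
            self.send_response(200)
            self.send_header('Content-Type', 'image/gif')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(302)
            self.send_header('Set-Cookie', 'JEB2=TESTCOOKIE; Path=/')
            self.send_header('Location', self.path)
            self.send_header('Content-Length', '0')
            self.end_headers()

    def log_message(self, *args):  # keep output quiet
        pass

server = HTTPServer(('127.0.0.1', 0), FakeAdServer)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# First request: http.client does not follow redirects, so the
# Set-Cookie header can be read straight off the 302 response.
conn = http.client.HTTPConnection(host, port)
conn.request('GET', '/mapuser', headers={'User-Agent': 'Mozilla/5.0'})
resp = conn.getresponse()
cookie = resp.getheader('Set-Cookie').split(';')[0]  # name=value part
resp.read()  # drain the body before reusing the connection

# Second request: resend the cookie to get the image.
conn.request('GET', '/mapuser',
             headers={'User-Agent': 'Mozilla/5.0', 'Cookie': cookie})
data = conn.getresponse().read()
print(data[:6])  # b'GIF89a'
server.shutdown()
```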
If you use the requests module then it handles the cookie and the redirect automatically, and you will not need two requests.
import requests

uri = 'http://ums.adtechjp.com/mapuser?providerid=1074;userid=AapfqIzytwl7ks8AA_qiU_BNUs8AAAFYqnZh4Q'

headers = {
    'User-Agent': 'Mozilla/5.0',
}

r = requests.get(uri, headers=headers)
print(r.content)
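If you prefer to stay with urllib, an opener built with `HTTPCookieProcessor` and a `CookieJar` also stores the cookie from the 302 response and resends it on the redirected request automatically, with no hardcoded Cookie header. A minimal sketch against the same kind of local stand-in server (the server, cookie name and image bytes are assumptions for demonstration):

```python
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer

# Local stand-in for the ad server (an assumption for demo):
# no cookie -> 302 redirect with Set-Cookie; cookie present -> the image.
class FakeAdServer(BaseHTTPRequestHandler):
    def do_GET(self):
        if 'JEB2=' in self.headers.get('Cookie', ''):
            body = b'GIF89a fake image bytes'
            self.send_response(200)
            self.send_header('Content-Type', 'image/gif')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(302)
            self.send_header('Set-Cookie', 'JEB2=TESTCOOKIE; Path=/')
            self.send_header('Location', self.path)
            self.send_header('Content-Length', '0')
            self.end_headers()

    def log_message(self, *args):  # keep output quiet
        pass

server = HTTPServer(('127.0.0.1', 0), FakeAdServer)
threading.Thread(target=server.serve_forever, daemon=True).start()
uri = 'http://127.0.0.1:%d/mapuser' % server.server_address[1]

# The CookieJar captures Set-Cookie from the 302 and the opener resends
# it automatically when the redirect is followed.
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(CookieJar()))
req = urllib.request.Request(uri, headers={'User-Agent': 'Mozilla/5.0'})
with opener.open(req) as f:
    data = f.read()

print(data[:6])  # b'GIF89a'
server.shutdown()
```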