Looking at the source of urllib2 it looks like the easiest way to do it would be to subclass HTTPRedirectHandler and then use build_opener to override the default HTTPRedirectHandler, but this seems like a lot of (relatively complicated) work to do what seems like it should be pretty simple.
可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
回答1:
Here is the Requests way:
import requests
r = requests.get('http://github.com', allow_redirects=False)
print(r.status_code, r.headers['Location'])
回答2:
Dive Into Python has a good chapter on handling redirects with urllib2. Another solution is httplib.
>>> import httplib
>>> conn = httplib.HTTPConnection("www.bogosoft.com")
>>> conn.request("GET", "")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
301 Moved Permanently
>>> print r1.getheader('Location')
http://www.bogosoft.com/new/location
回答3:
This is a urllib2 handler that will not follow redirects:
class NoRedirectHandler(urllib2.HTTPRedirectHandler):
def http_error_302(self, req, fp, code, msg, headers):
infourl = urllib.addinfourl(fp, headers, req.get_full_url())
infourl.status = code
infourl.code = code
return infourl
http_error_300 = http_error_302
http_error_301 = http_error_302
http_error_303 = http_error_302
http_error_307 = http_error_302
opener = urllib2.build_opener(NoRedirectHandler())
urllib2.install_opener(opener)
回答4:
i suppose this would help
from httplib2 import Http
def get_html(uri,num_redirections=0): # put it as 0 for not to follow redirects
conn = Http()
return conn.request(uri,redirections=num_redirections)
回答5:
The redirections
keyword in the httplib2
request method is a red herring. Rather than return the first request it will raise a RedirectLimit
exception if it receives a redirection status code. To return the inital response you need to set follow_redirects
to False
on the Http
object:
import httplib2
h = httplib2.Http()
h.follow_redirects = False
(response, body) = h.request("http://example.com")
回答6:
I second olt's pointer to Dive into Python. Here's an implementation using urllib2 redirect handlers, more work than it should be? Maybe, shrug.
import sys
import urllib2
class RedirectHandler(urllib2.HTTPRedirectHandler):
def http_error_301(self, req, fp, code, msg, headers):
result = urllib2.HTTPRedirectHandler.http_error_301(
self, req, fp, code, msg, headers)
result.status = code
raise Exception("Permanent Redirect: %s" % 301)
def http_error_302(self, req, fp, code, msg, headers):
result = urllib2.HTTPRedirectHandler.http_error_302(
self, req, fp, code, msg, headers)
result.status = code
raise Exception("Temporary Redirect: %s" % 302)
def main(script_name, url):
opener = urllib2.build_opener(RedirectHandler)
urllib2.install_opener(opener)
print urllib2.urlopen(url).read()
if __name__ == "__main__":
main(*sys.argv)
回答7:
The shortest way however is
class NoRedirect(urllib2.HTTPRedirectHandler):
def redirect_request(self, req, fp, code, msg, hdrs, newurl):
pass
noredir_opener = urllib2.build_opener(NoRedirect())