HTTP redirection code 3XX in Python requests

Posted 2019-02-02 05:50

I am trying to capture the HTTP 3XX status code (e.g. 302) for a URL that redirects, but I always get a 200 status code instead.

Here is the code:

import requests
r = requests.get('http://goo.gl/NZek5')
print(r.status_code)

I expected this to return either 301 or 302 because the URL redirects to another page. I tried a few other redirecting URLs (e.g. http://fb.com) but they also return 200. What should be done to capture the redirection code properly?

3 Answers
可以哭但决不认输i
Reply #2 · 2019-02-02 06:46

requests.get allows for an optional keyword argument allow_redirects which defaults to True. Setting allow_redirects to False will disable automatically following redirects, as follows:

In [1]: import requests
In [2]: r = requests.get('http://goo.gl/NZek5', allow_redirects=False)
In [3]: print(r.status_code)
301
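
As a rough sketch of how the unfollowed response might be inspected: the helper below returns the Location header when the status code is in the 3XX range. The name redirect_target is hypothetical and not part of requests. It relies only on the status_code and headers attributes, so it should accept the object returned by requests.get(url, allow_redirects=False).

```python
def redirect_target(response):
    """Return the Location header of a 3XX response, or None otherwise.

    `response` is assumed to expose `status_code` and `headers`, as the
    object returned by requests.get(url, allow_redirects=False) does.
    """
    if 300 <= response.status_code < 400:
        # Not every 3XX carries a Location header (e.g. 304 Not Modified).
        return response.headers.get("Location")
    return None


# Intended use (needs network access, so it is left as a comment):
#   import requests
#   r = requests.get('http://goo.gl/NZek5', allow_redirects=False)
#   print(r.status_code, redirect_target(r))
```

requests responses also expose an is_redirect property, which may be enough if only a yes/no answer is needed.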
你好瞎i
Reply #3 · 2019-02-02 06:54

requests handles redirects for you; see the "Redirection and History" section of the requests documentation.

Set allow_redirects=False if you don't want requests to handle redirections, or you can inspect the redirection responses contained in the r.history list.

Demo:

>>> import requests
>>> r = requests.get('http://goo.gl/NZek5')
>>> r.history
(<Response [301]>,)
>>> r.history[0].status_code
301
>>> r.history[0].headers['Location']
'http://docs.python-requests.org/en/latest/user/quickstart/'
>>> r.url
u'http://docs.python-requests.org/en/latest/user/quickstart/'
>>> r = requests.get('http://goo.gl/NZek5', allow_redirects=False)
>>> r.status_code
301
>>> r.url
u'http://goo.gl/NZek5'

So if allow_redirects is True, the redirects are followed and the response returned is the final page. If allow_redirects is False, the first response is returned, even if it is a redirect.
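
To make the relationship between r.history and the final response concrete, here is a small sketch. The name redirect_chain is hypothetical and not part of requests. It assumes only that the response exposes history, status_code and url, as requests responses do.

```python
def redirect_chain(response):
    """List (status_code, url) for every hop, ending with the final response."""
    # response.history holds the intermediate redirect responses, oldest first.
    hops = [(resp.status_code, resp.url) for resp in response.history]
    hops.append((response.status_code, response.url))
    return hops


# Intended use (needs network access, so it is left as a comment):
#   import requests
#   r = requests.get('http://goo.gl/NZek5')
#   for status, url in redirect_chain(r):
#       print(status, url)
```

With allow_redirects=False the history is empty, so the chain contains only the first response.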

仙女界的扛把子
Reply #4 · 2019-02-02 06:54

This solution identifies redirects, displays the redirect history, and handles common errors. It prompts for a URL in the console.

import requests

def init():
    console = input("Type the URL: ")
    get_status_code_from_request_url(console)


def get_status_code_from_request_url(url, do_restart=True):
    try:
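        # requests sets no timeout by default, so Timeout can only be raised if one is given.
        r = requests.get(url, timeout=10)
        if not r.history:
            print("Status Code: " + str(r.status_code))
        else:
            print("Final Status Code: " + str(r.status_code) + ". Below are the redirects")
            # Each entry in r.history is a redirect response with its own status code.
            for i, resp in enumerate(r.history):
                print("  " + str(i) + " - " + str(resp.status_code) + " - URL " + resp.url)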
        if do_restart:
            init()
    except requests.exceptions.MissingSchema:
        print("You forgot the protocol. http:// or https://")
    except requests.exceptions.Timeout:
        # Checked before ConnectionError because ConnectTimeout subclasses both.
        print("Sorry, but I couldn't connect. I timed out.")
    except requests.exceptions.ConnectionError:
        print("Sorry, but I couldn't connect. There was a connection problem.")
    except requests.exceptions.TooManyRedirects:
        print("There were too many redirects.  I can't count that high.")


init()