How do I get the URL of an HTTP redirect's tar

Posted 2019-06-21 04:13

Question:

I am writing client-side Python unit tests to verify whether the HTTP 302 redirects on my Google App Engine site are pointing to the right pages. So far, I have been calling urllib2.urlopen(my_url).geturl(). However, I have encountered 2 issues:

  1. the URL returned by geturl() does not appear to include URL query strings like ?k1=v1&k2=v2; how can I see these? (I need to check whether I correctly passed along the visitor's original URL query string to the redirect page.)
  2. geturl() shows the final URL after any additional redirects. I just care about the first redirect (the one from my site); I am agnostic to anything after that. For example, let's assume my site is example.com. If a user requests http://www.example.com/somepath/?q=foo, I might want to redirect them to http://www.anothersite.com?q=foo. That other site might do another redirect to http://subdomain.anothersite.com?q=foo, which I can't control or predict. How can I make sure my redirect is correct?

Answer 1:

Pass follow_redirects=False to App Engine's urlfetch.fetch function, then read the target of the first redirect from the Location header of the response, like so:

from google.appengine.api import urlfetch

# Do not follow the redirect; return the 3xx response itself.
response = urlfetch.fetch(your_url, follow_redirects=False)
location = response.headers['Location']
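Once you have the Location value, the query-string check from the question can be done by parsing both URLs rather than comparing strings. The following is a minimal sketch, not part of the original answer: the helper name same_query is hypothetical, and it uses the Python 3 module urllib.parse (on Python 2 the same functions live in urlparse).

```python
from urllib.parse import urlsplit, parse_qs


def same_query(original_url, location):
    """Return True if both URLs carry the same query parameters.

    Parameter order is ignored because parse_qs returns a dict.
    Note: parse_qs drops parameters with blank values by default.
    """
    original = parse_qs(urlsplit(original_url).query)
    redirected = parse_qs(urlsplit(location).query)
    return original == redirected
```

In a unit test this might be used as assertTrue(same_query(your_url, location)), where your_url and location are the names from the snippet above.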


Answer 2:

Use httplib and check the status code and Location header of the response. Unlike urllib2, httplib does not follow redirects automatically, and that auto-following is what is impeding your testing. There's a good example here.
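As a rough illustration of the httplib approach, here is a minimal sketch. It is written against Python 3, where httplib is named http.client; the function name first_redirect and the timeout default are assumptions made for this example, not something from the original answer.

```python
# Python 3 names; on Python 2 these modules are httplib and urlparse.
import http.client
from urllib.parse import urlsplit


def first_redirect(url, timeout=10):
    """Send one GET request and return (status, Location header).

    http.client never follows redirects on its own, so the Location
    value is the target of the first redirect, query string included.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    # Rebuild the request target so the original query string is sent.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

A test could then assert that the status is 302 and that the returned Location equals the expected target, for example http://www.anothersite.com?q=foo for a request to http://www.example.com/somepath/?q=foo. Any later redirects by the other site never come into play because only one request is made.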