Python: urllib/urllib2/httplib confusion

Posted 2019-01-12 16:47

I'm trying to test the functionality of a web app by scripting a login sequence in Python, but I'm running into some trouble.

Here's what I need to do:

  1. Do a POST with a few parameters and headers.
  2. Follow a redirect.
  3. Retrieve the HTML body.

Now, I'm relatively new to Python, and the two things I've tried so far haven't worked. First I used httplib, with putrequest() (passing the parameters within the URL) and putheader(). This didn't seem to follow the redirects.

Then I tried urllib and urllib2, passing both headers and parameters as dicts. This seems to return the login page instead of the page I'm trying to log in to. I guess it's because of a lack of cookies or something.

Am I missing something simple?

Thanks.

8 answers
做个烂人
#2 · 2019-01-12 17:22

@S.Lott, thank you. Your suggestion worked for me, with some modification. Here's how I did it.

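import urllib
import urllib2
from cookielib import CookieJar

# params, headers, host and page are assumed to be defined earlier.
data = urllib.urlencode(params)
url = host + page
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)

# Collect the cookies set by the login response.
cookies = CookieJar()
cookies.extract_cookies(response, request)

# Build an opener that sends those cookies and follows redirects.
cookie_handler = urllib2.HTTPCookieProcessor(cookies)
redirect_handler = urllib2.HTTPRedirectHandler()
opener = urllib2.build_opener(redirect_handler, cookie_handler)

response = opener.open(request)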
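
For anyone on Python 3, where urllib2 and cookielib became urllib.request and http.cookiejar, here is a minimal sketch of the same idea. It is not a drop-in replacement for the code above: the helper names make_session and post_form are made up for illustration, and it attaches the cookie processor before the first request, so a single POST should be enough. build_opener() already installs a redirect handler by default, so only the cookie handler is added explicitly.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return an opener that stores cookies and follows redirects, plus its jar.

    Hypothetical helper, named for illustration only.
    """
    jar = CookieJar()
    # build_opener() includes HTTPRedirectHandler by default, so only the
    # cookie processor needs to be passed in explicitly.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def post_form(opener, url, params, headers=None):
    """POST url-encoded params through the opener and return the final body.

    Hypothetical helper: url, params and headers come from the caller.
    """
    data = urllib.parse.urlencode(params).encode("utf-8")
    request = urllib.request.Request(url, data=data, headers=headers or {})
    # Cookies set by any response in the redirect chain are stored in the
    # jar and sent back on later requests made with the same opener.
    with opener.open(request) as response:
        return response.read()
```

Calling make_session() once and then passing the same opener to post_form() for the login, and to opener.open() for later pages, should keep the session cookie across requests.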
Fickle 薄情
#3 · 2019-01-12 17:22

I'd give Mechanize (http://wwwsearch.sourceforge.net/mechanize/) a shot. It may well handle your cookie/headers transparently.
