I want to download and parse webpage using python, but to access it I need a couple of cookies set. Therefore I need to login over https to the webpage first. The login moment involves sending two POST params (username, password) to /login.php. During the login request I want to retrieve the cookies from the response header and store them so I can use them in the request to download the webpage /data.php.
How would I do this in python (preferably 2.6)? If possible I only want to use builtin modules.
import urllib, urllib2, cookielib
username = \'myuser\'
password = \'mypassword\'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({\'username\' : username, \'j_password\' : password})
opener.open(\'http://www.example.com/login.php\', login_data)
resp = opener.open(\'http://www.example.com/hiddenpage.php\')
print resp.read()
resp.read()
is the straight html of the page you want to open, and you can use opener
to view any page using your session cookie.
Here\'s a version using the excellent requests library:
from requests import session
payload = {
\'action\': \'login\',
\'username\': USERNAME,
\'password\': PASSWORD
}
with session() as c:
c.post(\'http://example.com/login.php\', data=payload)
response = c.get(\'http://example.com/protected_page.php\')
print(response.headers)
print(response.text)