Cannot login to website using requests module (Pyt

Posted 2019-09-12 01:08

Question:

I am trying to access a website to scrape some information, but I am having trouble posting the login information through Python. Here is my code so far:

import requests

c = requests.Session()
url = 'https://subscriber.hoovers.com/H/login/login.html'
USERNAME = 'user'
PASSWORD = 'pass'

c.get(url)
csrftoken = c.cookies['csrftoken']
login_data = dict(j_username=USERNAME, j_password=PASSWORD,
                  csrfmiddlewaretoken=csrftoken, next='/')
c.post(url, data=login_data, headers=dict(Referer=url))
page = c.get('http://subscriber.hoovers.com/H/home/index.html')
print(page.content)

Here is the form data from the post login page:

j_username: user
j_password: pass
OWASP_CSRFTOKEN: 8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC
OWASP_CSRFTOKEN: 8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC

Here is the error I receive:

Traceback (most recent call last):
  File "C:/Users/10023539/Desktop/pyscripts/webscraper ex.py", line 9, in <module>
    csrftoken = c.cookies['csrftoken']
  File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 293, in __getitem__
    return self._find_no_duplicates(name)
  File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 351, in _find_no_duplicates
    raise KeyError('name=%r, domain=%r, path=%r' % (name, domain, path))
KeyError: "name='csrftoken', domain=None, path=None"

I believe the issue has something to do with the 'OWASP_CSRFTOKEN' label? I haven't found any solutions for this specific CSRF name anywhere online. I've also tried removing the c.cookies lookup and manually typing the CSRF code into the csrfmiddlewaretoken argument, and I've tried changing the Referer URL around, but I still get the same error.

Any assistance would be greatly appreciated.

Answer 1:

First of all, you are getting a KeyError exception, which means the cookies dictionary has no key named csrftoken.

So you need to inspect the response to find the right CSRF token cookie name. For example, you can print all cookies:

for key in c.cookies.keys():
    print('%s: %s' % (key, c.cookies[key]))

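UPD: Actually, your response has no CSRF cookie at all. You need to look for the token in the HTML of the login page (the .text of the response returned by c.get(url)), for example with pyquery. It sits in a hidden input: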

<input type="hidden" name="OWASP_CSRFTOKEN" class="csrfClass" value="X48L-NEYI-CG18-SJOD-VDW9-FGEB-7WIT-88P4">
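A minimal sketch of how that token could be read from the page and posted back. It swaps pyquery for the standard library's html.parser so that it runs without extra packages. The helper names (HiddenInputParser, extract_hidden_field, login) are hypothetical, and the form field names are taken from the form data in the question. Whether the site accepts this exact form, or expects additional fields or headers, is an assumption that has not been checked against the real server.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.hidden[attr["name"]] = attr.get("value") or ""


def extract_hidden_field(html, name):
    """Return the value of the hidden input called `name`, or None."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden.get(name)


def login(session, url, username, password, token_field="OWASP_CSRFTOKEN"):
    """Fetch the login page, read the token from its HTML, then post the form.

    `session` is expected to be a requests.Session created by the caller.
    Field names follow the form data shown in the question (assumption).
    """
    response = session.get(url)
    token = extract_hidden_field(response.text, token_field)
    if token is None:
        raise ValueError("no hidden input named %r in the login page" % token_field)
    form = {"j_username": username, "j_password": password, token_field: token}
    return session.post(url, data=form, headers={"Referer": url})


# Offline check against the hidden input quoted in the answer.
sample = ('<input type="hidden" name="OWASP_CSRFTOKEN" class="csrfClass" '
          'value="X48L-NEYI-CG18-SJOD-VDW9-FGEB-7WIT-88P4">')
print(extract_hidden_field(sample, "OWASP_CSRFTOKEN"))
```

With the session from the question, the call would be login(c, url, USERNAME, PASSWORD). The token has to come from the same session that posts the form, since the server presumably ties it to the session cookie set by the first GET.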