I am trying to log in to a website so that I can scrape some information, but I am having trouble posting the login information through Python. Here is my code so far:
import requests
c = requests.Session()
url = 'https://subscriber.hoovers.com/H/login/login.html'
USERNAME = 'user'
PASSWORD = 'pass'
c.get(url)
csrftoken = c.cookies['csrftoken']
login_data = dict(j_username=USERNAME, j_password=PASSWORD,
                  csrfmiddlewaretoken=csrftoken, next='/')
c.post(url, data=login_data, headers=dict(Referer=url))
page = c.get('http://subscriber.hoovers.com/H/home/index.html')
print(page.content)
Here is the form data from the post login page:
j_username: user
j_password: pass
OWASP_CSRFTOKEN: 8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC
OWASP_CSRFTOKEN: 8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC
Here is the error I receive:
Traceback (most recent call last):
  File "C:/Users/10023539/Desktop/pyscripts/webscraper ex.py", line 9, in <module>
    csrftoken = c.cookies['csrftoken']
  File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 293, in __getitem__
    return self._find_no_duplicates(name)
  File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 351, in _find_no_duplicates
    raise KeyError('name=%r, domain=%r, path=%r' % (name, domain, path))
KeyError: "name='csrftoken', domain=None, path=None"
I believe the issue has something to do with the 'OWASP_CSRFTOKEN' label. I haven't found any solutions for this specific CSRF name anywhere online. I've also tried removing the c.cookies lookup and manually typing the CSRF code into the csrfmiddlewaretoken argument, and I've tried changing the Referer URL around, but I'm still getting the same error.
Any assistance would be greatly appreciated.
First of all, you are getting a KeyError exception, which means that the cookies dictionary has no csrftoken key. So you need to inspect the response to find the right CSRF token cookie name. For example, you can print all of the session's cookies and check which names the server actually set.
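A minimal sketch of that check, assuming the requests session from the question. The helper names list_cookies and print_login_cookies are made up for illustration; the only requests behaviour relied on is that iterating over session.cookies yields cookie objects with name, value, domain and path attributes.

```python
def list_cookies(jar):
    """Return (name, value, domain, path) for every cookie in a cookie jar.

    Works with the session.cookies jar from requests, which iterates
    over cookie objects exposing these four attributes.
    """
    return [(c.name, c.value, c.domain, c.path) for c in jar]


def print_login_cookies(url):
    """Fetch the login page and print each cookie the server set."""
    import requests  # third-party; imported lazily so list_cookies has no dependency

    session = requests.Session()
    session.get(url)
    for name, value, domain, path in list_cookies(session.cookies):
        print(name, value, domain, path)
    return session
```

Calling print_login_cookies('https://subscriber.hoovers.com/H/login/login.html') shows which cookie names exist. If csrftoken is not among them, c.cookies['csrftoken'] will raise the KeyError shown in the question.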
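UPD: Actually, your response has no CSRF cookie. You need to look for the token in the body of the login page, that is, in the text attribute of the response returned by c.get(url) (the session c itself has no text attribute), for example with pyquery.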
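Here is a rough sketch of pulling the token out of the page body. It uses the standard library's html.parser instead of pyquery so that it runs without extra packages, and it assumes the token sits in an input element named OWASP_CSRFTOKEN (the field name from the posted form data). The sample markup and the names HiddenInputParser and extract_token are invented for illustration; the real page may look different.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the value of every named <input> element, keyed by its name."""

    def __init__(self):
        super().__init__()
        self.inputs = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            self.inputs[name] = attributes.get("value", "")


def extract_token(html, field="OWASP_CSRFTOKEN"):
    """Return the value of the named form field, or None if it is absent."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.inputs.get(field)


# Hypothetical markup, for illustration only; the real login page may differ.
sample = '<form><input type="hidden" name="OWASP_CSRFTOKEN" value="TOKEN-123"></form>'
print(extract_token(sample))
```

With the session from the question, this would be used roughly as token = extract_token(c.get(url).text), and the login data would then send it as OWASP_CSRFTOKEN=token rather than csrfmiddlewaretoken. If the page injects the token with JavaScript instead of including it in the HTML, extract_token returns None and the token has to be obtained another way.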