I'm attempting to connect to a website that requires you to have a specific cookie to access it. For the sake of this question, we'll call the cookie 'required_cookie' and the value 'required_value'.
This is my code:
import urllib.request
import http.cookiejar
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
opener.addheaders = [('required_cookie', 'required_value'), ('User-Agent', 'Mozilla/5.0')]
urllib.request.install_opener(opener)
req = urllib.request.Request('https://www.thewebsite.com/')
webpage = urllib.request.urlopen(req).read()
print(webpage)
I'm new to urllib, so please answer as you would for a beginner.
To do this with urllib, you need to:
1. Construct a Cookie object. The constructor isn't documented in the docs, but if you run help(http.cookiejar.Cookie) in the interactive interpreter, you can see that its constructor demands values for all 16 attributes. Notice that the docs say, "It is not expected that users of http.cookiejar construct their own Cookie instances."
2. Add it to the cookie jar with cj.set_cookie(cookie).
3. Have the cookie jar add the matching Cookie header to the request with cj.add_cookie_header(req).
Assuming you've configured the policy correctly, you're set.
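As a rough sketch of the three steps above, the following builds the cookie by hand, stores it in the jar, and asks the jar to attach the Cookie header to the request. The helper name make_cookie and the attribute values it passes (version 0, path "/", a session cookie with no expiry) are assumptions chosen for illustration, and the domain is taken from the placeholder URL in the question. It relies on the default cookie policy and sends nothing over the network.

```python
import http.cookiejar
import urllib.request

URL = "https://www.thewebsite.com/"  # placeholder URL from the question


def make_cookie(name, value, domain):
    """Hypothetical helper: fill in all 16 required Cookie attributes."""
    return http.cookiejar.Cookie(
        version=0,                 # Netscape-style cookie (assumed)
        name=name,
        value=value,
        port=None,
        port_specified=False,
        domain=domain,
        domain_specified=True,
        domain_initial_dot=False,
        path="/",                  # valid for the whole site (assumed)
        path_specified=True,
        secure=False,
        expires=None,              # session cookie, no expiry (assumed)
        discard=True,
        comment=None,
        comment_url=None,
        rest={},
    )


# Step 1 and 2: construct the cookie and put it in the jar.
cj = http.cookiejar.CookieJar()
cj.set_cookie(make_cookie("required_cookie", "required_value", "www.thewebsite.com"))

# Step 3: let the jar add a Cookie header if the cookie matches the request.
# Note that the method name is add_cookie_header, in the singular.
req = urllib.request.Request(URL, headers={"User-Agent": "Mozilla/5.0"})
cj.add_cookie_header(req)
```

Passing req to urllib.request.urlopen would then send the request with the cookie attached; req.get_header("Cookie") lets you check what the jar added before sending anything.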
But this is a huge pain. As the docs for urllib.request say: "The Requests package is recommended for a higher-level HTTP client interface."
And unless you have some good reason you can't install requests, you really should go that way. urllib is tolerable for really simple cases, and it can be handy when you need to get deep under the covers, but for everything else, requests is much better.
With requests, your whole program becomes a one-liner: a single requests.get call that takes the cookie in its cookies argument and the User-Agent in its headers argument, although it's probably more readable as a few lines.
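A minimal sketch of what the requests version might look like, assuming requests is installed (pip install requests). The function name fetch, the constant names, and the 10-second timeout are illustrative choices rather than anything the question requires; the URL, cookie, and User-Agent are the placeholders from the question.

```python
import requests

URL = "https://www.thewebsite.com/"  # placeholder URL from the question
COOKIES = {"required_cookie": "required_value"}
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch(url):
    """Hypothetical helper: GET the page with the cookie and return its text."""
    # requests turns the cookies dict into a Cookie header for us.
    response = requests.get(url, cookies=COOKIES, headers=HEADERS, timeout=10)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text
```

Calling print(fetch(URL)) would then print the page. Unlike urlopen(...).read(), which returns bytes, response.text is already decoded to a str.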