I am trying to open an https URL using the urlopen method in Python 3's urllib.request module. It seems to work fine, but the documentation warns that "[i]f neither cafile nor capath is specified, an HTTPS request will not do any verification of the server’s certificate".
I am guessing I need to specify one of those parameters if I don't want my program to be vulnerable to man-in-the-middle attacks, problems with revoked certificates, and other vulnerabilities.
cafile and capath are supposed to point to a list of certificates. Where am I supposed to get this list from? Is there any simple and cross-platform way to use the same list of certificates that my OS or browser uses?
Works in Python 2.7 and above
You can download Mozilla's certificates in a format usable by urllib (e.g. PEM format) at http://curl.haxx.se/docs/caextract.html
Elias Zamaria's answer still works, but gives a deprecation warning:
I was able to solve the same problem this way instead (using Python 3.7.0):
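A minimal sketch of one way to avoid the deprecated arguments, assuming the approach is to build an ssl.SSLContext and pass it to urlopen through its context parameter. The helper names make_context and fetch are hypothetical and exist only for illustration:

```python
import ssl
import urllib.request


def make_context(cafile=None):
    """Return an SSLContext that verifies certificates and hostnames.

    With cafile=None the platform's default trust store is loaded;
    otherwise CA certificates are read from the given PEM file.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None):
    """Open url with server certificate verification enabled."""
    context = make_context(cafile)
    return urllib.request.urlopen(url, context=context)
```

Calling fetch(url) with no cafile relies on whatever certificates the operating system provides. To pin the CA list instead, pass the path of a PEM bundle as cafile, for example the value returned by certifi.where() if Certifi is installed.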
I found a library that does what I'm trying to do: Certifi. It can be installed by running
pip install certifi
from the command line.
Making requests and verifying them is now easy:
As I expected, this returns an HTTPResponse object for a site with a valid certificate and raises an ssl.CertificateError exception for a site with an invalid certificate.
Different Linux distributions use different package names for their certificate bundles. I tested on CentOS and Ubuntu. These certificate bundles are updated along with system updates, so you can simply detect which bundle is available and use it with urlopen.
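The detection step could be sketched roughly as follows. The candidate paths are assumptions about where Debian/Ubuntu and CentOS/RHEL systems usually keep their bundles, not something stated in the answer, and find_ca_bundle and open_with_system_bundle are hypothetical helper names:

```python
import os
import ssl
import urllib.request

# Assumed typical bundle locations; adjust for the distributions you target.
CANDIDATE_BUNDLES = (
    "/etc/ssl/certs/ca-certificates.crt",  # Debian, Ubuntu
    "/etc/pki/tls/certs/ca-bundle.crt",    # CentOS, RHEL
)


def find_ca_bundle(candidates=CANDIDATE_BUNDLES):
    """Return the first candidate path that is an existing file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def open_with_system_bundle(url):
    """Open url, verifying against the first CA bundle that was found."""
    bundle = find_ca_bundle()
    if bundle is None:
        raise FileNotFoundError("no known CA bundle found on this system")
    context = ssl.create_default_context(cafile=bundle)
    return urllib.request.urlopen(url, context=context)
```

Raising an error when no bundle is found, rather than silently falling back to an unverified connection, keeps the failure visible.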