I'm using a proxy (behind corporate firewall), to login to an https domain. The SSL handshake doesn't seem to be going well:
CertificateError: hostname 'ats.finra.org:443' doesn't match 'ats.finra.org'
I'm using Python 2.7.9 - Mechanize and I've gotten past all of the login, password, security questioon screens, but it is getting hung up on the certification.
Any help would be amazing. I've tried the monkeywrench found here: Forcing Mechanize to use SSLv3
Doesn't work for my code though.
If you want the code file I'd be happy to send.
You can avoid this error by monkey patching ssl:
import ssl
ssl.match_hostname = lambda cert, hostname: True
This bug in ssl.math_hostname appears in v2.7.9 (it's not in 2.7.5), and has to do with not stripping out the hostname from the hostname:port syntax. The following rewrite of ssl.match_hostname fixes the bug. Put this before your mechanize code:
import functools, re, urlparse
import ssl
old_match_hostname = ssl.match_hostname
@functools.wraps(old_match_hostname)
def match_hostname_bugfix_ssl_py_2_7_9(cert, hostname):
m = re.search(r':\d+$',hostname) # hostname:port
if m is not None:
o = urlparse.urlparse('https://' + hostname)
hostname = o.hostname
old_match_hostname(cert, hostname)
ssl.match_hostname = match_hostname_bugfix_ssl_py_2_7_9
The following mechanize code should now work:
import mechanize
import cookielib
br = mechanize.Browser()
# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
# Follows refresh 0 but not hang on refresh > 0
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.addheaders = [('User-Agent', 'Nutscrape 1.0')]
# Use this proxy
br.set_proxies({"http": "localhost:3128", "https": "localhost:3128"})
r = br.open('https://www.duckduckgo.com:443/')
html = br.response().read()
# Examine the html response from a browser
f = open('foo.html','w')
f.write(html)
f.close()
In my case the certificate's DNS name was ::1
(for local testing purposes) and hostname verification failed with
ssl.CertificateError: hostname '::1' doesn't match '::1'
To fix it somewhat properly I monkey patched ssl.match_hostname
with
import ssl
ssl.match_hostname = lambda cert, hostname: hostname == cert['subjectAltName'][0][1]
Which actually checks if the hostnames match.