I'm trying to download and process JPEGs from URLs. My issue isn't that certificate verification fails for some URLs, as these URLs are old and may no longer be trustworthy, but that when I wrap the request in try...except to catch the SSLCertVerificationError, I still get the traceback.
System:
Linux 4.17.14-arch1-1-ARCH, python 3.7.0-3, aiohttp 3.3.2
Minimal example:
import asyncio
import aiohttp
from ssl import SSLCertVerificationError

async def fetch_url(url, client):
    try:
        async with client.get(url) as resp:
            print(resp.status)
            print(await resp.read())
    except SSLCertVerificationError as e:
        print('Error handled')

async def main(urls):
    tasks = []
    async with aiohttp.ClientSession(loop=loop) as client:
        for url in urls:
            task = asyncio.ensure_future(fetch_url(url, client))
            tasks.append(task)
        return await asyncio.gather(*tasks)

loop = asyncio.get_event_loop()
loop.run_until_complete(main(['https://images.photos.com/']))
Output:
SSL handshake failed on verifying the certificate
protocol: <asyncio.sslproto.SSLProtocol object at 0x7ffbecad8ac8>
transport: <_SelectorSocketTransport fd=6 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
File "/usr/lib/python3.7/asyncio/sslproto.py", line 625, in _on_handshake_complete
raise handshake_exc
File "/usr/lib/python3.7/asyncio/sslproto.py", line 189, in feed_ssldata
self._sslobj.do_handshake()
File "/usr/lib/python3.7/ssl.py", line 763, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'images.photos.com'. (_ssl.c:1045)
SSL error in data received
protocol: <asyncio.sslproto.SSLProtocol object at 0x7ffbecad8ac8>
transport: <_SelectorSocketTransport closing fd=6 read=idle write=<idle, bufsize=0>>
Traceback (most recent call last):
File "/usr/lib/python3.7/asyncio/sslproto.py", line 526, in data_received
ssldata, appdata = self._sslpipe.feed_ssldata(data)
File "/usr/lib/python3.7/asyncio/sslproto.py", line 189, in feed_ssldata
self._sslobj.do_handshake()
File "/usr/lib/python3.7/ssl.py", line 763, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'images.photos.com'. (_ssl.c:1045)
Error handled
The traceback is generated by asyncio's implementation of the SSL protocol, which invokes the event loop's exception handler. Through a maze of interactions between the transport and the streaming interface, it happens that this exception is both logged by the event loop and propagated to the API user. The way it happens is as follows:
- An exception occurs during the SSL handshake.
- SSLProtocol._on_handshake_complete receives non-None handshake_exc and treats it as a "fatal error" (in the handshake context), i.e. invokes self._fatal_error and returns.
- _fatal_error calls the event loop's exception handler to log the error. The handler is normally invoked for exceptions that occur in queued callbacks where there is no longer a caller to propagate them to, so it just logs the traceback to standard error to ensure that the exception doesn't pass silently. However...
- _fatal_error goes on to call transport._force_close, which calls connection_lost back on the protocol.
- The stream reader protocol's connection_lost implementation sets the exception as the result of the stream reader's future, thus propagating it to the users of the stream API that await it.
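The double reporting described in these steps can be reproduced without any network access. The following is a minimal sketch, not asyncio's actual code: FakeProtocol and fatal_error are hypothetical stand-ins that mimic the pattern of calling the loop's exception handler and then also handing the same exception to an awaiting caller through a future.

```python
import asyncio


class FakeProtocol:
    """Hypothetical stand-in for the transport/protocol pair (not asyncio code)."""

    def __init__(self, loop):
        self.loop = loop
        self.waiter = loop.create_future()  # what the API user awaits

    def fatal_error(self, exc):
        # Step 1: report to the event loop's exception handler (this is
        # what prints the traceback when the default handler is active).
        self.loop.call_exception_handler({
            'message': 'simulated handshake failure',
            'exception': exc,
        })
        # Step 2: also propagate the same exception to the awaiting caller.
        self.waiter.set_exception(exc)


async def demo():
    loop = asyncio.get_running_loop()
    reported = []
    # Record handler calls instead of logging them to stderr.
    loop.set_exception_handler(
        lambda _loop, ctx: reported.append(ctx['exception']))

    proto = FakeProtocol(loop)
    proto.fatal_error(ConnectionError('simulated'))

    caught = None
    try:
        await proto.waiter
    except ConnectionError as e:
        caught = e  # catching here does not undo the earlier report
    return reported, caught


reported, caught = asyncio.run(demo())
print(reported[0] is caught)  # the same exception object reached both places
```

Catching the exception at the await only deals with the second path; the handler has already been called by then, which matches the behavior seen in the question.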
It is not obvious whether it is a bug or a feature that the same exception is both logged by the event loop and passed to connection_lost. It might be a workaround for BaseProtocol.connection_lost being defined as a no-op, so the extra log ensures that a protocol that simply inherits from BaseProtocol doesn't silence the possibly sensitive exceptions occurring during the SSL handshake. Whatever the reason, the current behavior leads to the problem experienced by the OP: catching the exception is not enough to suppress it; a traceback will still be logged.
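The point about BaseProtocol.connection_lost being a no-op can be checked directly. A small sketch using only the standard library; SilentProtocol is a hypothetical name for a protocol that inherits the method unchanged.

```python
import asyncio


class SilentProtocol(asyncio.Protocol):
    """Hypothetical protocol that inherits connection_lost unchanged."""


proto = SilentProtocol()
# The inherited implementation accepts the exception and does nothing with it,
# so without a separate log entry the error would vanish at this point.
result = proto.connection_lost(RuntimeError('simulated handshake failure'))
print(result)
```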
To work around the issue, one can temporarily set the exception handler to one that doesn't report SSLCertVerificationError:
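import asyncio
import contextlib
from ssl import SSLCertVerificationError

@contextlib.contextmanager
def suppress_ssl_exception_report():
    loop = asyncio.get_event_loop()
    old_handler = loop.get_exception_handler()
    # Fall back to the default handler when no custom handler is installed.
    # The lambda is parenthesized so it parses as the right operand of `or`.
    old_handler_fn = old_handler or (lambda _loop, ctx: loop.default_exception_handler(ctx))

    def ignore_exc(_loop, ctx):
        exc = ctx.get('exception')
        if isinstance(exc, SSLCertVerificationError):
            return  # skip the report; the caller still receives the exception
        old_handler_fn(loop, ctx)

    loop.set_exception_handler(ignore_exc)
    try:
        yield
    finally:
        # Restore whatever was installed before (None means the default).
        loop.set_exception_handler(old_handler)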
Adding with suppress_ssl_exception_report() around the code in fetch_url suppresses the unwanted traceback.
The above works, but it strongly feels like a workaround for an underlying issue and not like correct API usage, so I filed a bug report in the tracker.
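To see the handler filtering in isolation, here is a hedged sketch that feeds simulated contexts to the loop's exception handler instead of making real connections. suppress_reports is a hypothetical generalization of the context manager from this answer that takes the loop and the exception types as parameters, and the seen list stands in for whatever handler was installed beforehand.

```python
import asyncio
import contextlib
from ssl import SSLCertVerificationError


@contextlib.contextmanager
def suppress_reports(loop, exc_types):
    """Hypothetical variant: skip handler reports for the given exception types."""
    old_handler = loop.get_exception_handler()

    def filtering_handler(_loop, ctx):
        if isinstance(ctx.get('exception'), exc_types):
            return  # swallow the report only; nothing is un-raised
        if old_handler is not None:
            old_handler(_loop, ctx)
        else:
            _loop.default_exception_handler(ctx)

    loop.set_exception_handler(filtering_handler)
    try:
        yield
    finally:
        loop.set_exception_handler(old_handler)


loop = asyncio.new_event_loop()
seen = []  # messages that reach the previously installed handler
loop.set_exception_handler(lambda _loop, ctx: seen.append(ctx['message']))

with suppress_reports(loop, SSLCertVerificationError):
    loop.call_exception_handler(
        {'message': 'ssl', 'exception': SSLCertVerificationError('simulated')})
    loop.call_exception_handler(
        {'message': 'other', 'exception': RuntimeError('simulated')})

# Outside the block the original handler is back in place.
loop.call_exception_handler(
    {'message': 'after', 'exception': SSLCertVerificationError('simulated')})
loop.close()
print(seen)  # ['other', 'after']
```

Only the certificate error reported inside the with block is filtered out; other exceptions, and certificate errors reported after the block exits, still reach the original handler.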
For an unknown reason (bug?), aiohttp prints error output to the console even before any exception is thrown. You can avoid this by temporarily redirecting error output with contextlib.redirect_stderr:
import asyncio
import aiohttp
from ssl import SSLCertVerificationError
import os
from contextlib import redirect_stderr

async def fetch_url(url, client):
    try:
        # Open os.devnull in a with block so the file is closed afterwards.
        with open(os.devnull, 'w') as f, redirect_stderr(f):  # ignore any error output inside context
            async with client.get(url) as resp:
                print(resp.status)
                print(await resp.read())
    except SSLCertVerificationError as e:
        print('Error handled')

# ...
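Redirecting stderr hides everything written there while the block is active, including unrelated errors from other tasks. A narrower alternative, sketched here under the assumption that the report goes through the loop's default exception handler (which logs to the asyncio logger with the exception attached), is a logging filter that drops only records carrying an SSLCertVerificationError. DropSSLCertErrors is a hypothetical name.

```python
import logging
from ssl import SSLCertVerificationError


class DropSSLCertErrors(logging.Filter):
    """Hypothetical filter: reject log records caused by certificate errors."""

    def filter(self, record):
        exc_info = record.exc_info
        if exc_info and isinstance(exc_info[1], SSLCertVerificationError):
            return False  # drop this record
        return True  # let everything else through


# Attach the filter to the logger that asyncio's default handler writes to.
asyncio_logger = logging.getLogger('asyncio')
asyncio_logger.addFilter(DropSSLCertErrors())
```

The filter stays active until it is removed with asyncio_logger.removeFilter, so it suits scripts where certificate failures are expected throughout.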
P.S. I think you can use a more general exception type to catch client errors, for example:
except aiohttp.ClientConnectionError as e:
    print('Error handled')