Intermittent SSL Error on Google Cloud SQL with Dj

Posted 2019-08-09 06:26

Question:

I'm trialling a switch from hosting my own MySQL server to Google Cloud SQL for my Django application. The application has been running with the same configuration (aside from the MySQL SSL certificates) against a Galera Cluster (also on GCE) for several weeks without the problem described below.

I've set up my Django application on a Google Compute Engine VM and configured it to point at my Cloud SQL instance. I've set up a load balancer (albeit with only 1 VM in the backend for the purposes of this experiment) with an HTTPS health check pointing at the login page of my application (checks run every 5 seconds).

I've exported a copy of my database and loaded it into Google Cloud SQL, then created a user that can connect only from a single IP address and only over SSL.
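For reference, an SSL-enabled MySQL connection in Django settings usually looks roughly like the sketch below. The actual settings file is not shown in this post, so the database name, user, password, host address and certificate paths are all placeholders, not the real values.

```python
# settings.py fragment (sketch only).
# Every name, address and path below is a placeholder, not the real config.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "myapp",                # placeholder database name
        "USER": "myapp_user",           # placeholder user, SSL-only on the server side
        "PASSWORD": "change-me",        # placeholder
        "HOST": "203.0.113.10",         # placeholder (documentation-range address)
        "PORT": "3306",
        "OPTIONS": {
            # Passed through to MySQLdb.connect(); the "ssl" dict names the
            # server CA certificate plus the client certificate and key.
            "ssl": {
                "ca": "/path/to/server-ca.pem",
                "cert": "/path/to/client-cert.pem",
                "key": "/path/to/client-key.pem",
            },
        },
    }
}
```

If any of these files were missing or mismatched, I would expect every connection to fail rather than a small fraction of them.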

Everything appears to be working fine: I can log in to the application and use it normally. However, when I check the Apache error logs, I see intermittent Django failures:

[Tue Dec 01 14:33:14.015189 2015] [:error] [pid 1890:tid 139827841804032] ERROR Internal Server Error: /accounts/login/
[Tue Dec 01 14:33:14.016202 2015] [:error] [pid 1890:tid 139827841804032] Traceback (most recent call last):
[Tue Dec 01 14:33:14.016380 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/django/db/backends/base/base.py", line 130, in ensure_connection
[Tue Dec 01 14:33:14.016553 2015] [:error] [pid 1890:tid 139827841804032]     self.connect()
[Tue Dec 01 14:33:14.016697 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/django/db/backends/base/base.py", line 119, in connect
[Tue Dec 01 14:33:14.016884 2015] [:error] [pid 1890:tid 139827841804032]     self.connection = self.get_new_connection(conn_params)
[Tue Dec 01 14:33:14.017138 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/django/db/backends/mysql/base.py", line 276, in get_new_connection
[Tue Dec 01 14:33:14.017296 2015] [:error] [pid 1890:tid 139827841804032]     conn = Database.connect(**conn_params)
[Tue Dec 01 14:33:14.017445 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/opbeat/instrumentation/packages/base.py", line 131, in __call__
[Tue Dec 01 14:33:14.017604 2015] [:error] [pid 1890:tid 139827841804032]     args, kwargs)
[Tue Dec 01 14:33:14.017739 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/opbeat/instrumentation/packages/base.py", line 222, in call_if_sampling
[Tue Dec 01 14:33:14.017918 2015] [:error] [pid 1890:tid 139827841804032]     return self.call(module, method, wrapped, instance, args, kwargs)
[Tue Dec 01 14:33:14.018156 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/opbeat/instrumentation/packages/mysql.py", line 26, in call
[Tue Dec 01 14:33:14.018305 2015] [:error] [pid 1890:tid 139827841804032]     return MySQLConnectionProxy(wrapped(*args, **kwargs))
[Tue Dec 01 14:33:14.018442 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/newrelic-2.54.0.41/newrelic/hooks/database_dbapi2.py", line 102, in __call__
[Tue Dec 01 14:33:14.018593 2015] [:error] [pid 1890:tid 139827841804032]     *args, **kwargs), self._nr_dbapi2_module, (args, kwargs))
[Tue Dec 01 14:33:14.018731 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/MySQLdb/__init__.py", line 81, in Connect
[Tue Dec 01 14:33:14.018885 2015] [:error] [pid 1890:tid 139827841804032]     return Connection(*args, **kwargs)
[Tue Dec 01 14:33:14.019087 2015] [:error] [pid 1890:tid 139827841804032]   File "/home/myuser/.virtualenvs/myapp/lib/python3.4/site-packages/MySQLdb/connections.py", line 204, in __init__
[Tue Dec 01 14:33:14.019159 2015] [:error] [pid 1890:tid 139827841804032]     super(Connection, self).__init__(*args, **kwargs2)
[Tue Dec 01 14:33:14.019312 2015] [:error] [pid 1890:tid 139827841804032] _mysql_exceptions.OperationalError: (2026, 'SSL connection error: unknown error number')
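To see whether the failures cluster on particular Apache processes, the error log can be tallied with a short script. This is a rough sketch that assumes the log line format shown above; the function name `count_ssl_errors` and the two-line sample are illustrative only.

```python
import re
from collections import Counter

# Matches an OperationalError line in the Apache error log format shown above.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[:error\] \[pid (?P<pid>\d+):tid (?P<tid>\d+)\] "
    r".*OperationalError: \((?P<code>\d+), '(?P<msg>[^']*)'\)"
)


def count_ssl_errors(lines):
    """Count MySQL client error 2026 occurrences per Apache process ID."""
    per_pid = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("code") == "2026":
            per_pid[int(match.group("pid"))] += 1
    return per_pid


# Two lines copied from the log above: one error line, one ordinary traceback line.
sample = [
    "[Tue Dec 01 14:33:14.019312 2015] [:error] [pid 1890:tid 139827841804032] "
    "_mysql_exceptions.OperationalError: (2026, 'SSL connection error: unknown error number')",
    "[Tue Dec 01 14:33:14.016553 2015] [:error] [pid 1890:tid 139827841804032]     self.connect()",
]

print(dict(count_ssl_errors(sample)))
```

Run against the full log (for example by passing an open file object), this gives a per-process count that can be compared with the health check interval.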

I am also using Celery to run background tasks for the same application. These tasks all manipulate records through the Django models. The Celery logs show that of 7951 tasks in the last hour, 57 failed with the same 2026/SSL connection error: unknown error. I realise this is an error rate of less than 1%, but I won't settle for it when I don't know why it is happening!
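As a quick check of that figure, using only the two counts quoted above (the function name is just for illustration):

```python
def failure_rate(failed, total):
    """Return the fraction of tasks that failed."""
    return failed / total


rate = failure_rate(57, 7951)
print(f"{rate:.2%}")  # prints 0.72%
```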

Because the error is intermittent, I would have thought it is not a problem with the certificates or configuration (otherwise it would fail all the time?). Any thoughts on what the problem might be? I don't want to commit to the switch knowing that users of the system may keep hitting these errors.