From time to time I execute raw queries with connection.cursor() instead of using the ORM (since it is definitely not a silver bullet). I've noticed that in several places I don't call an explicit cursor.close() after I'm done with the database. So far this hasn't resulted in any errors or performance issues, but I'm wondering what kind of problems I could run into by not closing the cursor explicitly. What can go wrong?
As far as I understand, connection and cursor in Django follow the Python Database API Specification v2.0 (PEP 249). According to it, a cursor is automatically closed whenever its __del__() method is called. So the question could also be: is there a use case when __del__() is not called?
FYI, I'm using Python 2.7 and Django 1.6.5.
On __del__ / .close():

1. __del__ is not guaranteed to be called (see the sketch after this list)
2. some databases don't call cursor.close() in their __del__ (bad practice, but true)
3. some databases don't actually create connections in the connection function, but in the cursor function instead (e.g., for 2 and 3: pyhive's presto [maybe they've since patched it])
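To make point 1 concrete, here's a minimal sketch (not Django-specific) of one way __del__ can silently never run. On Python 2, objects that define __del__ and end up in a reference cycle are never collected; they just sit in gc.garbage. A cursor cleaned up only in __del__ would leak the same way:

import gc

class Resource(object):
    """Stand-in for a cursor whose cleanup lives only in __del__."""
    def __del__(self):
        print "releasing resource"

r = Resource()
r.self_ref = r      # create a reference cycle: r -> r
del r               # last external reference gone, but __del__ doesn't fire

gc.collect()
print gc.garbage    # on Python 2: [<__main__.Resource ...>], never finalized

(On Python 3.4+, PEP 442 made such cycles collectable, so this particular trap is gone.)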
On server connections in general
Most servers have an idle-timeout configuration property (call it T): if a connection is idle for more than T seconds, the server drops it. Most servers also have a property for the size of the worker thread pool (W): if you already have W connections to the server, a new connection attempt will likely hang. Now imagine for a second that you can't explicitly close connections. In that situation you have to set T small enough that the worker pool is never fully used, and what "small enough" means is a function of how many concurrent connections you have.
However, if you do close your cursors/connections (even where the two aren't equivalent, per point 3 above, they behave similarly here), then you don't have to manage those server configuration properties, and your thread pool simply needs to be large enough for all concurrent connections (with the occasional wait for a free resource). I've seen servers (e.g. Titan on Cassandra) unable to recover from running out of workers in the thread pool, so the whole server stayed down until restarted.
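A toy model of that failure mode, in pure Python with no real server; the semaphore stands in for the server's worker pool and W for its size:

import threading

W = 4                                  # pretend the server has 4 workers
pool = threading.BoundedSemaphore(W)

def open_connection():
    pool.acquire()                     # blocks once all W workers are taken
    print "connection granted"

def close_connection():
    pool.release()                     # what an explicit .close() buys you

for _ in range(W):
    open_connection()                  # leak: opened but never closed

# The next call would block forever -- no worker ever comes back:
# open_connection()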
TL;DR
If you're using well-developed libraries, like the ones dano mentions, you won't have an issue. If you're using less pristine libraries, then depending on your server configuration and access rates, failing to call .close() can leave you blocked waiting for the server to free up a worker thread.
Django's cursor class is just a wrapper around the underlying database driver's cursor, so the effect of leaving a cursor open is basically whatever the underlying driver does.
According to the FAQ for psycopg2 (the driver Django uses for PostgreSQL), its cursors are lightweight, but they cache the data returned by queries made through the cursor object, which can waste memory:
Cursors are lightweight objects and creating lots of them should not
pose any kind of problem. But note that cursors used to fetch result
sets will cache the data and use memory in proportion to the result
set size. Our suggestion is to almost always create a new cursor and
dispose old ones as soon as the data is not required anymore (call
close() on them.) The only exception are tight loops where one usually
use the same cursor for a whole bunch of INSERTs or UPDATEs.
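In raw psycopg2 terms (outside Django), that advice looks roughly like this; the connection parameters and big_table are hypothetical:

import psycopg2

conn = psycopg2.connect(dbname="mydb", user="myuser")  # hypothetical DSN

# A client-side cursor buffers the whole result set in memory,
# so dispose of it as soon as the data is no longer needed.
cur = conn.cursor()
cur.execute("SELECT id FROM big_table")                # hypothetical table
ids = [row[0] for row in cur.fetchall()]
cur.close()                                            # frees the cached result set

# The exception: reuse one cursor for a tight loop of INSERTs/UPDATEs.
cur = conn.cursor()
for i in ids:
    cur.execute("UPDATE big_table SET seen = TRUE WHERE id = %s", (i,))
cur.close()
conn.commit()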
Django uses MySQLdb as the backend for MySQL, which has several different cursor types, including some that actually store their result sets on the server side. The MySQLdb documentation for Cursor.close makes a point of noting that it's very important to close server-side cursors when you're done with them:
If you are using server-side cursors, it is very important to close
the cursor when you are done with it and before creating a new one.
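For illustration, using a server-side cursor with MySQLdb directly looks like this; the credentials and table are hypothetical:

import MySQLdb
import MySQLdb.cursors

conn = MySQLdb.connect(db="mydb", user="myuser",        # hypothetical
                       cursorclass=MySQLdb.cursors.SSCursor)

cur = conn.cursor()       # server-side: rows stay on the MySQL server
cur.execute("SELECT * FROM big_table")                  # hypothetical table
for row in cur:
    print row             # stream rows one at a time
cur.close()               # essential: frees the server-side result set
conn.close()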
However, this isn't relevant for Django, because it uses the default Cursor class provided by MySQLdb, which stores results on the client side. Leaving a used cursor open just risks wasting the memory used by the stored result set, same as with psycopg2. The close method on the cursor just drops the internal reference to the connection and exhausts the stored result set:
def close(self):
    """Close the cursor. No further queries will be possible."""
    if not self.connection: return
    while self.nextset(): pass
    self.connection = None
As best I can tell from looking at their source, the remaining backends Django uses (cx_oracle, sqlite3/pysqlite2) follow the same pattern: free memory by deleting/resetting stored results and object references. The sqlite3 docs don't even mention that the Cursor class has a close method, and it's only used sporadically in the included example code.
You are right that a cursor will be closed when __del__() is called on the cursor object, so the need to close explicitly is only an issue if you're keeping a long-lived reference to the cursor, e.g. a self.cursor attribute stored on an instance of a class, like the sketch below.
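A sketch of that long-lived-reference pattern; the class and SQL are hypothetical:

from django.db import connection

class ReportRunner(object):
    """Hypothetical: holds a cursor for as long as the instance lives."""
    def __init__(self):
        # The cursor (and any cached result set) now survives until this
        # object is garbage-collected -- __del__ won't fire before then.
        self.cursor = connection.cursor()

    def run(self, sql):
        self.cursor.execute(sql)
        return self.cursor.fetchall()

    def close(self):
        # Explicit close releases the result set without waiting for GC.
        self.cursor.close()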
While the operating system can normally be relied upon to release resources when a process exits, it's good hygiene to close things like database connections so that resources are released as soon as they're no longer required. The really important thing from a database point of view, though, is to make sure any changes are committed.
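For example, with Django's transaction API (the table name is hypothetical):

from django.db import connection, transaction

with transaction.atomic():               # guarantees commit on success
    cursor = connection.cursor()
    try:
        cursor.execute("UPDATE app_report SET done = TRUE")  # hypothetical
    finally:
        cursor.close()                   # release the cursor either way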
Explicitly calling cursor.close() makes sense for two reasons:

1. __del__ is not guaranteed to be called, and it has other well-documented pitfalls
2. explicit is better than implicit (the Zen of Python)
I'm a bit late to this question. Maybe close-on-scope-exit is what you want:
from contextlib import closing
from django.db import connection
with closing(connection.cursor()) as cursor:
    cursor.execute(...)
    cursor.execute(...)
    cursor.execute(...)
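(For what it's worth, from Django 1.7 on the cursor returned by connection.cursor() can itself be used as a context manager, so with connection.cursor() as cursor: works without contextlib.closing; on 1.6, closing() is the way to go.)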