I'm completely new to Python's sqlite3 module (and SQL in general for that matter), and this just completely stumps me. The abundant lack of descriptions of cursor
objects (rather, their necessity) also seems odd.
This snippet of code is the preferred way of doing things:
import sqlite3
conn = sqlite3.connect("db.sqlite")
c = conn.cursor()
c.execute('''insert into table "users" values ("Jack Bauer", "555-555-5555")''')
conn.commit()
c.close()
This one isn't, even though it works just as well and without the (seemingly pointless) cursor
:
import sqlite3
conn = sqlite3.connect("db.sqlite")
conn.execute('''insert into table "users" values ("Jack Bauer", "555-555-5555")''')
conn.commit()
Can anyone tell me why I need a cursor
?
It just seems like pointless overhead. For every method in my script that accesses a database, I'm supposed to create and destroy a cursor
?
Why not just use the connection
object?
Just a misapplied abstraction it seems to me. A db cursor is an abstraction, meant for data set traversal.
From Wikipedia article on subject:
In computer science and technology, a database cursor is a control
structure that enables traversal over the records in a database.
Cursors facilitate subsequent processing in conjunction with the
traversal, such as retrieval, addition and removal of database
records. The database cursor characteristic of traversal makes cursors
akin to the programming language concept of iterator.
And:
Cursors can not only be used to fetch data from the DBMS into an
application but also to identify a row in a table to be updated or
deleted. The SQL:2003 standard defines positioned update and
positioned delete SQL statements for that purpose. Such statements do
not use a regular WHERE clause with predicates. Instead, a cursor
identifies the row. The cursor must be opened and already positioned
on a row by means of FETCH statement.
If you check the docs on Python sqlite module, you can see that a python module cursor
is needed even for a CREATE TABLE
statement, so it's used for cases where a mere connection
object should suffice - as correctly pointed out by the OP. Such abstraction is different from what people understand a db cursor to be and hence, the confusion/frustration on the part of users. Regardless of efficiency, it's just a conceptual overhead. Would be nice if it was pointed out in the docs that the python module cursor
is bit different than what a cursor is in SQL and databases.
You need a cursor object to fetch results. Your example works because it's an INSERT
and thus you aren't trying to get any rows back from it, but if you look at the sqlite3
docs, you'll notice that there aren't any .fetchXXXX
methods on connection objects, so if you tried to do a SELECT
without a cursor, you'd have no way to get the resulting data.
Cursor objects allow you to keep track of which result set is which, since it's possible to run multiple queries before you're done fetching the results of the first.
According to the official docs connection.execute()
is a nonstandard shortcut that creates an intermediate cursor object:
Connection.execute
This is a nonstandard shortcut that creates a cursor object by calling the cursor() method, calls the cursor’s execute() method with the parameters given, and returns the cursor.
12.6.8. Using sqlite3 efficiently
12.6.8.1. Using shortcut methods
Using the nonstandard execute()
, executemany()
and executescript()
methods of the Connection object, your code can be written more concisely because you don’t have to create the (often superfluous) Cursor objects explicitly. Instead, the Cursor objects are created implicitly and these shortcut methods return the cursor objects. This way, you can execute a SELECT statement and iterate over it directly using only a single call on the Connection object.
(sqlite3 documentation; emphasis mine.)
Why not just use the connection object?
Because those methods of the connection object are nonstandard, i.e. they are not part of Python Database API Specification v2.0 (PEP 249).
As long as you use the standard methods of the Cursor object, you can be sure that if you switch to another database implementation that follows the above specification, your code will be fully portable. Perhaps you will only need to change the import
line.
But if you use the connection.execute
there is a chance that switching won't be that straightforward. That's the main reasons you might want to use cursor.execute
instead.
However if you are certain that you're not going to switch, I'd say it's completely OK to take the connection.execute
shortcut and be "efficient".
It gives us the ability to have multiple separate working environments through the same connection to the database.