cursor.fetchall() vs list(cursor) in Python

2019-01-07 10:56发布

问题:

Both methods return a list of the returned items of the query, did I miss something here?
Or they have identical usages indeed?
Any differences performance-wise?

回答1:

If you are using the default cursor, a MySQLdb.cursors.Cursor, the entire result set will be stored on the client side (i.e. in a Python list) by the time the cursor.execute() is completed.

Therefore, even if you use

for row in cursor:

you will not be getting any reduction in memory footprint. The entire result set has already been stored in a list (See self._rows in MySQLdb/cursors.py).

However, if you use an SSCursor or SSDictCursor:

import MySQLdb
import MySQLdb.cursors as cursors

conn = MySQLdb.connect(..., cursorclass=cursors.SSCursor)

then the result set is stored in the server, mysqld. Now you can write

cursor = conn.cursor()
cursor.execute('SELECT * FROM HUGETABLE')
for row in cursor:
    print(row)

and the rows will be fetched one-by-one from the server, thus not requiring Python to build a huge list of tuples first, and thus saving on memory.

Otherwise, as others have already stated, cursor.fetchall() and list(cursor) are essentially the same.



回答2:

cursor.fetchall() and list(cursor) are essentially the same. The different option is to not retrieve a list, and instead just loop over the bare cursor object:

for result in cursor:

This can be more efficient if the result set is large, as it doesn't have to fetch the entire result set and keep it all in memory; it can just incrementally get each item (or batch them in smaller batches).



回答3:

A (MySQLdb/PyMySQL-specific) difference worth noting when using a DictCursor is that list(cursor) will always give you a list, while cursor.fetchall() gives you a list unless the result set is empty, in which case it gives you an empty tuple. This was the case in MySQLdb and remains the case in the newer PyMySQL, where it will not be fixed for backwards-compatibility reasons. While this isn't a violation of Python Database API Specification, it's still surprising and can easily lead to a type error caused by wrongly assuming that the result is a list, rather than just a sequence.

Given the above, I suggest always favouring list(cursor) over cursor.fetchall(), to avoid ever getting caught out by a mysterious type error in the edge case where your result set is empty.



回答4:

list(cursor) works because a cursor is an iterable; you can also use cursor in a loop:

for row in cursor:
    # ...

A good database adapter implementation will fetch rows in batches from the server, saving on the memory footprint required as it will not need to hold the full result set in memory. cursor.fetchall() has to return the full list instead.

There is little point in using list(cursor) over cursor.fetchall(); the end effect is then indeed the same, but you wasted an opportunity to stream results instead.