I've heard suggestions to use the following:
if qs.exists():
    ...

if qs.count():
    ...

try:
    qs[0]
except IndexError:
    ...
Copied from a comment below: "I'm looking for a statement like 'In MySQL and PostgreSQL count() is faster for short queries, exists() is faster for long queries, and use QuerySet[0] when it's likely that you're going to need the first element and you want to check that it exists. However, when count() is faster it's only marginally faster, so it's advisable to always use exists() when choosing between the two.'"
I would imagine the first method is the most efficient (exists() could easily be implemented in terms of the second, so the two may be nearly identical). The last one actually fetches a whole object from the database, so it is almost certainly the most expensive.
But, like all of these questions, the only way to know for your particular database, schema and dataset is to test it yourself.
It looks like qs.count() and qs.exists() are effectively equivalent, so I haven't found a reason to prefer exists() over count(): count() is no slower, and it can be used to check both existence and length. It's possible that exists() and count() compile to the same query in MySQL.
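As that answer notes, count() alone can serve both purposes; a small sketch (Entry is a hypothetical model used only for illustration):

from myapp.models import Entry  # hypothetical model and app

n = Entry.objects.filter(published=True).count()
if n:                                   # existence check...
    print(f"{n} published entries")     # ...and the length, from the same query
else:
    print("no published entries")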
Only use qs[0] if you actually need the object. It's significantly slower if you're just testing for existence.

On Amazon SimpleDB, 400,000 rows:

qs: 325.00 usec/pass
qs.exists(): 144.46 usec/pass
qs.count(): 144.33 usec/pass
qs[0]: 324.98 usec/pass

On MySQL, 57 rows:

qs: 1.07 usec/pass
qs.exists(): 1.21 usec/pass
qs.count(): 1.16 usec/pass
qs[0]: 1.27 usec/pass

I used a random query for each pass to reduce the risk of db-level caching. Test code:
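The original test code isn't reproduced above; a rough sketch of that kind of harness, assuming a hypothetical Entry model with an integer value field, could look like this:

import random
import timeit

from myapp.models import Entry  # hypothetical model and app

def one_pass():
    # A random filter value on each pass reduces the risk of db-level caching.
    qs = Entry.objects.filter(value__gte=random.randint(0, 1_000_000))
    return qs.exists()  # swap in qs.count(), bool(qs) or qs[0] to compare

print(timeit.timeit(one_pass, number=1000), "sec for 1000 passes")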
@Sam Odio's solution was a decent starting point, but there are a few flaws in the methodology, namely:
So instead of filtering something that might match, I decided to exclude something that definitely won't match, hopefully still avoiding the DB cache, but also ensuring the same number of rows.
I only tested against a local MySQL database, with the dataset:
Timing code:
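The timing code itself isn't shown here; a minimal sketch of the exclude-based approach described above (using Django's User model is an assumption about the dataset) might be:

import random
import string
import timeit

from django.contrib.auth.models import User  # assumption about the dataset

def nonexistent_username():
    # Random, so the SQL text differs on every pass (dodging the query cache),
    # but guaranteed not to match, so the row count stays constant.
    return ''.join(random.choice(string.ascii_letters) for _ in range(20))

for label, check in [
    ('exists()', lambda qs: qs.exists()),
    ('count()', lambda qs: qs.count()),
    ('[0]', lambda qs: qs[0]),
]:
    seconds = timeit.timeit(
        lambda: check(User.objects.exclude(username=nonexistent_username())),
        number=1000,
    )
    print(label, seconds)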
outputs:
So you can see that count() is roughly 9 times slower than exists() for this dataset. [0] is also fast, but it needs exception handling.

It depends on the use context.
According to the documentation:
So, I think that QuerySet.exists() is the recommended approach if you just want to check for an empty QuerySet. On the other hand, if you are going to use the results later, it's better to evaluate the QuerySet. I also think your third option is the most expensive, because it has to fetch a whole object from the database just to check whether anything exists.
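A small sketch of the two cases (Entry, notify_subscribers and render_entry are hypothetical):

# Only a yes/no answer is needed: exists() issues a cheap query.
if Entry.objects.filter(published=True).exists():
    notify_subscribers()

# The results are used afterwards anyway: evaluate the QuerySet once
# instead of paying for an extra exists()/count() query.
entries = list(Entry.objects.filter(published=True))
if entries:
    for entry in entries:
        render_entry(entry)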
query.exists() is the most efficient way. Especially on PostgreSQL, count() can be very expensive, sometimes more expensive than a normal select query.

exists() runs a query with no select_related, field selections or sorting, and it only fetches a single record. This is much faster than counting the entire query with its table joins and sorting. qs[0] would still include select_related, field selections and sorting, so it would be more expensive.

The Django source code is here (django/db/models/sql/query.py, Query.has_results):

https://github.com/django/django/blob/60e52a047e55bc4cd5a93a8bd4d07baed27e9a22/django/db/models/sql/query.py#L499
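One way to verify this is to look at the SQL Django actually sends. A sketch using django.db.connection (requires DEBUG = True so queries are recorded; Entry and its author relation are hypothetical):

from django.db import connection, reset_queries

from myapp.models import Entry  # hypothetical model and app

qs = Entry.objects.select_related('author').order_by('-created')

reset_queries()
qs.exists()   # minimal SELECT with no ordering, LIMIT 1
qs.count()    # the query wrapped in COUNT(*)
qs[0]         # keeps the joins and ORDER BY, plus LIMIT 1

for query in connection.queries:
    print(query['sql'])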
Another gotcha that got me the other day is using a QuerySet directly in an if statement. That evaluates the QuerySet and pulls the whole result set into memory!
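For example (Entry and do_something are hypothetical):

# Evaluates the QuerySet: every matching row is fetched and cached in memory.
if Entry.objects.filter(published=True):
    do_something()

# Only asks the database whether at least one matching row exists.
if Entry.objects.filter(published=True).exists():
    do_something()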
If the variable query_set may be None (an unset argument to your function), then use:

not:
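The original snippets aren't reproduced here; the point is presumably that calling exists() on None raises AttributeError, so a guard along these lines (latest_entry and its argument are hypothetical) would be needed:

def latest_entry(qs=None):
    # Check for None before calling QuerySet methods; None.exists() would
    # raise AttributeError.
    if qs is not None and qs.exists():
        return qs[0]
    return None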