Using Django ORM, can one do something like queryset.objects.annotate(Count('queryset_objects', gte=VALUE))
. Catch my drift?
Here's a quick example to use for illustrating a possible answer:
In a Django website, content creators submit articles, and regular users view (i.e. read) the said articles. Articles can either be published (i.e. available for all to read), or in draft mode. The models depicting these requirements are:
class Article(models.Model):
author = models.ForeignKey(User)
published = models.BooleanField(default=False)
class Readership(models.Model):
reader = models.ForeignKey(User)
which_article = models.ForeignKey(Article)
what_time = models.DateTimeField(auto_now_add=True)
My question is: How can I get all published articles, sorted by unique readership from the last 30 mins? I.e. I want to count how many distinct (unique) views each published article got in the last half an hour, and then produce a list of articles sorted by these distinct views.
I tried:
date = datetime.now()-timedelta(minutes=30)
articles = Article.objects.filter(published=True).extra(select = {
"views" : """
SELECT COUNT(*)
FROM myapp_readership
JOIN myapp_article on myapp_readership.which_article_id = myapp_article.id
WHERE myapp_readership.reader_id = myapp_user.id
AND myapp_readership.what_time > %s """ % date,
}).order_by("-views")
This sprang the error: syntax error at or near "01" (where "01" was the datetime object inside extra). It's not much to go on.
For django >= 2.0 you can use Conditional aggregation with a
filter
argument in the aggregate functions:For django >= 1.8
Use Conditional Aggregation:
Explanation: normal query through your articles will be annotated with
numviews
field. That field will be constructed as a CASE/WHEN expression, wrapped by Count, that will return 1 for readership matching criteria andNULL
for readership not matching criteria. Count will ignore nulls and count only values.You will get zeros on articles that haven't been viewed recently and you can use that
numviews
field for sorting and filtering.Query behind this for PostgreSQL will be:
If we want to track only unique queries, we can add distinction into
Count
, and make ourWhen
clause to return value, we want to distinct on.That will produce:
For django < 1.8 and PostgreSQL
You can just use
raw
for executing SQL statement created by newer versions of django. Apparently there is no simple and optimized method for querying that data without usingraw
(even withextra
there are some problems with injecting requiredJOIN
clause).