Aggregate difference between DateTime fields in Dj

2019-02-17 09:29发布

I have a table containing a series of entries which relate to time periods (specifically, time worked for a client):

task_time:
id     |    start_time    |    end_time       |    client (fk)
1        08/12/2011 14:48   08/12/2011 14:50     2

I am trying to aggregate all the time worked for a given client, from my Django app:

time_worked_aggregate = models.TaskTime.objects.\
                        filter(client = some_client_id).\
                        extra(select = {'elapsed': 'SUM(task_time.end_time - task_time.start_time)'}).\
                        values('elapsed')

if len(time_worked_aggregate) > 0:
    time_worked = time_worked_aggregate[0]['elapsed'].total_seconds()
else:
    time_worked = 0

This seems inelegant, but it does work. Or at least so I thought: it turns out that it works fine on a PostgreSQL database, but when I move over to SQLite, everything dies.

A bit of digging suggests that the reason for this is that DateTimes aren't first-class data in SQLite. The following raw SQLite query will do my job:

SELECT SUM(strftime('%s', end_time) - strftime('%s', start_time)) FROM task_time WHERE ...;

My question is as follows:

  • The Python sample above seems roundabout. Can we do this more elegantly?
  • More importantly at this stage, can we do it in a way that will work on both Postgres and SQLite? Ideally, I'd like not to be writing raw SQL queries and switching on the database backend that happens to be in place; in general, Django is extremely good at protecting us from this. Does Django have a reasonable abstraction for this operation? If not, what's a sensible way for me to do a conditional switch on the backend?

I should mention for context that the dataset is many thousands of entries; the following is not really practical:

sum([task_time.end_date - task_time.start_date for task_time in models.TaskTime.objects.filter(...)])

3条回答
Luminary・发光体
2楼-- · 2019-02-17 09:42

Almost the same solution as @andri proposed. In the final result you will get the same data. ExpressionWrapper - New in Django 1.8.

from datetime import timedelta
from django.db.models import ExpressionWrapper, F, fields
from app.models import MyModel

duration = ExpressionWrapper(F('closed_at') - F('opened_at'), output_field=fields.DurationField())
objects = MyModel.objects.closed().annotate(duration=duration).filter(duration__gt=timedelta(seconds=2))

for obj in objects:
    print obj.id, obj.duration, obj.duration.seconds

# sample output
# 807 0:00:57.114017 57
# 800 0:01:23.879478 83
# 804 3:40:06.797188 13206
# 801 0:02:06.786300 126
查看更多
姐就是有狂的资本
3楼-- · 2019-02-17 09:56

I think since Django 1.8 we can do better:

I would like just to draw the part with annotation, the further part with aggregation should be straightforward:

from django.db.models import F, Func
SomeModel.objects.annotate(
    duration = Func(F('end_date'), F('start_date'), function='age')
)

[more about postgres age function here: http://www.postgresql.org/docs/8.4/static/functions-datetime.html ]

each instance of SomeModel will be anotated with duration field containg time difference, which in python will be a datetime.timedelta() object [more about datetime timedelta here: https://docs.python.org/2/library/datetime.html#timedelta-objects ]

查看更多
不美不萌又怎样
4楼-- · 2019-02-17 09:58

Django currently only supports aggregates for Min, Max, Avg and Count, so using raw SQL is the only way to achieve what you want. When you use raw SQL, database-independence is out the window, so unfortunately, you're out of luck. You'll have to just detect the database and alter the SQL appropriately.

查看更多
登录 后发表回答