Django weighted query (annotated values)

2019-07-25 03:08发布

问题:

I am attempting to created a query and sort it based on a custom calculation of weights.

I require some help as the solution I have come to does indeed work but the performance is not where I'd like it to be

What I have is a Media object. It has related Comments, Likes and Orders.

What currently works but is a complete hacky mess is the following query:

    products = Media.objects \
        .select_related(
            'image',
            'currency',
            'user',
            'user__image',
        ) \
        .prefetch_related('category', 'tags') \
        .exclude(is_deleted=1) \
        .filter(Q(category__category__in=categories) | Q(tags__tag__title=query)) \
        .annotate(order_count = Count('orders', distinct=True)) \
        .annotate(comment_count = Count('comments', distinct=True)) \
        .annotate(like_count = Count('likes', distinct=True)) \
        .annotate(weight = Count(0)) \
        .distinct()

    for m in products.iterator():
        initial_weight  = int(m.order_count)*40 + int(m.comment_count)*4 + int(m.like_count)*4 + int(m.clicks)
        m.weight        = float(float(initial_weight) - float(m.views/50))

As you can see I am separately annotating all the parameters I'll be using and then doing a stupid iteration full of arithmetic operations for EVERY item in the queryset which is very sub optimal.

One thing I attempted doing was the following:

    products = Media.objects \
        .select_related(
            'image',
            'currency',
            'user',
            'user__image',
        ) \
        .prefetch_related('category', 'tags') \
        .exclude(is_deleted=1) \
        .filter(Q(category__category__in=categories) | Q(tags__tag__title=query)) \
        .annotate(weight = Count('orders', distinct=True) * 40 + Count('comments', distinct=True) * 4 + Count('likes', distinct=True) - F('views')/50 + F('clicks'))

But similar operations in the annotation were impossible (tried a few variations with and without Sum() - Django always complained that the annotated values were of different type.

By the way we're using django 1.8 for this project.

Is there a good single-query aproach to getting my desired sorting weights?

回答1:

First, you would need to make sure that division would yield floats (without rounding). You would need something like this (disgracefully stolen here):

ExpressionWrapper(
    (F('views') / Decimal(50.0), 
    output_field=FloatField()),
)

So, query would look like this:

products = Media.objects \
    .exclude(is_deleted=1) \
    .filter(Q(category__category__in=categories) | Q(tags__tag__title=query)) \
    .annotate(order_count = Count('orders', distinct=True)) \
    .annotate(comment_count = Count('comments', distinct=True)) \
    .annotate(like_count = Count('likes', distinct=True)) \
    .annotate(weight = Count(0)) \
    .annotate(
        initial_weight=ExpressionWrapper(
            F('order_count') * 40 + F('comment_count') * 4 + \
            F('like_count') * 4 + F('clicks'),
            output_field=FloatField()
        )
     ) \
    .annotate(
        views_divided=ExpressionWrapper((F('views') / Decimal(50.0), 
                                        output_field=FloatField()))
     ) \
    .annotate(weight=F('initial_weight') - F('views_divided')) \
    .distinct()

Looks ugly, but should work (I think).

On the side note - if you only need to calculate weight, you don't actually have to use prefetch_related and select_realted, django will take care of that stuff itself (however, that's just my speculation - if you actually use those foreign keys later in the code, then it's justified).