Annotating SUM aggregation function leading to 

2019-06-16 03:37发布

问题:

Doing my first real Django project, and need guidance.

Background: My project is a reddit clone. Users submit links+text. Visitors upvote or downvote. There's a social_ranking algo, runs every ~2 mins as a background script, reranks all the submissions according to net votes and freshness of content. Fairly vanilla stuff.

Problem: Ordering by votes isn't working correctly, because votes are being initialized as None instead of 0. This causes submissions with None votes to rank below submissions with negative votes. I've debugged this issue for days - no luck.

Specifics: I've over-ridden my model's model manager to annotate a Sum aggregation function to the query set, and then order the said query set by 'social rank' and votes.

Below is my models.py. I'm using Django 1.5, thus some stuff you see here may not correspond to 1.8 (e.g. get_query_set vs get_queryset):

class LinkVoteCountManager(models.Manager):
    def get_query_set(self):
        return super(LinkVoteCountManager, self).get_query_set().annotate(votes=Sum('vote__value')).order_by('-rank_score', '-votes') 

class Link(models.Model):
    description = models.TextField(_("Write something"))
    submitter = models.ForeignKey(User)
    submitted_on = models.DateTimeField(auto_now_add=True)
    rank_score = models.FloatField(default=0.0)
    url = models.URLField(_("Link"), max_length=250, blank=True)

    with_votes = LinkVoteCountManager() 
    objects = models.Manager() 

    def __unicode__(self): 
        return self.description

    def set_rank(self):
        # Based on reddit ranking algo at http://amix.dk/blog/post/19588
        epoch = datetime(1970, 1, 1).replace(tzinfo=None)
        netvotes = self.votes # 'NONE' votes are messing up netvotes amount.
        if netvotes == None:
            netvotes = 0
        order = log(max(abs(netvotes), 1), 10)
        sign = 1 if netvotes > 0 else -1 if netvotes < 0 else 0
        unaware_submission = self.submitted_on.replace(tzinfo=None)
        td = unaware_submission - epoch 
        epoch_submission = td.days * 86400 + td.seconds + (float(td.microseconds) / 1000000)
        secs = epoch_submission - 1432201843
        self.rank_score = round(sign * order + secs / 45000, 8)
        self.save()

class Vote(models.Model):
    voter = models.ForeignKey(User)
    link = models.ForeignKey(Link)
    value = models.IntegerField(null=True, blank=True, default=0)

    def __unicode__(self):
        return "%s gave %s to %s" % (self.voter.username, self.value, self.link.description)

If needed, the following are relevant sections from my views.py:

class LinkListView(ListView):
    model = Link
    queryset = Link.with_votes.all()
    paginate_by = 10

    def get_context_data(self, **kwargs):
        context = super(LinkListView, self).get_context_data(**kwargs)
        if self.request.user.is_authenticated():
            voted = Vote.objects.filter(voter=self.request.user)
            links_in_page = [link.id for link in context["object_list"]]
            voted = voted.filter(link_id__in=links_in_page)
            voted = voted.values_list('link_id', flat=True)
            context["voted"] = voted
        return context

class LinkCreateView(CreateView):
    model = Link
    form_class = LinkForm

    def form_valid(self, form):
        f = form.save(commit=False)
        f.rank_score=0
        f.with_votes = 0
        f.category = '1'
        f.save()
        return super(CreateView, self).form_valid(form)

Can anyone shed light on what I need to do to fix the "None" issue? Thanks in advance.

回答1:

Just hit the same wall, although I chose to ignore None entries by excluding them out of the results. Guess you don't want that.

BTW, this question has the same issue Annotating a Sum results in None rather than zero

As for the solution other than using a custom sql as pointed out in that question's answer, you can use Django 1.8 instead and go for the solution pointed out in the ticket open in Django's bug tracker for over 6 years(!) https://code.djangoproject.com/ticket/10929

Coalesce(Sum('field'), 0)

So your manager would be:

class LinkVoteCountManager(models.Manager):
    def get_query_set(self):
        return super(LinkVoteCountManager, self).get_query_set().annotate(
            votes=Coalesce(Sum('vote__value'), 0)
        ).order_by(
            '-rank_score', 
            '-votes'
        )

PS: I have not tested the code, since I'm not using Django 1.8 myself.



回答2:

You could also replace the line

netvotes = self.votes

to

netvotes = self.votes or 0

and you can now remove the if statement.

What this does as in many other languages is to return the non falsy value (None, 0, ""), or the last value, '0' in this particular case.