incremental count and calculation in django model

2019-07-25 17:12发布

Let's say my models are like this:

class Publisher(models.Model):      
    name = models.CharField(max_length=30)
    code = models.SmallIntegerField(unique=True)

class Book(models.Model):
    date = models.DateField(auto_now_add=True)
    publisher = models.ForeignKey(Publisher)
    hardback = models.BooleanField()
    large_print = models.BooleanField()

For a given date range, I want to be able to output a CSV which has total number of books per publisher, and percentage of each boolean field.

Eg:

Publisher-code Total %hardback %large_print:

123 120 32 10

etc

(a) So far I'm working on a view that generates a queryset with the total number of books per publisher

totalset = Publisher.objects.all()
d1 = dict(totalset.annotate(total_books=Count('publisher')).values_list('code','total_books'))

Then gets a dictionary-converted queryset of each boolean field e.g.

d2 = dict(totalset.filter(book__hardback=True).annotate(hardc=Count('book__hardback')).values_list('code','hardc'))

Then get a new dictionary that calculates the percentages based on the intersection of the two sets

d3 = {k: round(float(d2[k])*100/d1[k]) for k in d1.viewkeys() & d2.viewkeys()}

I'm new to all this, so I feel that this is incredibly convoluted. Is there a more straightforward way ??!

(b) If it's possible to do this in the database (e.g. with some sort of model property), is this more efficient than doing it in python as the database gets large ?

Thanks very much

2条回答
放荡不羁爱自由
2楼-- · 2019-07-25 17:35

I actually ended up using a model method for the Publisher model; if there's a better way, please let me know !

def get_percentage(self, d1, d2, choose):
    kwargs = {'book__date__range':[d1,d2], 'book__publisher':self}
    kwargs2 = {'book__date__range':[d1,d2], 'book__publisher':self, choose:True} 
    total_count = Publisher.objects.filter(**kwargs).count()
    if total_count == 0:
        #otherwise perc returns a ZeroDivisionError
        return total_count
    count = Publisher.objects.filter(**kwargs2).count()
    perc = round(float(count) * 100 / float(total_count))
    return perc
查看更多
Rolldiameter
3楼-- · 2019-07-25 17:49

For a pure SQL solution, I'd probably do a query like this:

publishers = Publisher.objects.all().extra(
    select = {
        'book_count': 'SELECT COUNT(*) FROM app_book \
                       WHERE app_book.publisher_id = app_publisher.id',
        'hardback_ratio': 'SELECT COUNT(*) * 100.0 / SUM(COUNT(*)) OVER() \
                           FROM app_book WHERE hardback = TRUE \
                           AND app_book.publisher_id = app_publisher.id',
        'largeprint_ratio': 'SELECT COUNT(*) * 100.0 / SUM(COUNT(*)) OVER() \
                             FROM app_book WHERE largeprint = TRUE \
                             AND app_book.publisher_id = app_publisher.id',
    }
)

Please read up on the extra() method of a Django QuerySet and count(), over() of SQL. This is too inefficient because the database is scanned 3 times but it's a start I suppose.

查看更多
登录 后发表回答