Let's say my models are like this:
class Publisher(models.Model):
name = models.CharField(max_length=30)
code = models.SmallIntegerField(unique=True)
class Book(models.Model):
date = models.DateField(auto_now_add=True)
publisher = models.ForeignKey(Publisher)
hardback = models.BooleanField()
large_print = models.BooleanField()
For a given date range, I want to be able to output a CSV which has total number of books per publisher, and percentage of each boolean field.
Eg:
Publisher-code Total %hardback %large_print:
123 120 32 10
etc
(a) So far I'm working on a view that generates a queryset with the total number of books per publisher
totalset = Publisher.objects.all()
d1 = dict(totalset.annotate(total_books=Count('publisher')).values_list('code','total_books'))
Then gets a dictionary-converted queryset of each boolean field e.g.
d2 = dict(totalset.filter(book__hardback=True).annotate(hardc=Count('book__hardback')).values_list('code','hardc'))
Then get a new dictionary that calculates the percentages based on the intersection of the two sets
d3 = {k: round(float(d2[k])*100/d1[k]) for k in d1.viewkeys() & d2.viewkeys()}
I'm new to all this, so I feel that this is incredibly convoluted. Is there a more straightforward way ??!
(b) If it's possible to do this in the database (e.g. with some sort of model property), is this more efficient than doing it in python as the database gets large ?
Thanks very much
I actually ended up using a model method for the Publisher model; if there's a better way, please let me know !
For a pure SQL solution, I'd probably do a query like this:
Please read up on the
extra()
method of a Django QuerySet andcount()
,over()
of SQL. This is too inefficient because the database is scanned 3 times but it's a start I suppose.