How to combine 2 or more querysets in a Django vie

2018-12-31 03:23发布

I am trying to build the search for a Django site I am building, and in the search I am searching in 3 different models. And to get pagination on the search result list I would like to use a generic object_list view to display the results. But to do that i have to merge 3 querysets into one.

How can i do that? I've tried this:

result_list = []            
page_list = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term))
article_list = Article.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term) | 
    Q(tags__icontains=cleaned_search_term))
post_list = Post.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term) | 
    Q(tags__icontains=cleaned_search_term))

for x in page_list:
    result_list.append(x)
for x in article_list:
    result_list.append(x)
for x in post_list:
    result_list.append(x)

return object_list(
    request, 
    queryset=result_list, 
    template_object_name='result',
    paginate_by=10, 
    extra_context={
        'search_term': search_term},
    template_name="search/result_list.html")

But this doesn't work I get an error when I try to use that list in the generic view. The list is missing the clone attribute.

Anybody know how I can merge the three lists, page_list, article_list and post_list?

11条回答
有味是清欢
2楼-- · 2018-12-31 03:57

You can use the QuerySetChain class below. When using it with Django's paginator, it should only hit the database with COUNT(*) queries for all querysets and SELECT() queries only for those querysets whose records are displayed on the current page.

Note that you need to specify template_name= if using a QuerySetChain with generic views, even if the chained querysets all use the same model.

from itertools import islice, chain

class QuerySetChain(object):
    """
    Chains multiple subquerysets (possibly of different models) and behaves as
    one queryset.  Supports minimal methods needed for use with
    django.core.paginator.
    """

    def __init__(self, *subquerysets):
        self.querysets = subquerysets

    def count(self):
        """
        Performs a .count() for all subquerysets and returns the number of
        records as an integer.
        """
        return sum(qs.count() for qs in self.querysets)

    def _clone(self):
        "Returns a clone of this queryset chain"
        return self.__class__(*self.querysets)

    def _all(self):
        "Iterates records in all subquerysets"
        return chain(*self.querysets)

    def __getitem__(self, ndx):
        """
        Retrieves an item or slice from the chained set of results from all
        subquerysets.
        """
        if type(ndx) is slice:
            return list(islice(self._all(), ndx.start, ndx.stop, ndx.step or 1))
        else:
            return islice(self._all(), ndx, ndx+1).next()

In your example, the usage would be:

pages = Page.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term))
articles = Article.objects.filter(Q(title__icontains=cleaned_search_term) |
                                  Q(body__icontains=cleaned_search_term) |
                                  Q(tags__icontains=cleaned_search_term))
posts = Post.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term) | 
                            Q(tags__icontains=cleaned_search_term))
matches = QuerySetChain(pages, articles, posts)

Then use matches with the paginator like you used result_list in your example.

The itertools module was introduced in Python 2.3, so it should be available in all Python versions Django runs on.

查看更多
栀子花@的思念
3楼-- · 2018-12-31 03:57

Looks like t_rybik has created a comprehensive solution at http://www.djangosnippets.org/snippets/1933/

查看更多
无色无味的生活
4楼-- · 2018-12-31 04:00

here's an idea... just pull down one full page of results from each of the three and then throw out the 20 least useful ones... this eliminates the large querysets and that way you only sacrifice a little performance instead of a lot

查看更多
旧人旧事旧时光
5楼-- · 2018-12-31 04:05

Concatenating the querysets into a list is the simplest approach. If the database will be hit for all querysets anyway (e.g. because the result needs to be sorted), this won't add further cost.

from itertools import chain
result_list = list(chain(page_list, article_list, post_list))

Using itertools.chain is faster than looping each list and appending elements one by one, since itertools is implemented in C. It also consumes less memory than converting each queryset into a list before concatenating.

Now it's possible to sort the resulting list e.g. by date (as requested in hasen j's comment to another answer). The sorted() function conveniently accepts a generator and returns a list:

result_list = sorted(
    chain(page_list, article_list, post_list),
    key=lambda instance: instance.date_created)

If you're using Python 2.4 or later, you can use attrgetter instead of a lambda. I remember reading about it being faster, but I didn't see a noticeable speed difference for a million item list.

from operator import attrgetter
result_list = sorted(
    chain(page_list, article_list, post_list),
    key=attrgetter('date_created'))
查看更多
ら面具成の殇う
6楼-- · 2018-12-31 04:05
DATE_FIELD_MAPPING = {
    Model1: 'date',
    Model2: 'pubdate',
}

def my_key_func(obj):
    return getattr(obj, DATE_FIELD_MAPPING[type(obj)])

And then sorted(chain(Model1.objects.all(), Model2.objects.all()), key=my_key_func)

Quoted from https://groups.google.com/forum/#!topic/django-users/6wUNuJa4jVw. See Alex Gaynor

查看更多
登录 后发表回答