Optimizing database queries in Django REST framework

Posted 2020-01-24 02:49

I have the following models:

class User(models.Model):
    name = models.CharField()
    email = models.EmailField()

class Friendship(models.Model):
    from_friend = models.ForeignKey(User)
    to_friend = models.ForeignKey(User)

And those models are used in the following view and serializer:

class GetAllUsers(generics.ListAPIView):
    authentication_classes = (SessionAuthentication, TokenAuthentication)
    permission_classes = (permissions.IsAuthenticated,)
    serializer_class = GetAllUsersSerializer
    model = User

    def get_queryset(self):
        return User.objects.all()

class GetAllUsersSerializer(serializers.ModelSerializer):

    is_friend_already = serializers.SerializerMethodField('get_is_friend_already')

    class Meta:
        model = User
        fields = ('id', 'name', 'email', 'is_friend_already',)

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

        if request.user != obj and Friendship.objects.filter(from_friend=request.user, to_friend=obj).exists():
            return True
        else:
            return False

So basically, for each user returned by the GetAllUsers view, I want to output whether that user is a friend of the requester (strictly I should check both from_friend and to_friend, but that does not matter for the question at hand).

What I see is that for N users in the database, there is 1 query to fetch all N users, and then N more queries (one per user) in the serializer's get_is_friend_already.

Is there a way to avoid this in the rest-framework way? Maybe something like passing the serializer a queryset that uses select_related and already includes the relevant Friendship rows?

4 answers
chillily
#2 · 2020-01-24 02:57

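You can use this metaclass: DRF optimize ModelViewSet MetaClass

# django.utils.six was removed in Django 3.0; on Python 3 you can pass
# metaclass=... in the class statement instead of using the decorator.
from django.utils import six
from rest_framework import viewsets

# OptimizeRelatedModelViewSetMetaclass comes from the snippet named above and
# must be importable here; MyModel and MyModelSerializer are placeholders.
@six.add_metaclass(OptimizeRelatedModelViewSetMetaclass)
class MyModelViewSet(viewsets.ModelViewSet):
    queryset = MyModel.objects.all()
    serializer_class = MyModelSerializer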
太酷不给撩
#3 · 2020-01-24 03:01

Django REST Framework cannot automatically optimize queries for you, in the same way that Django itself won't. There are places you can look at for tips, including the Django documentation. It has been suggested that Django REST Framework should do this automatically, though there are some challenges associated with that.

This question is very specific to your case, where you are using a custom SerializerMethodField that runs a query for each object that is returned. Because you are building a new queryset each time (through the Friendship.objects manager), it is very difficult to optimize the query.

You can improve the situation, though, by not creating a new queryset and instead getting the friend information from elsewhere. This requires a reverse relation on the Friendship model, most likely through the related_name parameter on the field, so that you can prefetch all of the Friendship objects. But this is only useful if you need the full objects, and not just a count of them.

This would result in a view and serializer similar to the following:

class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)

class GetAllUsers(generics.ListAPIView):
    ...

    def get_queryset(self):
        return User.objects.all().prefetch_related("friends")

class GetAllUsersSerializer(serializers.ModelSerializer):
    ...

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

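        # .all() reuses the prefetched rows, so no extra query is issued here.
        # obj.friends holds the Friendship rows that start at obj, so the
        # other side of each row is to_friend_id.
        friends = set(friend.to_friend_id for friend in obj.friends.all())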

        if request.user != obj and request.user.id in friends:
            return True
        else:
            return False
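
To see what the prefetch buys you, here is a minimal sketch that mimics both access patterns with the standard-library sqlite3 module instead of the Django ORM. The table names, the sample rows and the query() helper are assumptions made for illustration; the only point is how many statements each approach issues.

```python
import sqlite3
from collections import defaultdict

# Hypothetical tables standing in for the User and Friendship models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_friendship (
        id INTEGER PRIMARY KEY,
        from_friend_id INTEGER REFERENCES app_user(id),
        to_friend_id INTEGER REFERENCES app_user(id)
    );
    INSERT INTO app_user (id, name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO app_friendship (from_friend_id, to_friend_id) VALUES (2, 1), (3, 2);
""")

query_log = []


def query(sql, params=()):
    """Run one SELECT, remember it in query_log, and return all rows."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def naive(requester_id):
    """One query for the users plus one query per user (the N+1 pattern)."""
    flags = {}
    for (user_id,) in query("SELECT id FROM app_user ORDER BY id"):
        rows = query(
            "SELECT 1 FROM app_friendship WHERE from_friend_id = ? AND to_friend_id = ?",
            (user_id, requester_id),
        )
        flags[user_id] = user_id != requester_id and bool(rows)
    return flags


def prefetched(requester_id):
    """One query for the users, one for all their friendships, matched up in Python."""
    user_ids = [user_id for (user_id,) in query("SELECT id FROM app_user ORDER BY id")]
    placeholders = ", ".join("?" for _ in user_ids)
    friends_of = defaultdict(set)
    for from_id, to_id in query(
        "SELECT from_friend_id, to_friend_id FROM app_friendship "
        f"WHERE from_friend_id IN ({placeholders})",
        user_ids,
    ):
        friends_of[from_id].add(to_id)
    return {
        user_id: user_id != requester_id and requester_id in friends_of[user_id]
        for user_id in user_ids
    }
```

With the three sample users, naive(1) adds four entries to query_log while prefetched(1) adds two, and both return the same flags. The second number stays at two no matter how many users there are.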

If you just need a count of the objects (similar to using queryset.count() or queryset.exists()), you can annotate the rows in the queryset with the counts of reverse relationships. This is done in your get_queryset method by appending .annotate(friends_count=Count("friends")) (if the related_name is friends), which sets the friends_count attribute on each object to the number of friends.

This would result in a view and serializer similar to the following:

class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)

class GetAllUsers(generics.ListAPIView):
    ...

    def get_queryset(self):
        from django.db.models import Count

        return User.objects.all().annotate(friends_count=Count("friends"))

class GetAllUsersSerializer(serializers.ModelSerializer):
    ...

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

        if request.user != obj and obj.friends_count > 0:
            return True
        else:
            return False
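
One caveat: friends_count as written counts every Friendship row that starts at the listed user, not only the rows that involve the requester. Below is a sketch of narrowing the count, shown at the SQL level with sqlite3 rather than the ORM (table names and sample rows are assumptions). The join condition also matches on the requester's id, so the whole list still costs a single query. In the ORM, the filter argument of Count (Django 2.0 and later) is one way to express a similar condition.

```python
import sqlite3

# Hypothetical tables standing in for the User and Friendship models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_friendship (
        id INTEGER PRIMARY KEY,
        from_friend_id INTEGER REFERENCES app_user(id),
        to_friend_id INTEGER REFERENCES app_user(id)
    );
    INSERT INTO app_user (id, name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO app_friendship (from_friend_id, to_friend_id)
        VALUES (2, 1), (2, 3), (3, 2);
""")


def users_with_flag(requester_id):
    """Return (user_id, is_friend_already) pairs using a single query."""
    rows = conn.execute(
        """
        SELECT u.id, COUNT(f.id) AS friends_count
        FROM app_user AS u
        LEFT JOIN app_friendship AS f
               ON f.from_friend_id = u.id AND f.to_friend_id = ?
        GROUP BY u.id
        ORDER BY u.id
        """,
        (requester_id,),
    ).fetchall()
    # COUNT(f.id) ignores the NULLs produced by the LEFT JOIN, so users
    # without a matching friendship get 0.
    return [(user_id, user_id != requester_id and count > 0) for user_id, count in rows]
```

In the sample data user 3 has a friendship, but not with user 1, so an unfiltered count would flag user 3 for requester 1 while the filtered join does not.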

Both of these solutions will avoid N+1 queries, but the one you pick depends on what you are trying to achieve.

冷血范
#4 · 2020-01-24 03:07

You can split the work into two queries plus a merge step.
First, get only the list of users (without the is_friend_already field). This requires just one query.
Second, get the friends list of request.user.
Third, set is_friend_already on each result depending on whether that user is in request.user's friends list.

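from rest_framework import generics
from rest_framework.response import Response

class GetAllUsersSerializer(serializers.ModelSerializer):
    ...


class UserListView(generics.ListAPIView):
    queryset = User.objects.all()

    def get(self, request, *args, **kwargs):
        # Query 1: ids of the requester's friends. Assumes the "friends"
        # related_name on Friendship.from_friend from the answer above.
        friend_ids = set(request.user.friends.values_list('to_friend_id', flat=True))
        data = []
        # Query 2: all users; the membership test below runs in Python.
        for user in self.get_queryset():
            user_data = GetAllUsersSerializer(user).data
            user_data['is_friend_already'] = user.id in friend_ids
            data.append(user_data)
        return Response(status=200, data=data)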
爷的心禁止访问
#5 · 2020-01-24 03:15

The N+1 problem described here is the number one issue in Django REST Framework performance optimization, so, judging by various opinions, it calls for a more solid approach than calling prefetch_related() or select_related() directly in the view's get_queryset() method.

Based on the collected information, here's a robust solution that eliminates N+1 (using the OP's code as an example). It's based on decorators and is slightly less coupled, which helps in larger applications.

Serializer:

class GetAllUsersSerializer(serializers.ModelSerializer):
    friends = FriendSerializer(read_only=True, many=True)

    # ...

    @staticmethod
    def setup_eager_loading(queryset):
        queryset = queryset.prefetch_related("friends")

        return queryset

Here we use a static method to build the serializer-specific queryset.

Decorator:

def setup_eager_loading(get_queryset):
    def decorator(self):
        queryset = get_queryset(self)
        queryset = self.get_serializer_class().setup_eager_loading(queryset)
        return queryset

    return decorator

This function modifies the returned queryset so that related records for a model are fetched as defined in the serializer's setup_eager_loading method.

View:

class GetAllUsers(generics.ListAPIView):
    serializer_class = GetAllUsersSerializer

    @setup_eager_loading
    def get_queryset(self):
        return User.objects.all()

This pattern may look like overkill, but it's certainly more DRY and has an advantage over modifying the queryset directly inside views, as it allows more control over related entities and eliminates unnecessary nesting of related objects.
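
The wiring can be tried without Django at all. In this minimal sketch, FakeQuerySet, UserSerializer and UserView are hypothetical stand-ins that only record which lookups were requested. functools.wraps is an addition of mine so that the decorated get_queryset keeps its name and docstring.

```python
import functools


class FakeQuerySet:
    """Stand-in for a Django QuerySet that only records prefetch_related() calls."""

    def __init__(self, prefetched=()):
        self.prefetched = tuple(prefetched)

    def prefetch_related(self, *lookups):
        # Like the real method, return a new object instead of mutating in place.
        return FakeQuerySet(self.prefetched + lookups)


def setup_eager_loading(get_queryset):
    """Route the view's queryset through the serializer's setup_eager_loading()."""

    @functools.wraps(get_queryset)
    def wrapper(self):
        queryset = get_queryset(self)
        return self.get_serializer_class().setup_eager_loading(queryset)

    return wrapper


class UserSerializer:
    @staticmethod
    def setup_eager_loading(queryset):
        # The serializer, not the view, decides which relations to load.
        return queryset.prefetch_related("friends")


class UserView:
    def get_serializer_class(self):
        return UserSerializer

    @setup_eager_loading
    def get_queryset(self):
        return FakeQuerySet()


queryset = UserView().get_queryset()
```

After this runs, queryset.prefetched contains the "friends" lookup even though the view's own get_queryset never mentions it, which is the decoupling the pattern is after.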
