Django ORM: caching and manipulating ForeignKey ob

2020-05-27 11:47发布

问题:

Consider the following skeleton of a models.py for a space conquest game:

class Fleet(models.Model):
    game = models.ForeignKey(Game, related_name='planet_set')
    owner = models.ForeignKey(User, related_name='planet_set', null=True, blank=True)
    home = models.ForeignKey(Planet, related_name='departing_fleet_set')
    dest = models.ForeignKey(Planet, related_name='arriving_fleet_set')
    ships = models.IntegerField()

class Planet(models.Model):
    game = models.ForeignKey(Game, related_name='planet_set')
    owner = models.ForeignKey(User, related_name='planet_set', null=True, blank=True)
    name = models.CharField(max_length=250)
    ships = models.IntegerField()

I have many such data models for a project I'm working on, and I change the state of the game based on somewhat complicated interactions between various data objects. I want to avoid lots of unnecessary calls to the database, so once per turn, I do something like

  1. Query all the fleets, planets, and other objects from the database and cache them as python objects
  2. Process the game objects, resolving the state of the game
  3. Save them back in the database

This model seems to totally break down when using ForeignKey objects. For example, when a new fleet departs a planet, I have a line that looks something like this:

fleet.home.ships -= fleet.ships

After this line runs, I have other code that alters the number of ships at each of the planets, including the planet fleet.home. Unfortunately, the changes made in the above line are not reflected in the QuerySet of planets that I obtained earlier, so that when I save all the planets at the end of the turn, the changes to fleet.home's ships get overwritten.

Is there some better way of handling this situation? Or is this just how all ORMs are?

回答1:

Django's ORM does not implement an identity map (it's in the ticket tracker, but it isn't clear if or when it will be implemented; at least one core Django committer has expressed opposition to it). This means that if you arrive at the same database object through two different query paths, you are working with different Python objects in memory.

This means that your design (load everything into memory at once, modify a lot of things, then save it all back at the end) is unworkable using the Django ORM. First because it will often waste lots of memory loading in duplicate copies of the same object, and second because of "overwriting" issues like the one you're running into.

You either need to rework your design to avoid these issues (either be careful to work with only one QuerySet at a time, saving anything modified before you make another query; or if you load several queries, look up all relations manually, don't ever traverse ForeignKeys using the convenient attributes for them), or use an alternative Python ORM that implements identity map. SQLAlchemy is one option.

Note that this doesn't mean Django's ORM is "bad." It's optimized for the case of web applications, where these kinds of issues are rare (I've done web development with Django for years and never once had this problem on a real project). If your use case is different, you may want to choose a different ORM.



回答2:

This is perhaps what you are looking for:

https://web.archive.org/web/20121126091406/http://simonwillison.net/2009/May/7/mmalones/