What are the pros and cons of using GenericForeign

Posted 2020-05-24 06:24

Question:

Context

I am in the process of modeling my data using Django models. The main model is an Article. It holds the actual content.

Each Article must then be attached to a group of articles. A group may be a Blog, a Category, a Portfolio or a Story. Every Article must be attached to exactly one of them, that is, to a single blog, category, portfolio or story. These models have very different fields and features.

I thought of three ways to reach that goal (and a bonus one that really looks wrong).

Option #1: A generic foreign key

As in django.contrib.contenttypes.fields.GenericForeignKey. It would look like this:

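from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db.models import ForeignKey, Model, PositiveIntegerField

class Category(Model):
    pass  # some fields

class Blog(Model):
    pass  # some fields

class Article(Model):
    # group_type and group_id together identify the target row.
    # (On Django 2.0+, ForeignKey also requires an on_delete argument.)
    group_type = ForeignKey(ContentType)
    group_id = PositiveIntegerField()
    group = GenericForeignKey('group_type', 'group_id')
    # some fields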

On the database side, this means no foreign key constraint actually exists between an Article and its group; the relation is enforced only by Django.
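To make the lack of database-level enforcement concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table and column names are hypothetical stand-ins for what the models above might produce, and group_type is simplified to a plain string rather than a foreign key to the content types table.

```python
import sqlite3


def insert_dangling_generic_reference():
    """Return the article rows after inserting one whose group does not exist."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce any real FK constraints
    conn.executescript("""
        CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- Generic key: two plain columns. No FOREIGN KEY clause can be
        -- declared because the target table varies from row to row.
        CREATE TABLE article (
            id INTEGER PRIMARY KEY,
            group_type TEXT NOT NULL,
            group_id INTEGER NOT NULL
        );
    """)
    # No blog with id 999 exists, yet the database accepts the row.
    conn.execute(
        "INSERT INTO article (group_type, group_id) VALUES ('blog', 999)"
    )
    rows = conn.execute("SELECT group_type, group_id FROM article").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(insert_dangling_generic_reference())
```

The insert succeeds even with foreign key enforcement switched on, because the database has no constraint to check; any integrity guarantee has to come from application code.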

Option #2: Multitable inheritance

Make article groups all inherit from an ArticleGroup model. This would look like this:

class ArticleGroup(Model):
    group_type = ForeignKey(ContentType)

class Category(ArticleGroup):
    pass  # some fields

class Blog(ArticleGroup):
    pass  # some fields

class Article(Model):
    group = ForeignKey(ArticleGroup)
    # some fields

On the database side, this creates an additional table for ArticleGroup, then Category and Blog have an implicit foreign key to that table as their primary key.
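The resulting table layout can be sketched with Python's built-in sqlite3 module. This is an approximation under stated assumptions, not the exact schema Django generates: the table names and the title column are hypothetical, the group_type column is omitted for brevity, and articlegroup_ptr_id follows Django's default naming for the implicit parent link.

```python
import sqlite3


def build_inheritance_schema():
    """Create parent, child and article tables mimicking multi-table inheritance."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE articlegroup (id INTEGER PRIMARY KEY);
        -- Child table: its primary key doubles as a foreign key to the parent.
        CREATE TABLE blog (
            articlegroup_ptr_id INTEGER PRIMARY KEY REFERENCES articlegroup (id),
            title TEXT NOT NULL
        );
        CREATE TABLE article (
            id INTEGER PRIMARY KEY,
            group_id INTEGER NOT NULL REFERENCES articlegroup (id)
        );
    """)
    return conn


def blog_title_for_article(conn, article_id):
    """Follow article -> group -> blog with a single JOIN; None if not a blog."""
    row = conn.execute(
        "SELECT blog.title FROM article"
        " JOIN blog ON blog.articlegroup_ptr_id = article.group_id"
        " WHERE article.id = ?",
        (article_id,),
    ).fetchone()
    return row[0] if row else None
```

Because article.group_id is a real foreign key, the database rejects an article pointing at a group that does not exist. Finding out which child table holds the group's details still takes a join per candidate type, or a type column on the parent.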

Sidenote: I know there is a package that automates the bookkeeping of such constructions.

Option #3: manual OneToOneFields

On the database side, it is equivalent to option #2. But in the code, all relations are made explicit:

class ArticleGroup(Model):
    group_type = ForeignKey(ContentType)

class Category(Model):
    id = OneToOneField(ArticleGroup, primary_key=True)
    # some fields

class Blog(Model):
    id = OneToOneField(ArticleGroup, primary_key=True)
    # some fields

class Article(Model):
    group = ForeignKey(ArticleGroup)
    # some fields

I don't really see what the point of that would be, apart from making explicit what Django's inheritance magic implicitly does.

Bonus: multicolumn

It seems pretty dirty so I just add it as a bonus, but it would also be possible to define a nullable ForeignKey to each of Category, Blog, ... directly on the Article model.

So...

...I cannot really decide between those. What are the pros and cons of each approach? Are there some best practices? Did I miss a better approach?

If that matters, I'm using Django 1.8.

Answer 1:

It seems no one had advice to share on this one. I eventually chose the multicolumn option, despite having said it looked ugly. It all came down to three things:

  • Database-based enforceability.
  • The way Django ORM works with the different constructs.
  • My own needs (namely, collection queries on the group to get the item list, and individual queries on the items to get the group).

Option #1

  • Cannot be enforced at the database level.
  • Could be efficient on queries, because the way it is constructed avoids the usual generic foreign key pitfalls. Those arise when the items are generic, not the collections.
  • However, due to how the ORM handles GFK, it is impossible to use a custom manager, which I need because my articles are translated using django-hvad.

Option #2

  • Can be enforced at the database level.
  • Could be somewhat efficient, but runs into the limitations of the ORM, which is clearly not built around this use. I could work around them by using extra() or custom queries a lot, but at some point there is no reason to use an ORM anymore.

Option #3

  • Would actually be a bit better than #2, as making things explicit allows easier query optimisation while using the ORM.

Multicolumn

  • Turns out not to be so bad. It can be enforced at the database level (FK constraints plus a manual CHECK to ensure exactly one of the columns is non-null).
  • Easy and efficient. A single intuitive query does the job: select_related('category', 'blog', ...).
  • Though it does have the issue of being harder to extend (any new type will require altering the Article's table as well) and limiting the possible number of types, I'm unlikely to run into those.
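The database-level enforcement mentioned for the multicolumn option can be sketched as follows, using Python's built-in sqlite3 module rather than Django. The table and column names are hypothetical, only two group types are shown, and the CHECK expression relies on SQLite evaluating comparisons to 0 or 1; other databases would need their own spelling of the same rule.

```python
import sqlite3


def build_multicolumn_schema():
    """Create an article table with one nullable FK per group type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE category (id INTEGER PRIMARY KEY);
        CREATE TABLE blog (id INTEGER PRIMARY KEY);
        CREATE TABLE article (
            id INTEGER PRIMARY KEY,
            category_id INTEGER REFERENCES category (id),
            blog_id INTEGER REFERENCES blog (id),
            -- SQLite-specific: each comparison yields 0 or 1, so the sum
            -- counts the non-null columns, and exactly one is required.
            CHECK ((category_id IS NOT NULL) + (blog_id IS NOT NULL) = 1)
        );
        INSERT INTO category (id) VALUES (1);
        INSERT INTO blog (id) VALUES (1);
    """)
    return conn


def try_insert(conn, category_id, blog_id):
    """Return True if the database accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO article (category_id, blog_id) VALUES (?, ?)",
                (category_id, blog_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, an article attached to exactly one existing group is accepted, while an article with both columns set, with neither set, or pointing at a missing group is rejected by the database itself.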

Hope it helps anyone with the same dilemma. I am still interested in hearing other opinions.