ndb modelling one-to-many: merits of repeated KeyP

Posted 2019-03-11 08:24

My question is about modelling one-to-many relations in ndb. I understand that this can be done in (at least) two different ways: with a repeated property or with a 'foreign key'. I have created a small example below. Basically we have an Article which can have an arbitrary number of Tags. Let's assume that a Tag can be removed but cannot be changed after it has been added. Let's also assume that we don't worry about transactional safety.

My question is: what is the preferred way of modelling these relationships?

My considerations:

  • Approach (A) requires two writes for every tag that is added to an article (one for the Article and one for the Tag) whereas approach (B) only requires one write (just the Tag).
  • Approach (A) leverages ndb's caching mechanism when fetching all Tags for an Article, whereas approach (B) requires a query (plus some custom caching).

Are there some things that I'm missing here, any other considerations that should be taken into account?

Thanks very much for your help.

Example (A):

from google.appengine.ext import ndb

class Article(ndb.Model):
    title = ndb.StringProperty()
    # some more properties
    tags = ndb.KeyProperty(kind="Tag", repeated=True)

    def create_tag(self):
        # requires two writes
        tag = Tag(name="my_tag")
        tag.put()
        self.tags.append(tag.key)  # the repeated KeyProperty stores keys, not entities
        self.put()

    def get_tags(self):
        return ndb.get_multi(self.tags)

class Tag(ndb.Model):
    name = ndb.StringProperty()
    user = ndb.KeyProperty(kind="User")  # User that created the tag
    # some more properties

Example(B):

class Article(ndb.Model):
    title = ndb.StringProperty()
    # some more properties

    def create_tag(self):
        # requires one write
        tag = Tag(name="my_tag", article=self.key)
        tag.put()

    def get_tags(self):
        # obviously we could cache this query in memcache
        return Tag.gql("WHERE article = :1", self.key)

class Tag(ndb.Model):
    name = ndb.StringProperty()
    article = ndb.KeyProperty(kind="Article")
    user = ndb.KeyProperty(kind="User")  # User that created the tag
    # some more properties
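
To make the write and read counts of the two examples concrete, here is a rough sketch using a toy in-memory stand-in for the datastore. `FakeDatastore`, `add_tag_a` and `add_tag_b` are made-up names for illustration, not ndb APIs, and the counts ignore index writes and ndb's caching entirely.

```python
class FakeDatastore:
    """Toy in-memory stand-in for the datastore; it only counts operations."""

    def __init__(self):
        self.entities = {}   # (kind, id) -> entity dict
        self.writes = 0
        self.gets = 0        # lookups by key
        self.queries = 0     # filtered scans

    def put(self, key, entity):
        self.entities[key] = entity
        self.writes += 1

    def get_multi(self, keys):
        self.gets += len(keys)
        return [self.entities[k] for k in keys]

    def query(self, kind, **filters):
        self.queries += 1
        return [
            e for (k, _), e in self.entities.items()
            if k == kind and all(e.get(f) == v for f, v in filters.items())
        ]


def add_tag_a(db, article_key, tag_id, name):
    """Approach (A): store the Tag, then rewrite the Article's key list."""
    tag_key = ("Tag", tag_id)
    db.put(tag_key, {"name": name})
    article = db.entities[article_key]
    article["tags"].append(tag_key)
    db.put(article_key, article)


def add_tag_b(db, article_key, tag_id, name):
    """Approach (B): store only the Tag, which points back at the Article."""
    db.put(("Tag", tag_id), {"name": name, "article": article_key})


db_a, db_b = FakeDatastore(), FakeDatastore()
article_key = ("Article", 1)
# Seed the article directly so that only tag-related writes are counted.
db_a.entities[article_key] = {"title": "t", "tags": []}
db_b.entities[article_key] = {"title": "t"}

for i, name in enumerate(["python", "ndb", "gae"]):
    add_tag_a(db_a, article_key, i, name)
    add_tag_b(db_b, article_key, i, name)

tags_a = db_a.get_multi(db_a.entities[article_key]["tags"])  # lookups by key
tags_b = db_b.query("Tag", article=article_key)              # one query
```

With three tags, the toy store records 6 writes for (A) and 3 for (B), while reading the tags back takes key lookups only for (A) and a query for (B).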

3 answers
Summer. ? 凉城
#2 · 2019-03-11 08:43

Approach (A) should be preferred in most situations. While there are two writes required to add a tag, this is probably much less frequent than reading the tags. As long as you don't have a huge number of tags, they should all fit into the repeated Key property.

As you mentioned, fetching the tags by their keys is much faster than performing a query. Also, if you only need the tag's name and the user, you could create the tag with the User as the parent key and the Name as the tag's id:

User -> Name -> Tag

To create this tag, you would use:

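tag = Tag(id=name, parent=user, ...)  # assumes `user` is the User entity's key
article.tags.append(tag.key)  # the key is complete because id and parent are given
ndb.put_multi([tag, article])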

Then when you retrieve the tags,

for tag_key in article.tags:  # each item is a Key, not a Tag entity
    user = tag_key.parent()   # the User key
    name = tag_key.id()       # the tag name

Then, each key you stored in Article.tags would contain the User key and the Tag name! This would save you from reading in the Tag to get those values.
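
As a rough illustration of why no Tag read is needed, here is a pure-Python sketch. The `Key` class below is a minimal stand-in that only mimics the `kind()`, `id()` and `parent()` accessors of an ndb key, and the user id "alice" is made up.

```python
class Key:
    """Minimal stand-in for an ndb key: a kind, an id and an optional parent."""

    def __init__(self, kind, id, parent=None):
        self._kind, self._id, self._parent = kind, id, parent

    def kind(self):
        return self._kind

    def id(self):
        return self._id

    def parent(self):
        return self._parent


user_key = Key("User", "alice")                  # hypothetical user
tag_key = Key("Tag", "python", parent=user_key)  # User -> Name -> Tag

article_tags = [tag_key]  # what Article.tags would hold

# Both values come from the key itself; no Tag entity is fetched.
summary = [(k.parent().id(), k.id()) for k in article_tags]
```

Because the name is part of the key, it cannot be changed later without creating a new entity, which matches the assumption in the question that tags are never edited after creation.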

贼婆χ
#3 · 2019-03-11 08:55

Have you looked at Structured Properties (https://developers.google.com/appengine/docs/python/ndb/properties#structured)? The short discussion there about Contact and Address may simplify your problem. Also look at https://developers.google.com/appengine/docs/python/ndb/queries#filtering_structured_properties. Both discussions are very short.

Also, given that the Datastore does not support joins, option (A) looks better.
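
A structured property stores the sub-entity inside the containing entity instead of as an entity of its own. The plain-Python sketch below only mirrors that shape with dataclasses; it is not ndb code, and `articles_with_tag` and the sample data are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tag:
    name: str
    user: str  # hypothetical: id of the user who created the tag


@dataclass
class Article:
    title: str
    # Tags are embedded in the article rather than stored separately.
    tags: List[Tag] = field(default_factory=list)


def articles_with_tag(articles, name):
    """Return the articles that embed at least one tag with the given name."""
    return [a for a in articles if any(t.name == name for t in a.tags)]


articles = [
    Article("Intro to ndb", [Tag("ndb", "alice"), Tag("python", "bob")]),
    Article("Memcache tips", [Tag("caching", "alice")]),
]
matches = [a.title for a in articles_with_tag(articles, "ndb")]
```

In this shape, adding a tag is a single write of the Article and reading the tags needs no extra fetch, at the cost of the tags not having keys of their own.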

爷的心禁止访问
#4 · 2019-03-11 09:01

As stated before, there are no joins in Datastore, so the notion of a "foreign key" doesn't really apply. What you can do is use the Query class to query the datastore for the matching Tag.

For example, if you are using Endpoints, then:

class Tag(ndb.Model):
    user = ndb.UserProperty()

Then, during the request, do:

query = Tag.query().filter(Tag.user == endpoints.get_current_user())