In firebase, is modeling many-many relationships u

Posted 2020-07-18 07:41

Question:

Suppose I have a typical users & groups data model where a user can be in many groups and a group can have many users. It seems to me that the firebase docs recommend that I model my data by replicating user ids inside groups and group ids inside users like this:

{
  "usergroups": {
    "bob": {
      "groups": {
        "one": true,
        "two": true
      }
    },
    "fred": {
      "groups": {
        "one": true
      }
    }
  },
  "groupusers": {
    "one": {
      "users": {
        "bob": true,
        "fred": true
      }
    },
    "two": {
      "users": {
        "bob": true
      }
    }
  }
}

In order to maintain this structure, whenever my app updates one side of the relationship (e.g., adds a user to a group), it also needs to update the other side of the relationship (e.g., add the group to the user).
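To make the two-sided write concrete, here is a minimal sketch that builds both writes as one path-keyed map. The helper name `membershipUpdate` is hypothetical, and passing the resulting map to a single multi-location `update()` call on the database root is an assumption about the SDK version in use, not something this post relies on.

```typescript
// Hypothetical helper: describes both sides of one membership change
// as a single map from database path to value.
type UpdateMap = Record<string, true | null>;

function membershipUpdate(user: string, group: string, member: boolean): UpdateMap {
  // Writing null to a location removes it, so the same map shape
  // covers both adding and removing a membership.
  const value = member ? true : null;
  return {
    [`usergroups/${user}/groups/${group}`]: value,
    [`groupusers/${group}/users/${user}`]: value,
  };
}

// Both paths for "add fred to group two" end up in one object.
const addFredToTwo = membershipUpdate("fred", "two", true);
console.log(addFredToTwo);
```

Keeping both paths in one object at least means the client never constructs one side of the relationship without the other.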

I'm concerned that eventually someone's computer will crash in the middle of an update or something else will go wrong and the two sides of the relationship will get out of sync. Ideally I'd like to put the updates inside a transaction so that either both sides get updated or neither side does, but as far as I can tell I can't do that with the current transaction support in firebase.

Another approach would be to use the upcoming firebase triggers to update the other side of the relationship, but triggers are not available yet and it seems like a pretty heavyweight solution to post a message to an external server just to have that server keep redundant data up to date.

So I'm thinking about another approach where the many-many user-group memberships are stored as a separate endpoint:

{
  "memberships": {
    "id1": {
      "user": "bob",
      "group": "one"
    },
    "id2": {
      "user": "bob",
      "group": "two"
    },
    "id3": {
      "user": "fred",
      "group": "one"
    }
  }
}

I can add indexes on "user" and "group", and issue Firebase queries such as .orderByChild("user").equalTo(...) and .orderByChild("group").equalTo(...) to determine the groups for a particular user and the users for a particular group, respectively.
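As a rough illustration of what those two queries would return, here is an in-memory sketch over the memberships data above. The function `whereChildEquals` is a hypothetical local stand-in for the `orderByChild(...).equalTo(...)` pair; it does not call Firebase.

```typescript
interface Membership {
  user: string;
  group: string;
}

// Same data as the "memberships" example above.
const memberships: Record<string, Membership> = {
  id1: { user: "bob", group: "one" },
  id2: { user: "bob", group: "two" },
  id3: { user: "fred", group: "one" },
};

// Hypothetical in-memory stand-in for orderByChild(child).equalTo(value):
// keeps only the records whose given child equals the given value.
function whereChildEquals(
  data: Record<string, Membership>,
  child: keyof Membership,
  value: string
): Membership[] {
  return Object.keys(data)
    .map((id) => data[id])
    .filter((m) => m[child] === value);
}

function groupsForUser(user: string): string[] {
  return whereChildEquals(memberships, "user", user).map((m) => m.group);
}

function usersInGroup(group: string): string[] {
  return whereChildEquals(memberships, "group", group).map((m) => m.user);
}

console.log(groupsForUser("bob")); // ["one", "two"]
console.log(usersInGroup("one")); // ["bob", "fred"]
```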

What are the downsides to this approach? We no longer have to maintain redundant data, so why is this not the recommended approach? Is it significantly slower than the recommended replicate-the-data approach?

回答1:

In the design you propose you'd always need to access three locations to show a user and her groups:

  1. the users child to determine the properties of the user
  2. the memberships to determine what groups she's a member of
  3. the groups child to determine the properties of the group

In the denormalized example from the documentation, your code would only need to access #1 and #3, since the membership information is embedded into both users and groups.
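A minimal in-memory sketch of that two-location read follows. The `users` and `groups` shapes and the `name` property are assumptions made for illustration; only the embedded `groups` index follows the structure described above.

```typescript
// Hypothetical record shapes: each user embeds an index of group ids.
interface User {
  name: string;
  groups: Record<string, true>;
}
interface Group {
  name: string;
}

const users: Record<string, User> = {
  bob: { name: "Bob", groups: { one: true, two: true } },
  fred: { name: "Fred", groups: { one: true } },
};
const groups: Record<string, Group> = {
  one: { name: "Group one" },
  two: { name: "Group two" },
};

// Location #1: read the user, which already carries the membership index.
// Location #3: look up each referenced group by key.
// No separate memberships location is consulted.
function groupNamesForUser(userId: string): string[] {
  const user = users[userId];
  if (!user) return [];
  return Object.keys(user.groups)
    .filter((groupId) => groups[groupId] !== undefined)
    .map((groupId) => groups[groupId].name);
}

console.log(groupNamesForUser("bob")); // ["Group one", "Group two"]
```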

If you denormalize one step further, you'd end up storing all relevant group information for each user and all relevant user information for each group. With such a data structure, you'd only need to read a single location to show all information for a group or a user.

Redundancy is not necessarily a bad thing in a NoSQL database; it is often used precisely because it speeds up reads.

For the moment I would go with a secondary process that periodically scans the data and reconciles any irregular data it finds. Of course that also means that regular client code needs to be robust enough to handle such irregular data (e.g. a group that points to a user, where that user's record doesn't point to the group).
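A sketch of the check such a reconciliation pass might run is shown below. It assumes both indexes have already been loaded into memory and flattened to `id -> { id: true }` (the intermediate "groups" and "users" levels are dropped for brevity); the names `findMismatches` and `Mismatch` are hypothetical.

```typescript
type Index = Record<string, Record<string, true>>;

interface Mismatch {
  user: string;
  group: string;
  missingFrom: "usergroups" | "groupusers";
}

// Reports every membership that is recorded on one side only.
function findMismatches(userGroups: Index, groupUsers: Index): Mismatch[] {
  const out: Mismatch[] = [];
  for (const user of Object.keys(userGroups)) {
    for (const group of Object.keys(userGroups[user])) {
      if (!(groupUsers[group] && groupUsers[group][user])) {
        out.push({ user, group, missingFrom: "groupusers" });
      }
    }
  }
  for (const group of Object.keys(groupUsers)) {
    for (const user of Object.keys(groupUsers[group])) {
      if (!(userGroups[user] && userGroups[user][group])) {
        out.push({ user, group, missingFrom: "usergroups" });
      }
    }
  }
  return out;
}

// Example: bob lists group "two", but group "two" has no entry for bob.
const found = findMismatches(
  { bob: { one: true, two: true }, fred: { one: true } },
  { one: { bob: true, fred: true } }
);
console.log(found);
```

Whether the pass then repairs a mismatch by adding the missing side or by removing the orphaned one is a policy decision this sketch leaves open.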

Alternatively, you could set up some advanced .validate rules that ensure the two sides are always in sync. I've always found that this takes more time to implement, so I never bothered.

You might also want to read this answer: Firebase data structure and url



Tags: firebase