How to denormalize/normalize data structure for fi

Posted 2019-01-19 14:50

Question:

I am trying to wrap my head around how to structure my data for the Firebase Realtime Database. I read the docs and some other questions on SO and found the following advice:

  • data should be as flat as possible
  • make writes expensive to keep reads cheap
  • avoid nesting data
  • duplication of data may be okay

Keeping this in mind, let me describe my specific use case. First, there is a user with the following attributes:

  • Firstname
  • Lastname
  • Profile picture (big)
  • Profile picture (small)

A user may create a story, which consists of the following attributes:

  • User
  • Text
  • Timestamp

The visual representation of a story may look like this:

My question is, how would you associate user information (firstname, lastname, small profile picture) with a story?

What I thought about:

  1. Put a user_id in the story that holds the foreign key of the specific user. To load the story we would have to make two requests to the database, one to get the story and one for the user.

    { user_id : 'XYZ', text: 'foobar', timestamp: ... }

  2. Put firstname, lastname and the small profile picture in the story. Only one request would be necessary to display the story. But we would have to update every story of that user when, e.g., the profile picture changes.

    { user_id : 'XYZ', firstname: 'sandra', lastname: 'adams', smallProfilePicture: '...', text: 'foobar', timestamp: ... }

So when few stories are created and most operations are reads, approach 1 would be expensive, because we pay for two reads to display a story. Approach 2 would be more cost-efficient.
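
To make the read counts of the two approaches concrete, here is a minimal in-memory sketch. It does not talk to Firebase: the `db` object, the paths `users`, `storiesById` and `storiesDenormalized`, and the `read` helper are all hypothetical stand-ins, and the timestamp is an arbitrary number.

```javascript
// In-memory stand-in for the database tree; every path and value is illustrative.
const db = {
  users: {
    XYZ: { firstname: "sandra", lastname: "adams", smallProfilePicture: "small.png" },
  },
  // Approach 1: the story only references its author.
  storiesById: {
    s1: { user_id: "XYZ", text: "foobar", timestamp: 1547909400000 },
  },
  // Approach 2: the story carries a copy of the author fields it displays.
  storiesDenormalized: {
    s1: {
      user_id: "XYZ", firstname: "sandra", lastname: "adams",
      smallProfilePicture: "small.png", text: "foobar", timestamp: 1547909400000,
    },
  },
};

let reads = 0;

// Hypothetical read helper: fetches one location and counts the request.
function read(path) {
  reads += 1;
  return path.split("/").reduce((node, key) => node[key], db);
}

// Approach 1: one read for the story, a second one for its author.
function loadStoryApproach1(storyId) {
  const story = read(`storiesById/${storyId}`);
  const user = read(`users/${story.user_id}`);
  return {
    ...story,
    firstname: user.firstname,
    lastname: user.lastname,
    smallProfilePicture: user.smallProfilePicture,
  };
}

// Approach 2: a single read returns everything needed to render the story.
function loadStoryApproach2(storyId) {
  return read(`storiesDenormalized/${storyId}`);
}
```

Both functions return the same fields for rendering; the only difference is that the first one touches two locations and the second one touches a single location.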

I would like to hear your thoughts and ideas on this.

Answer 1:

I'm with Jay here: you pretty much got all of it in your question already. Great summary of the practices we recommend when using Firebase Database.

Your question boils down to: should I duplicate my user profile information into each story? Unfortunately there's no single answer for that.

Most developers I see will keep the profile information separate and just keep the user UID in the post as an unmanaged foreign key. This has the advantage of needing to update the user profile in only one place when it changes. The performance to read a single story is not too bad: the two reads are relatively fast, since they go over the same connection. When you're showing a list of stories, it is surprisingly fast, since Firebase pipelines the requests over its single connection.
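
On the client, benefiting from that pipelining mostly means starting all the user reads before waiting for any of them. A rough sketch, assuming a hypothetical `fetchUserFn(uid)` that returns a promise for one profile (in a real app it would wrap a one-time read of that user's location):

```javascript
// Attach author profiles to a list of stories.
// fetchUserFn is a hypothetical async function: uid -> Promise of a profile object.
async function attachAuthors(stories, fetchUserFn) {
  // One read per distinct author, all started before any of them is awaited.
  const uids = [...new Set(stories.map((story) => story.user_id))];
  const profiles = await Promise.all(uids.map((uid) => fetchUserFn(uid)));
  const byUid = new Map(uids.map((uid, i) => [uid, profiles[i]]));
  return stories.map((story) => ({ ...story, author: byUid.get(story.user_id) }));
}
```

Awaiting each profile inside a loop would instead serialize the requests, which is the pattern to avoid here.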

But one of the first bigger implementations I helped with actually duplicated the user data over the stories. As you said: reading a story or list of stories is as fast as it can be in that case. When asked how they dealt with keeping the user information up to date in the stories (see strategies here), they admitted they didn't. In fact: they argued many good reasons why they needed the historical user information for each story.
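
If you do want the copies kept up to date, one possible strategy is a fan-out write: collect every path that holds the duplicated field into one update object. The sketch below only builds that object. The paths `users/...` and `stories/...` are assumptions, and so is the `storyIds` argument: you would need some index of a user's stories to know which copies exist.

```javascript
// Build a single multi-location update that rewrites the profile and every copy of it.
// Paths and the storyIds list are hypothetical; adapt them to your own tree.
function buildProfileFanOut(uid, storyIds, changes) {
  const updates = {};
  for (const [field, value] of Object.entries(changes)) {
    // The canonical profile.
    updates[`users/${uid}/${field}`] = value;
    // Every story that carries a copy of this field.
    for (const storyId of storyIds) {
      updates[`stories/${storyId}/${field}`] = value;
    }
  }
  return updates;
}

const updates = buildProfileFanOut("XYZ", ["s1", "s2"], { smallProfilePicture: "new.png" });
```

With the Firebase JavaScript SDK, an object keyed by paths like this can be passed to `update()` on the root reference, so that all locations are written in one operation. The cost grows with the number of stories per user, which is the "expensive writes" side of the trade-off.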

In the end, it all depends on your use-case. You'll need to answer questions such as:

  • Do you need the historical information for each user?
  • Is it crucial that you show the up-to-date information for a user in older posts?
  • Can you come up with a good caching strategy for the user profiles in your client-side code?
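
For the last question, a very small client-side cache can already go a long way. This is only a sketch under assumptions: `fetchUserFn` is a hypothetical function returning a promise for one profile, and nothing here listens for profile changes, so entries stay as they were until `invalidate` is called.

```javascript
// Minimal profile cache: remembers the pending or completed read per UID,
// so that many stories by the same author trigger only one fetch.
function createProfileCache(fetchUserFn) {
  const cache = new Map();
  return {
    get(uid) {
      if (!cache.has(uid)) {
        // Store the promise itself, so concurrent callers share one request.
        // Note: a failed read also stays cached until invalidate(uid) is called.
        cache.set(uid, fetchUserFn(uid));
      }
      return cache.get(uid);
    },
    invalidate(uid) {
      cache.delete(uid);
    },
  };
}
```

How long a cached profile may be shown before it counts as stale ties back to the second question above.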