Assume that I want to write a blogging app. Should I prefer one of the following two options? I would like to keep a "single source of truth" as far as possible, but I am not sure whether that preference is just a habit from my background in SQL.
Option 1 (Denormalization):
Posts: {
  post_1: {
    title: "hello",
    body: "hi there!",
    uid: "user_1",
    comments: {
      comment_1: {
        body: "hi I commented",
        uid: "user_2",
      },
      comment_2: {
        body: "bye I commented",
        uid: "user_2",
      },
    }
  }
}
Users: {
  user_1: {
    uid: "user_1",
    post_1: {
      title: "hello",
      body: "hi there!",
      uid: "user_1",
      comments: {
        comment_1: {
          body: "hi I commented",
          uid: "user_2",
        },
        comment_2: {
          body: "bye I commented",
          uid: "user_2",
        },
      }
    }
  }
}
Option 2 (Indexing):
Posts: {
  post_1: {
    title: "hello",
    body: "hi there!",
    uid: "user_1",
    authorName: "Richard",
    comments: {
      comment_1: true,
      comment_2: true
    }
  }
}
Users: {
  user_1: {
    uid: "user_1",
    displayName: "Richard",
    email: "richard@gmail.com",
    posts: {
      post_1: true
    },
    comments: {
      comment_1: true,
      comment_2: true
    }
  }
}
Comments: {
  comment_1: {
    body: "hi I commented",
    uid: "user_1",
  },
  comment_2: {
    body: "bye I commented",
    uid: "user_1",
  },
}
I think I should prefer option 2.
The main problem I see with option 1 is that the same data lives in too many places. Say I want to extend the app so that each post belongs to a certain category or tag. Then I will have to write a post object under /categories/category_id in addition to /posts and /users/uid. When the post gets updated, I have to remember to modify the post object in three different places. If I go with option 2, I don't have this problem because there is only one source for the data.
Am I missing anything?
The second option is better. With option 1, reading a post or a user forces the client to download every post and comment nested under it, and that can be a lot of data.
You can check in the documentation here.
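To make the download point concrete, here is a minimal TypeScript sketch of how the option 2 layout is read. It uses plain in-memory objects as a stand-in for the database, and the names `BlogPost`, `BlogComment`, `commentIdsForPost` and `loadComments` are made up for illustration, not part of any SDK.

```typescript
// In-memory stand-in for the option 2 tree. A real app would read these
// paths from the database instead of from local objects.
type IdIndex = Record<string, true>;

interface BlogPost {
  title: string;
  body: string;
  uid: string;
  comments: IdIndex; // only comment IDs, no comment bodies
}

interface BlogComment {
  body: string;
  uid: string;
}

const postsById: Record<string, BlogPost> = {
  post_1: {
    title: "hello",
    body: "hi there!",
    uid: "user_1",
    comments: { comment_1: true, comment_2: true },
  },
};

const commentsById: Record<string, BlogComment> = {
  comment_1: { body: "hi I commented", uid: "user_1" },
  comment_2: { body: "bye I commented", uid: "user_1" },
};

// Reading a post yields only the IDs of its comments.
function commentIdsForPost(postId: string): string[] {
  const post = postsById[postId];
  return post ? Object.keys(post.comments) : [];
}

// Comment bodies are looked up separately, and only when they are needed.
function loadComments(ids: string[]): BlogComment[] {
  const result: BlogComment[] = [];
  for (const id of ids) {
    const comment = commentsById[id];
    if (comment) {
      result.push(comment);
    }
  }
  return result;
}
```

The cost is a second lookup per comment, but a client that only shows a list of post titles never pays for the comment bodies.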
You can also handle the duplication by doing atomic writes across multiple locations.
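As a rough sketch of what such a multi-location write could look like, the TypeScript below builds one path-to-value map that covers every copy of a post from option 1. The helper name `buildPostFanOut` and the `/categories/<categoryId>/<postId>` layout are assumptions for illustration. The map is built per field so that nested comments under each copy are not overwritten.

```typescript
// Hypothetical helper: collects every location that holds a copy of the
// post and produces one flat map of "path -> new value".
function buildPostFanOut(
  postId: string,
  uid: string,
  changes: Record<string, string>,
  categoryId?: string
): Record<string, string> {
  // Locations of the duplicated post in the option 1 layout.
  const roots: string[] = [`/posts/${postId}`, `/users/${uid}/${postId}`];
  if (categoryId !== undefined) {
    // Assumed layout for the category copy mentioned in the question.
    roots.push(`/categories/${categoryId}/${postId}`);
  }

  const updates: Record<string, string> = {};
  for (const root of roots) {
    for (const field of Object.keys(changes)) {
      const value = changes[field];
      if (value !== undefined) {
        // Field-level paths leave sibling data (e.g. comments) untouched.
        updates[`${root}/${field}`] = value;
      }
    }
  }
  return updates;
}

const titleUpdates = buildPostFanOut(
  "post_1",
  "user_1",
  { title: "hello again" },
  "category_1"
);

// With the Firebase web SDK the whole map would go into a single call, e.g.
//   update(ref(db), titleUpdates)                    // modular API
//   firebase.database().ref().update(titleUpdates)   // namespaced API
// so that all three copies change together or not at all.
```

Keeping the list of locations in one helper means there is a single place to extend when a new copy of the post is introduced.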