According to the DynamoDB docs: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-general-nosql-design.html
"You should maintain as few tables as possible in a DynamoDB application. Most well designed applications require only one table."
But in my experience, partition key design always forces you to do the opposite.
Consider the following situation. We have several user roles, for example "admin", "manager" and "worker". The usual workflow of an admin is to CRUD manager data, where the read operation fetches not a single manager but the whole manager list. The same goes for the manager, who CRUDs worker data. In both cases there are only two key usage scenarios:
- get a list of all items (item key doesn't matter)
- work with a particular item using its full key.
Naturally, we want a uniformly distributed partition key (as the docs emphasise), so we can't use the user role for it and should use the user id instead. Since the partition key is already a random id, a sort key is useless: the partition key alone already identifies exactly one user. At this point we realize that the user id works like a charm for CUD operations, but every R operation has to scan the whole table and then filter the result by user role, which is inefficient. How can this be improved? Very naturally: give each user type its own table! Then the admin API scans the manager table for the manager list, and the manager API scans the worker table for the worker list.
I have used DynamoDB for almost a year and still can't get it. In my experience, a sort key is something you can practically never use in real-life scenarios. The only real case I had for it was accessing items like "agreements" that belong to two users of different types at the same time: the primary key was { partition: "managerId", sort: "userId" } and the global secondary index was { partition: "userId", sort: "managerId" }, so I could efficiently query all agreements of a particular manager or all agreements of a particular user by providing only the corresponding manager or user id. The approach is discussed in the docs here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-adjacency-graphs.html
I feel that I don't understand the concept at all. What would be an effective key schema for the example above that uses only one DynamoDB table for both user types?
It sounds like what you need in this case is a Global Secondary Index (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html) where the partition key is the user role. That way, you can query all users with a particular role through that UserRoleIndex and, with the help of a sort key on the user ID, single out one particular user within that role.
Alternatively, if you are starting from scratch with a new table, you might not even need an index (unless you don't know the role of a user when you delete them). You can use a "composite primary key" (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html#HowItWorks.CoreComponents.PrimaryKey) where the partition key and the sort key would be the same as in the index I am suggesting above.
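To make the index option concrete, here is a minimal sketch in Python. It only builds the request parameters as plain dictionaries; the table name "Users" and the attribute names userId and userRole are my assumptions, not something taken from your setup. The optional helper at the end shows how these dictionaries could be passed to boto3's low-level client, and it is not run automatically because it needs AWS credentials.

```python
# Sketch only: the table name and attribute names are assumptions.
TABLE_DEFINITION = {
    "TableName": "Users",
    "AttributeDefinitions": [
        {"AttributeName": "userId", "AttributeType": "S"},
        {"AttributeName": "userRole", "AttributeType": "S"},
    ],
    # Base table: the random user id is the partition key, no sort key.
    "KeySchema": [{"AttributeName": "userId", "KeyType": "HASH"}],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "UserRoleIndex",
            # Index: role as partition key, user id as sort key.
            "KeySchema": [
                {"AttributeName": "userRole", "KeyType": "HASH"},
                {"AttributeName": "userId", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}


def list_by_role_request(role):
    """Build Query parameters that list every user with the given role."""
    return {
        "TableName": TABLE_DEFINITION["TableName"],
        "IndexName": "UserRoleIndex",
        "KeyConditionExpression": "userRole = :role",
        "ExpressionAttributeValues": {":role": {"S": role}},
    }


def run_against_aws():
    """Optional: create the table and list managers (needs AWS credentials)."""
    import boto3

    client = boto3.client("dynamodb")
    client.create_table(**TABLE_DEFINITION)
    client.get_waiter("table_exists").wait(TableName="Users")
    # Returns the first page only; follow LastEvaluatedKey for more.
    return client.query(**list_by_role_request("manager"))["Items"]
```

CUD operations keep using userId alone against the base table, and only the "list all managers" read goes through the index.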
Using the same notation that you used in your question, I would recommend { partition: "userRole", sort: "userId" }.
DynamoDB can be hard to understand sometimes, and there definitely are cases where a traditional SQL database makes more sense. This video from AWS re:Invent 2018 is great for understanding the difference between the two: https://www.youtube.com/watch?v=HaEPXoXVf2k&feature=youtu.be
In your case, though, it looks like you have a very clear access pattern, so DDB would work for you.
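As a rough illustration of the composite primary key variant { partition: "userRole", sort: "userId" }, the sketch below builds the parameters for the two access patterns from the question. The table name "Users" and the helper names are hypothetical; the dictionaries follow the shape expected by the Query, GetItem and DeleteItem operations of the low-level DynamoDB API.

```python
TABLE_NAME = "Users"  # hypothetical table name

# Composite primary key: role is the partition key, user id the sort key.
KEY_SCHEMA = [
    {"AttributeName": "userRole", "KeyType": "HASH"},
    {"AttributeName": "userId", "KeyType": "RANGE"},
]


def full_key(role, user_id):
    """Both key parts are required to address exactly one item."""
    return {"userRole": {"S": role}, "userId": {"S": user_id}}


def list_role_request(role):
    """Query parameters: all users of one role, no scan or filter needed."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "userRole = :role",
        "ExpressionAttributeValues": {":role": {"S": role}},
    }


def single_user_request(role, user_id):
    """Parameters usable for GetItem or DeleteItem on one user."""
    return {"TableName": TABLE_NAME, "Key": full_key(role, user_id)}
```

Note that single_user_request needs the role as well as the id, which is the caveat mentioned above: without an index you have to know a user's role to read, update or delete them.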
you can have a schema like
where
This way, you can get the list of all managers by querying the GSI.
With a GSI, DynamoDB creates and maintains what is effectively another table for you, so you don't need to maintain multiple tables yourself.
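To illustrate what "maintains it for you" means, here is a toy in-memory model in Python. It is not DynamoDB and not a client for it; the class name, method names and the userId / userRole attributes are all made up for this sketch. It only shows the idea that every write to the single base table also updates a second view grouped by role.

```python
class TableWithRoleIndex:
    """Toy model of one table plus one role index; not DynamoDB itself."""

    def __init__(self):
        self._items = {}    # base table: userId -> item
        self._by_role = {}  # index view: userRole -> {userId: item}

    def put(self, item):
        # Remove any previous version so the index never holds stale entries.
        self.delete(item["userId"])
        self._items[item["userId"]] = item
        self._by_role.setdefault(item["userRole"], {})[item["userId"]] = item

    def get(self, user_id):
        return self._items.get(user_id)

    def delete(self, user_id):
        old = self._items.pop(user_id, None)
        if old is not None:
            del self._by_role[old["userRole"]][user_id]

    def query_role(self, role):
        # Items of one role, ordered by user id (the index sort key).
        users = self._by_role.get(role, {})
        return [users[uid] for uid in sorted(users)]
```

The caller only ever writes to one table, and query_role plays the part of a query against the index. One difference worth keeping in mind: this toy updates the view synchronously, whereas a real GSI is updated asynchronously, so reads from it are eventually consistent.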
Let me know if you have any questions.