I'm weighing up having separate DBs (one per company) vs one multi-tenanted DB (with all companies). Criteria:
- A user can belong to one company only and can't access documents of other companies.
- An administrator of the system needs to maintain DBs for all firms.
- Number of companies/tenants - from hundreds to tens of thousands
- There is one entry point with authentication for all companies/tenants (it'll resolve the tenant and route the request to the right DB).
Question #1. Are there any "good practices" for designing a multi-tenanted database in RavenDB?
There is a similar post for MongoDB. Would it be the same for RavenDB? More records will affect indexes, but would it potentially make some tenants suffer from active usage of an index by other tenants?
If I were to design a multi-tenanted DB for RavenDB, then I see the implementation as
- have a Tag per Company/Tenant, so all users of one company have permission to the company tag and all top-level documents have the tag (see KB on Auth Bundle)
- have a Tenant ID tag as a prefix for each Document ID (due to the official recommendation to use sequential identifiers and I'm happy with generating IDs on the server)
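To make the second point concrete, this is roughly what I have in mind for the ID prefix. It is only a sketch: `TenantDocumentId`, its methods and the `tenant/collection/number` layout are my own assumptions, not anything prescribed by RavenDB.

```csharp
using System;

// Hypothetical helper: composes and checks tenant-prefixed document IDs.
public static class TenantDocumentId
{
    // Builds an ID such as "acme/invoices/42" from its three parts.
    public static string Create(string tenantId, string collection, long number)
    {
        if (string.IsNullOrWhiteSpace(tenantId))
            throw new ArgumentException("Tenant ID is required.", nameof(tenantId));
        if (string.IsNullOrWhiteSpace(collection))
            throw new ArgumentException("Collection name is required.", nameof(collection));

        return $"{tenantId}/{collection}/{number}";
    }

    // True when the ID starts with "<tenantId>/", i.e. it was created for that tenant.
    public static bool BelongsTo(string documentId, string tenantId)
    {
        return documentId.StartsWith(tenantId + "/", StringComparison.Ordinal);
    }
}
```

With a prefix like this, a cheap string check on the ID is enough to reject a load of another tenant's document before any permission lookup happens.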
Question #2.1. Is tagging the best way to utilise the Authorization Bundle for resolving users' permissions and preventing access to documents of other tenants?
Question #2.2. How important is it to have the Tenant ID in the ID prefix of top-level documents? I guess the main consideration here is performance once permissions get resolved via tags, or am I missing something?
If you are going to have a few hundred companies, then a db per company is fine. If you are going to have tens of thousands, then you want to put it all in a single db.
A db can consume a non-trivial amount of resources, and having a LOT of them can be a lot more expensive than a single larger db.
I would recommend not using the authorization bundle, it requires us to do an O(N) filtering. It is better to add `TenantId = XYZ` in the query directly, maybe through a query listener.
Don't worry too much about sequential identifiers. They have an impact, but they aren't THAT important unless you are generating tens of thousands per second.
See an example of the listeners to handle multi-tenancy.
A query listener to add the current Tenant ID to all queries (filter out entries from other tenants):
A store listener to set the current Tenant ID to all tenanted entities:
The interface, implemented by top-level entities supporting multi-tenancy:
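The logic behind the three pieces above can be sketched without any RavenDB dependency. This is an illustration only, not the actual listener classes: `Invoice`, `TenantHooks`, `BeforeStore` and `ForTenant` are made-up names, and a real implementation would plug the same two checks into RavenDB's store and query listener hooks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Implemented by top-level documents that belong to a tenant.
public interface ITenantedEntity
{
    string TenantId { get; set; }
}

// Hypothetical document type, used only for this illustration.
public class Invoice : ITenantedEntity
{
    public string Id { get; set; }
    public string TenantId { get; set; }
    public decimal Total { get; set; }
}

public static class TenantHooks
{
    // What a store listener would do: stamp the current tenant on the entity
    // before it is saved, and refuse to touch another tenant's document.
    public static void BeforeStore(object entity, string currentTenantId)
    {
        if (!(entity is ITenantedEntity tenanted))
            return; // not a tenanted type, leave it alone

        if (tenanted.TenantId != null && tenanted.TenantId != currentTenantId)
            throw new InvalidOperationException("Entity belongs to another tenant.");

        tenanted.TenantId = currentTenantId;
    }

    // What a query listener would do: add "TenantId = <current>" to a query
    // over a tenanted type, so entries of other tenants are filtered out.
    public static IQueryable<T> ForTenant<T>(this IQueryable<T> query, string currentTenantId)
        where T : ITenantedEntity
    {
        return query.Where(e => e.TenantId == currentTenantId);
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Invoice { Id = "invoices/1", Total = 10m };
        var b = new Invoice { Id = "invoices/2", Total = 20m };
        TenantHooks.BeforeStore(a, "acme");
        TenantHooks.BeforeStore(b, "globex");

        // In-memory stand-in for a query; only the "acme" invoice is returned.
        var visible = new List<Invoice> { a, b }.AsQueryable().ForTenant("acme").ToList();
        foreach (var invoice in visible)
            Console.WriteLine(invoice.Id);
    }
}
```

The point of doing both is that the store hook guarantees every tenanted document carries a `TenantId`, which is what makes the query-side filter safe to apply blindly.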
My attempt to engage @AyendeRahien in a discussion of the technical implementation by editing his post was unsuccessful :), so below I'll address my concerns from the above:
1. Multi-tenanted DB vs multiple DBs
Here are some of Ayende's thoughts on multi-tenancy in general.
In my view the question boils down to
Simply put, in the case of a couple of tenants with a huge number of records, adding the tenant information to the indexes will unnecessarily increase the index size, and handling the tenant ID will bring some overhead you'd rather avoid, so go for two DBs then.
2. Design of multi-tenanted DB
Step #1. Add a `TenantId` property to all persistent documents you want to support multi-tenancy.
Step #2. Implement a facade for Raven's session (`IDocumentSession` or `IAsyncDocumentSession`) to take care of multi-tenanted entities. Sample code below:
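As a rough illustration of the facade idea (a sketch under assumptions, not a drop-in implementation): `IRawSession` is a hypothetical stand-in for the small part of `IDocumentSession` used here, and `TenantedSession`, `InMemorySession` and `Note` are made-up names. The in-memory session exists only so the example runs without a server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface ITenantedEntity
{
    string TenantId { get; set; }
}

// Hypothetical stand-in for the part of IDocumentSession the facade wraps.
public interface IRawSession
{
    void Store(object entity, string id);
    T Load<T>(string id) where T : class;
    IQueryable<T> Query<T>();
}

// Facade: every call goes through the tenant check before reaching the session.
public class TenantedSession
{
    private readonly IRawSession _inner;
    private readonly string _tenantId;

    public TenantedSession(IRawSession inner, string tenantId)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _tenantId = tenantId ?? throw new ArgumentNullException(nameof(tenantId));
    }

    // Stamps the current tenant on tenanted entities before storing them.
    public void Store(object entity, string id)
    {
        if (entity is ITenantedEntity tenanted)
        {
            if (tenanted.TenantId != null && tenanted.TenantId != _tenantId)
                throw new InvalidOperationException("Entity belongs to another tenant.");
            tenanted.TenantId = _tenantId;
        }
        _inner.Store(entity, id);
    }

    // Behaves as if the document did not exist when it belongs to another tenant.
    public T Load<T>(string id) where T : class
    {
        var entity = _inner.Load<T>(id);
        if (entity is ITenantedEntity tenanted && tenanted.TenantId != _tenantId)
            return null;
        return entity;
    }

    // Adds the tenant filter to queries over tenanted types only.
    public IQueryable<T> Query<T>()
    {
        var query = _inner.Query<T>();
        if (typeof(ITenantedEntity).IsAssignableFrom(typeof(T)))
            query = query.Where(e => ((ITenantedEntity)(object)e).TenantId == _tenantId);
        return query;
    }
}

// In-memory session used only to make the sketch runnable.
public class InMemorySession : IRawSession
{
    private readonly Dictionary<string, object> _docs = new Dictionary<string, object>();

    public void Store(object entity, string id) => _docs[id] = entity;

    public T Load<T>(string id) where T : class
        => _docs.TryGetValue(id, out var doc) ? doc as T : null;

    public IQueryable<T> Query<T>() => _docs.Values.OfType<T>().AsQueryable();
}

// Hypothetical document type for the example.
public class Note : ITenantedEntity
{
    public string TenantId { get; set; }
    public string Text { get; set; }
}
```

The cast inside `Query<T>()` works for in-memory LINQ; whether a real LINQ provider translates it is an assumption that would need checking, and a `where T : ITenantedEntity` overload may be the safer route there.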
The code above may need some love if you need `Include()` as well.
My final solution doesn't use listeners for RavenDB v3.x as I suggested earlier (see my comment on why) or events for RavenDB v4 (because it's hard to modify the query in there).
Of course, if you write patches of JavaScript functions, you'd have to handle multi-tenancy manually.