I've seen SaaS applications hosted in many different ways. Is it a good idea to split features and modules across multiple databases? For example, putting things like the User table on one DB and feature/app specific tables on another DB and perhaps other commonly shared tables in another DB?
相关问题
- How to better organise database to account for cha
- Shared common definitions across C/C++ (unmanaged)
- Database best practices - Status
- Designing lossless-join, dependency preserving, 3N
- Use of empty strings instead of null - a good prac
Why to use database at all ?
I think it's good idea to use distributed storage systems like Hadoop, Voldemort (project-voldemort.com developed and used by LinkedIn).
I think db good for sensetive data like money operations , but for everything else you can use distributed storages.
Keep it a natural design (denormalize as much as needed, normalize as less as required). Split the DB Model into its modules and keep the service oriented principles in mind by fronting data with a service (that owns the data).
High Scalability is a good blog for scaling SaaS applications. As mentioned, splitting tables across databases as you suggested is generally a bad idea. But a similar concept is sharding, where you keep the same (or similar) schema, but split the data on multiple servers. For example, users 1-5000 are on server1, and users 5000-10000 on server2. Depending on the queries your application uses, it can be an efficient way to scale.
Splitting the database by features might not be a good idea unless you see strong evidence suggesting the need. Often you might need to update two databases as part of a single transactions - and distributed transactions are much more harder to work with. Furthermore, if the database needs to be split, you might be able to employ sharding.
Ask yourself: What do you gain by moving everything into separate databases?
A lot of pain in terms of management would be my guess. I'd be more keen personally to have everything in a single database and if you hit issues that cannot be solved by a single database later then migrate the data into multiple databases.
Having a single database is best for data integrity because then you can use foreign keys. You can't have this built-in data integrity if you split the data into multiple databases. This isn't an issue if your data isn't related, but if it is related, it would be possible for your one database to contain data that is inconsistent with another database. In this case, you would need to write some code that scans your databases for inconsistent data on a regular basis so you can handle it appropriately.
However, multiple databases may be necessary if you need your site/application to be highly scalable (e.g. internet scale). For example, you could host each database on a different physical server.