SaaS / Multi-Tenancy approaches for Java-based (GW

2019-03-07 17:41发布

问题:

I am currently looking into converting a single-tenant Java based web-app that uses Spring, GWT, Hibernate, Jackrabbit, Hibernate Search / Lucene (among others) into a fully fledged SaaS style app.

I stumbled across an article that highlights the following 7 "things" as important changes to make to a single tenant app to make it an SaaS app:

  1. The application must support multi-tenancy.
  2. The application must have some level of self-service sign-up.
  3. There must be a subscription/billing mechanism in place.
  4. The application must be able to scale efficiently.
  5. There must be functions in place to monitor, configure, and manage the application and tenants.
  6. There must be a mechanism in place to support unique user identification and authentication.
  7. There must be a mechanism in place to support some level of customization for each tenant.

My question is has anyone implemented any of the above 7 things in a SaaS /multi-tenant app using similar technologies to those that I have listed? I am keen to get as much input regarding the best ways to do so before I go down the path that I am currently considering.

As a start I am quite sure that I have a good handle on how to handle multiple tenants at a model level. I am thinking of adding a tenant ID to all of our tables and then using a Hibernate filter (and a Full Text Filter for Hibernate Search) to filter based on the logged on user's tenant ID for all queries.

I do however have some concerns around performance as well especially when our number of tenants grows quite high.

Any suggestions on how to implement such a solution will be greatly appreciated (and I apologise if this question is a bit too open-ended).

回答1:

I would recommend that you architect your application to support all the 4 types of tenant isolation namely separate database for each tenant, separate schema for each tenant, separate table for each tenant and shared table for all tenants with a tenant ID. This will give you the flexibility to horizontally partition your database as you grow, having multiple databases each having a group of smaller tenants and also the ability to have a separate database for some large tenants. Some of your large tenants could also insist that their data (database) should reside in their premise, while the application can run off the cloud.

Here is an exaustive check list of non-functional and infrastructure level features that you may want to consider while architecting your application (some of them you may not need immediately, but think of a business situation of how you will handle such a need if your competition starts offering it)

  1. tenant level customization of a) UI themes and logos b) forms and grids, c) data model extensions and custom fields, d) notification templates, e) pick up lists and master data
  2. tenant level creation and administration of roles and privileges, field level access permissions, data scope policies
  3. tenant level access control settings for modules and features, so that specific modules and features could be enabled / disabled depending on the subscription package.
  4. Metering and monitoring of tasks / events / transactions and restriction of access control once the purchased quota is exceeded. The ability to meter any new entity in the future if and when your business model changes.
  5. Externalising the business rules and workflows out of your code base and representing them as meta data, so that you can customize them for each tenant group / tenant.
  6. Query builder for creating custom reports that is aware of the tenant as well as custom fields added by specific tenants.
  7. Tenant encapsulation and framework level connection string management such that your developers do not have to worry about tenant IDs while writing queries.

All these are based on our experience in building a general purpose multi-tenant framework that can be used for any domain or application. Unfortunately, you cannot use our framework as it is based on .NET

But the engineering needs of any multi-tenant SaaS product (new or migrated) are the same irrespective of the technology stack that you use.



回答2:

All of the technologies that you listed are quite common and reasonable for both single- and multi-tenant applications. I'd say supporting the 7 "things" for SaaS is much more of a function of how you use the technologies than which. It sounds like you already have a single-tenant application that works. So there's probably not much reason to deviate from the technology selections there unless something is just not working very well already. Your question is otherwise fairly open-ended though, so it's hard to be too much more specific there.

I do have some feedback on splitting the database (and perhaps other things) by tenant ID though. If you know you might eventually have a lot of tenants (say many thousands or more, particularly if they're small) then what you suggest is perhaps best. If however you'll have a smaller number of tenants (particularly if they're large) you might want to consider a database per tenant, so they each have their own table space. By that I mean a single database installation with multiple instances of the same schema inside of it, one per tenant.

There are a few reasons this can be an advantage. One is performance as you mentioned. Adding a tenant ID to every single table is overhead on disk access, query time and increases code complexity. Every index in the database will need to include the tenant ID as well. You run an additional risk of mixing data between tenants if you're not careful (although a Hibernate filter would help mitigate that). With a database per tenant you could restrict access to only the correct one. Porting your current application will probably be a lot easier too, you basically just need to intercept your request somewhere early to decide the tenant based on the URL and point to the right database. Backups are also easy to do per tenant, particularly useful if you ever intend on allowing them to download a backup.

On the other hand there are reasons not to do this. You'll have a lot of database schemas to deal with and they'll have to be updated independently (which can actually be an advantage if you want to avoid taking all tenants down for a schema change, you can roll them out incrementally). It lets you have special cases that could deviate from treating the platform as a true multi-tenant SaaS deployment that's upgraded all at once, resulting in management of multiple versions in production. Lastly I've heard there is a breaking point with just about every database vendor out there in the number of schema instances they'll support in one installation (supposedly some can go to hundreds of thousands though).

It really depends on your use case of course. You mentioned single-tenant which leads me to believe you don't have too many tenants right now, however you do mention growing to lots of tenants. I'm not sure if you mean hundreds or millions, yet either way I hope this helps some with your considerations. Best of luck!



回答3:

There is no simple answer. I can describe my own solution. It may serve as an inspiration for the others.

  • tenant per database (postgres)
  • one additional database shared between tenants
  • Spring + MyBatis
  • Spring Security authentication

Details here: http://blog.trixi.cz/2012/01/multitenancy-using-spring-and-postgresql/



回答4:

For (1): Hibernate supporting multi-tenant configurations out of the box from version 4. At the moment of writing supported are DB-per-tenant and schema-per-tenant and keeping all tenants in a same DB using discriminator is not yet supported. We have used this functionality successfully in our application (DB-per-client approach).

For (3): After some investigation done we decided to go with Braintree to implement billing. Another solutions many people recommend: Authorize.net, Stripe, PayPal.

For (4): We have used clustered configuration with Hibernate/Spring and JBoss Cache for 2nd level caching. At these days this became "common" and using PaaS services like Jelastic you can even get it pre-configured out of the box.



回答5:

What you describe is a full service Saas style application serving multiple tenants. There are a few things you have to decide like how critical is data isolation? If you are building for a medical or financial domain, data isolation is a critical factor.

Well, I cannot help answer all your points, but I would suggest looking at database-per-tenant approach for your application as it provides the highest level of data isolation.

Since you are using the Java, Spring, Hibernate stack, I can help you with a small example application I wrote. It is a working example which you can quickly run in your local laptop. I have shared it here. Do take a look and let me know if it answers some of your questions.