I'm looking to convert a relatively new web-based application with a clear domain model over to more of a CQRS style system. My new application is essentially an enhanced replacement of an older existing system.
The existing systems in my organization share a set of common databases, which are updated by an untold number of applications (developed via the Chaos Method) that exist in silos throughout the company. (As it stands, I believe that no single person in the company can identify them all.)
My question is therefore about the read model(s) for my application. Since various status changes, general user data, etc. are updated by other applications outside my control, what's the best way to handle building the read models in such a way that I can deal with outside updates, but still keep things relatively simple?
I've considered the following so far:
- Create Views in the database for read models, that read all tables, legacy and new
- Add triggers to existing tables to update new read model tables
- Add some code to the database (CLR Stored proc/etc [sql server]) to update an outside datastore for read models
- Abandon hope
What is the general consensus on how to approach this? Is it folly to think I can bring order to a legacy system without fully rewriting everything from scratch?
I've used option #1 with success. Creating views to demoralize the data to create a read model is a viable option depending on the complexity of the write database(s). Meaning, if it is relatively straight forward joins that most developers can understand then I would take a closer look to see if it's viable for you. I would be careful with having too much complexity in these views.
Another thing to consider is periodic polling to build and update similar to a traditional reporting databases. Although not optimal in comparison to a notification, depending on how stale your read model can be, this might also be an option to look at.
I once was in a similar situation, the following steps was how i did it:
To improve the legacy system and achieve cleaner code base, the key is to take over the write responsibility. But don't be too ambitious as this may introduce interface/contract changing which makes the final deployment risky.
If all the write are fired through anything except direct sql updates, keep them backward compatible as possible as you can. Take them as adapters/clients of your newly developed command handlers.
Some of the write are direct sql updates but out of your control
Ask the team in charge if they can change to your new interface/contract?
If no, see step 3.
Ask if they can tolerate eventual consistency and are willing to replace the sql updates with database procedures?
If yes, put all the sql updates in the procedures and schedule a deployment and see step 4.
If no, maybe you should include them in your refactoring.
Modify the procedure, replace the sql updates with inserting events. And develop a backend job to roll the events and publish them. Make your new application subscribing these events to fire commands to your command handlers.
Emitting events from your command handlers and use them to update the tables that other applications used to consume.
Move to the next part of the legacy system.
If we had an infinitely powerful server, we wouldn't bother with view models and would instead just read from the basic entities tables. View models are meant to improve performance by preparing and maintaining an appropriate dataset for display. If you use a database View as a view model, you've really not gotten much of a performance gain over an adhoc query (if you ignore the preplanning that the sql parser can do for a view).
I've had success with a solution that's less intrusive than @Hippoom's solution, but more responsive than @Derek's. If you have access to the legacy system and can make minor code changes, you can add an asynchronous queue write to generate an event in a queueing system (RabbitMQ, Kafka, or whatever) in the legacy system repositories or wherever data is persisted. Making these writes asynch should not introduce any significant performance costs, and should the queue write fail it will not affect the legacy system. This change is also fairly easy to get through QA.
Then write an event driven piece that updates your read models. During the legacy system update phase (which can take a while), or if you only have access to some of the legacy systems that write to these databases, you can have a small utility that puts a new "UpdateViewModel" event in the queue every couple minutes. Then you would get timely events when the legacy systems save something significant, but are also covered for the systems that you are not able to update.