Tracking changes in complex object graph

Posted 2019-01-22 22:49

Question:

I started to think about tracking changes in a complex object graph in a disconnected application. I have already found several solutions, but I would like to know whether there is a best practice, or which solution you use and why. I posted the same question to the MSDN forum but received only a single answer. I would like more answers so I can learn from the experience of other developers.

This question is related to .NET, so for answers with implementation details I prefer ones from the .NET world, but I think the problem is the same on other platforms.

The theoretical problem in my case is defined in a multi-layered architecture (not necessarily n-tier at the moment) as follows:

  • Layer of repositories using ORM to deal with persistence (ORM tool doesn't matter at the moment but it will most probably be Entity Framework 4.0 or NHibernate).
  • Set of pure classes (persistence-ignorant = POCO, the equivalent of POJO in the Java world) representing domain objects. Repositories persist those classes and return them as results of queries.
  • Set of Domain services working with domain entities.
  • Facade layer defining gateway to business logic. Internally it uses repositories, domain services and domain objects. Domain objects are not exposed - each facade method uses set of specialized Data transfer objects for parameter and return value. It is responsibility of each facade method to transform domain entity to DTO and vice-versa.
  • Modern web application which uses the facade layer and DTOs - I call this the disconnected application. The design can change in the future so that the facade layer is wrapped by a web service layer and the web application consumes those services => transition to 3-tier (web, business logic, database).

Now suppose that one of the domain objects is Order, which has Order details (lines) and related Orders. When the client requests an Order for editing, it can modify the Order, add, remove or modify any Order detail, and add or remove related Orders. All these modifications are done on data in the web browser - JavaScript and AJAX. So all changes are submitted in a single shot when the client pushes the save button. The question is how to handle these changes. The repository and ORM tool need to know which entities and relationships were modified, inserted or deleted. I ended up with two "best" solutions:

  1. Store the initial state of the DTO in a hidden field (or, in the worst case, in session). When receiving a request to save changes, create a new DTO based on the received data and a second DTO based on the persisted data. Merge those two and track the changes. Send the merged DTO to the facade layer and use the received information about changes to properly set up the entity graph. This requires some manual change tracking in the domain objects so that change information can be set up from scratch and later passed to the repository - this is the point I am not very happy with.

  2. Do not track changes in the DTO at all. When receiving modified data in the facade layer, create the modified entity and load the current state from the repository (generally an additional query to the database - this is the point I am not very happy with), then merge these two entities and let the entity proxy provided by the ORM tool track changes automatically (Entity Framework 4.0 and NHibernate allow this). Special care is needed for concurrency handling because the current state does not have to be the initial state the client started from.
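Both options come down to comparing a baseline graph with the submitted one; they differ only in where the baseline comes from (a stored snapshot in option 1, a fresh load in option 2). The following is a minimal, language-neutral sketch of that comparison, written in Python for brevity rather than .NET. All names here (`OrderDto`, `OrderLineDto`, `diff_order`, the `version` token) are hypothetical, related Orders are left out, and a real ORM merge would do more than this.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Change(Enum):
    ADDED = "added"
    MODIFIED = "modified"
    DELETED = "deleted"
    UNCHANGED = "unchanged"


@dataclass
class OrderLineDto:
    id: Optional[int]  # None means the line was created on the client
    product: str
    quantity: int


@dataclass
class OrderDto:
    id: int
    version: int  # hypothetical optimistic-concurrency token
    customer: str
    lines: List[OrderLineDto] = field(default_factory=list)


class StaleOrderError(Exception):
    """The baseline is not the state the client started editing from."""


def diff_order(baseline: OrderDto, submitted: OrderDto) -> Tuple[Change, Dict[tuple, Change]]:
    """Compare a baseline order with the submitted one, line by line."""
    # Option 2 loads the baseline at save time, so it may be newer than
    # what the client saw; the version check catches that case.
    if baseline.version != submitted.version:
        raise StaleOrderError(f"order {baseline.id} changed since it was loaded")

    header = Change.MODIFIED if baseline.customer != submitted.customer else Change.UNCHANGED

    remaining = {line.id: line for line in baseline.lines}
    changes: Dict[tuple, Change] = {}
    for index, line in enumerate(submitted.lines):
        if line.id is None or line.id not in remaining:
            # No matching baseline line: treat it as an insert.
            changes[("new", index)] = Change.ADDED
        else:
            old = remaining.pop(line.id)
            changes[("id", line.id)] = Change.MODIFIED if old != line else Change.UNCHANGED
    # Baseline lines the client did not send back were removed in the browser.
    for line_id in remaining:
        changes[("id", line_id)] = Change.DELETED
    return header, changes
```

The resulting change map is the information the repository needs in order to mark each entity as inserted, updated or deleted before handing the graph to the ORM.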

What do you think about that? What do you recommend?

I know that some of these challenges can be avoided by using caching on some application layers but that is something I don't want to use at the moment.

My interest in this topic goes even further. For example, suppose that the application moves to a 3-tier architecture and the client (web application) is not written in .NET, so the DTO classes can't be reused. Tracking changes on the DTO will then be much harder because it will require another development team to properly implement the tracking mechanism in their development tools.

I believe these problems have to be solved in plenty of applications, so please share your experience.

Answer 1:

It's all about responsibility.

(I'm not sure if this is the sort of answer you're after - let me know if it's not so I can update it).

So we have multiple layers in a system - each is responsible for a different task: data access, UI, business logic, etc. When we architect a system in this way we are (amongst other things) trying to make future change easy by making each component responsible for one task - so it can focus on that one task and do it well. It also makes it easier to modify the system as time passes and change is needed.

Similar thoughts need to be kept in mind when considering the DTO - "how to track changes?" for example. Here's how I approach it: the BL is responsible for managing rules and logic; given the stateless nature of the web (which is where I do most of my work) I'm just not tracking the state of an object and looking explicitly for changes. If a user is passing data back (to be saved / updated) I'll pass the whole lot back without caring what's been changed.

On one hand this might seem inefficient, but as the amounts of data aren't vast it's just not an issue; on the flipside, there are fewer "moving parts", so less can go wrong as the process is much simpler.

How do I pass the data back?

  • I use DTO's (or perhaps POCO's would be more accurate); when I exchange data between the BL and DAL (via interfaces / DI) the data is exchanged as a DTO (or collection of them). Specifically, I'm using a struct for a single instance and a collection of these structs for multiple.

  • The DTO's are defined in a common class that has very few dependencies.

  • I deliberately try to limit the number of DTO's I create for a specific object (like "Order") - but at the same time I'll make new ones if there is a good reason. Typically I'll have a "fat" DTO which contains most / all of the data available for that object, and I'll also probably have a much leaner one that's designed to be used in collections (for lists, etc). In both cases these DTO's are purely for returning info for "reading". You have to keep the responsibilities in mind - when the BL asks for data it's usually not trying to write data back at the same time; so the fact that the DTO is "read only" is more about conforming to a clean interface and architecture than a business rule.

  • I always define separate DTO's for Inserting and Updating - even if they share exactly the same fields. This way the worst that can happen is duplication of some trivial code - as opposed to having dependencies and multiple re-use cases to untangle.
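As an illustration of the DTO split described in the list above, here is a small sketch in Python rather than .NET. The class and field names are hypothetical, and frozen dataclasses stand in for the read-only structs mentioned earlier.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OrderDetails:
    """'Fat' read DTO: most of what is known about one order."""
    id: int
    customer: str
    status: str
    line_products: Tuple[str, ...]


@dataclass(frozen=True)
class OrderSummary:
    """Lean read DTO meant for lists and other collections."""
    id: int
    customer: str


@dataclass
class OrderInsert:
    """Write DTO for creating an order; no id exists yet."""
    customer: str


@dataclass
class OrderUpdate:
    """Write DTO for updating; kept separate even though it overlaps OrderInsert."""
    id: int
    customer: str


def to_summary(details: OrderDetails) -> OrderSummary:
    # Mapping is explicit, so the two read DTOs can evolve independently.
    return OrderSummary(id=details.id, customer=details.customer)
```

Because the read DTOs are frozen, an attempt to assign to one of their fields raises an error, which makes the "read only" contract visible at the interface rather than relying on convention.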

Finally - don't confuse how the DAL works with how the UI does; Let ORM's do their thing, just because they store the data in a given way doesn't mean it's the only way.

The most important thing is to specify meaningful interfaces between your layers.

Managing what's changed is the job of the BL; let the UI work in a way that's best for your users and let the BL figure out how it wants to deal with that, and the DAL (via your nice clean interface with DI) just does what it's told.
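One way to picture this division of responsibility is the sketch below (Python rather than .NET, with hypothetical names throughout). The BL validates the whole submitted DTO and hands it to a DAL interface injected through the constructor; nothing tracks which fields changed.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass
class OrderUpdate:
    id: int
    customer: str


class OrderStore(Protocol):
    """Hypothetical DAL interface; the BL depends only on this."""

    def replace(self, order: OrderUpdate) -> None: ...


class InMemoryOrderStore:
    """Stand-in DAL for the example; a real one would call the ORM."""

    def __init__(self) -> None:
        self.rows: Dict[int, OrderUpdate] = {}

    def replace(self, order: OrderUpdate) -> None:
        # The whole record is overwritten, changed or not.
        self.rows[order.id] = order


class OrderService:
    """BL: owns the rules, knows nothing about how data is stored."""

    def __init__(self, store: OrderStore) -> None:
        self._store = store

    def save(self, order: OrderUpdate) -> None:
        if not order.customer.strip():
            raise ValueError("customer is required")
        self._store.replace(order)
```

Swapping `InMemoryOrderStore` for an ORM-backed implementation would not touch `OrderService`, which is the point of keeping the interface between the layers narrow.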



Answer 2:

Our architecture is very similar to yours, but it uses a Silverlight client containing the same domain objects on client and server (more precisely, the code is shared). The key points of our architecture are, in short:

  • The client has the domain model and the changes are tracked using a tracking framework I implemented myself (it uses AOP so that I can use POCOs on the client side; I do not know of a framework for that and I want the domain model to be persistence-ignorant).
  • This whole model is stored in a kind of remote repository on the client. When we save those changes, a change tree is extracted (by my change tracking framework) and translated into DTOs (DataContracts, to be exact, but that doesn't matter) using assemblers. The DTOs have a tracking state flag (new, modified, deleted).
  • On the server side (the service layer is implemented by WCF web services) the DTOs are translated to domain objects and attached to the ORM (NHibernate in our case). This attachment process is why I need the tracking state. Then additional changes can be made and persisted via the ORM.
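The server-side step in the list above, where each DTO's tracking flag decides how it is attached, could look roughly like this. It is a sketch in Python with hypothetical names; `FakeSession` is only a stand-in for an ORM session and does not mirror NHibernate's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class TrackingState(Enum):
    NEW = "new"
    MODIFIED = "modified"
    DELETED = "deleted"


@dataclass
class OrderLineDto:
    id: int
    product: str
    quantity: int
    state: TrackingState  # set by the client-side change tracker


class FakeSession:
    """Stand-in for an ORM session, backed by a dict."""

    def __init__(self) -> None:
        self.rows: Dict[int, dict] = {}

    def insert(self, key: int, values: dict) -> None:
        if key in self.rows:
            raise KeyError(f"row {key} already exists")
        self.rows[key] = values

    def update(self, key: int, values: dict) -> None:
        if key not in self.rows:
            raise KeyError(f"row {key} does not exist")
        self.rows[key] = values

    def delete(self, key: int) -> None:
        del self.rows[key]


def apply_changes(session: FakeSession, dtos: Iterable[OrderLineDto]) -> None:
    """Dispatch each DTO to the operation its tracking flag asks for."""
    for dto in dtos:
        values = {"product": dto.product, "quantity": dto.quantity}
        if dto.state is TrackingState.NEW:
            session.insert(dto.id, values)
        elif dto.state is TrackingState.MODIFIED:
            session.update(dto.id, values)
        elif dto.state is TrackingState.DELETED:
            session.delete(dto.id)
```

Only changed objects need to travel over the wire in this scheme, at the cost of trusting the client to report the correct state for each one.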

Currently it is difficult to attach complex graphs to the ORM, but I hope the reason is that we do not have much experience with NHibernate.

We're not finished yet but it seems to be promising.

For data access we tried to use WCF Data Services. But I don't think that we are going to use them, because one of our requirements is to use DataContracts. That leads to a translation from DataContract-based LINQ queries to domain-object-based LINQ queries. That is not handy, and too difficult to implement if the domain model and the DataContracts differ very much (which will be the case in some iterations).

Any considerations?