Best way to handle dirty state in an ORM model

Posted 2019-01-11 21:18

Question:

I don't want anyone saying "you should not reinvent the wheel, use an open source ORM"; I have an immediate requirement and cannot switch.

I'm writing a small ORM that supports caching. Even without caching I would need this feature anyway, to know whether an object has to be written to storage or not. The pattern is DataMapper.

Here is my approach:

  • I want to avoid runtime introspection (i.e. guessing attributes).
  • I don't want to use a CLI code generator to generate getters and setters (in practice I use the NetBeans one, via ALT+INSERT).
  • I want the model to be as close as possible to a POPO (plain old PHP object). I mean: private attributes and "hardcoded" getters and setters for each attribute.

I have an abstract class called AbstractModel that all the models inherit from. It has a public method called isDirty() and a private (it can be protected too, if needed) attribute called is_dirty. isDirty() must return true or false depending on whether the object's data has changed since it was loaded.

The issue is: is there a way to raise the internal flag "is_dirty" without coding $this->is_dirty = true in each setter? I mean: I want the setters to stay as $this->attr = $value most of the time, unless a code change is needed for business logic.

Another limitation is that I cannot rely on __set, because the attributes already exist as private on the concrete model class, so __set is never called from the setters.

Any ideas? Code examples from other ORMs are welcome.

One of my ideas was to modify the NetBeans setters template, but I think there should be a way of doing this without relying on the IDE.

Another thought I had was to create the setters and then rename the private attributes with an underscore prefix or something similar. That way each setter would end up calling __set, which could hold the code that deals with the "is_dirty" flag, but this breaks the POPO concept a little, and it's ugly.
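For reference, the underscore idea could be sketched roughly as below. This is a minimal sketch under assumptions, not code from any existing ORM: the class User, the attribute $_name and the exception are hypothetical. The renamed attributes are declared protected rather than private, because __set() here runs in the scope of the parent class and could not reach a private attribute of the child.

```php
<?php
abstract class AbstractModel
{
    private $is_dirty = false;

    public function isDirty()
    {
        return $this->is_dirty;
    }

    // Called whenever code writes to a property that is undeclared
    // or inaccessible, e.g. $this->name when only $_name exists.
    public function __set($name, $value)
    {
        $field = '_' . $name;
        if (!property_exists($this, $field)) {
            throw new InvalidArgumentException("Unknown attribute: $name");
        }
        if ($this->$field !== $value) {
            $this->is_dirty = true;
        }
        $this->$field = $value;
    }
}

// Hypothetical concrete model
class User extends AbstractModel
{
    protected $_name; // renamed attribute; must not be private

    public function setName($value)
    {
        $this->name = $value; // "name" is not declared, so __set() runs
    }

    public function getName()
    {
        return $this->_name;
    }
}
```

One side effect to be aware of: with this layout, $user->name = 'x' from outside the class also goes through __set(), so the attributes become writable without their setters.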

Answer 1:

Attention!
My opinion on the subject has somewhat changed in the past month. While the answer here is still valid, when dealing with large object graphs I would recommend using the Unit-of-Work pattern instead. You can find a brief explanation of it in this answer.

I'm a bit confused about how what-you-call-Model is related to the ORM, especially since in MVC the Model is a layer (at least, that's how I understand it, and your "Models" look to me more like Domain Objects).

I will assume that what you have is code that looks like this:

  $model = new SomeModel;
  $mapper = $ormFactory->build('something');

  $model->setId( 1337 );
  $mapper->pull( $model );

  $model->setPayload('cogito ergo sum');

  $mapper->push( $model );

And I will assume that what-you-call-Model has two methods designed to be used by data mappers: getParameters() and setParameters(). I also assume that you call isDirty() before the mapper stores what-you-call-Model's state, and cleanState() when the mapper pulls data into what-you-call-Model.

BTW, if you have a better suggestion than setParameters() and getParameters() for moving values between domain objects and data mappers, please share, because I have been struggling to come up with something better. To me this looks like an encapsulation leak.

This would make the data mapper methods look like:

  public function pull( Parametrized $object )
  {
      if ( !$object->isDirty() )
      {
          // there were NO conditions set on clean object
          // or the values have not changed since last pull
          return false; // or maybe throw exception
      }

      $data = array(); // placeholder: read the information from storage here

      $object->setParameters( $data );
      $object->cleanState();

      return true; // or leave out, if the alternative is an exception
  }

  public function push( Parametrized $object )
  {
      if ( !$object->isDirty() )
      {
          // there is nothing to save, go away
          return false; // or maybe throw exception
      }

      $data = $object->getParameters();
      // save the values from $data in storage here
      $object->cleanState();

      return true; // or leave out, if the alternative is an exception
  }

In the code snippet, Parametrized is the name of an interface which the object should implement; in this case it declares the methods getParameters() and setParameters(). It has such a strange name because in OOP the word implements means has-abilities-of, while extends means is-a.

Up to this point you should already have something similar.
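To make the interface concrete, here is a minimal sketch of what it might look like. Only the two method names come from the text above; the signatures, the Document class and its attributes are assumptions made for illustration.

```php
<?php
// Hypothetical shape of the interface; only the method names are given above.
interface Parametrized
{
    public function getParameters();
    public function setParameters(array $data);
}

// Hypothetical domain object that a data mapper could work with
class Document implements Parametrized
{
    private $id;
    private $payload;

    public function setId($id) { $this->id = $id; }
    public function setPayload($payload) { $this->payload = $payload; }

    // Export the state for the mapper
    public function getParameters()
    {
        return array('id' => $this->id, 'payload' => $this->payload);
    }

    // Import values from the mapper, ignoring keys the object does not know
    public function setParameters(array $data)
    {
        foreach (array('id', 'payload') as $key) {
            if (array_key_exists($key, $data)) {
                $this->$key = $data[$key];
            }
        }
    }
}
```

The whitelist in setParameters() is one way to keep the mapper from writing arbitrary properties, which limits the encapsulation leak a little.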


Now here is what the isDirty() and cleanState() methods should do:

  public function cleanState()
  {
      $this->is_dirty = false;
      $temp = get_object_vars($this);
      unset( $temp['variableChecksum'] );
      // checksum should not be part of itself
      $this->variableChecksum = md5( serialize( $temp ) );
  }

  public function isDirty()
  {
      if ( $this->is_dirty === true )
      {
          return true;
      }

      $temp = get_object_vars($this);
      unset( $temp['variableChecksum'] );
      // checksum should not be part of itself

      // compare without overwriting the stored checksum, so that
      // repeated calls keep returning the same answer
      return $this->variableChecksum !== md5( serialize( $temp ) );
  }
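One caveat with the checksum approach: get_object_vars() only returns the properties accessible from the calling scope, so when it is called inside an abstract parent class, private attributes declared in a concrete model are skipped. The sketch below is one possible way around that, using an (array) cast, which includes private and protected properties under mangled keys. The class Article and the helper computeChecksum() are hypothetical names, and the is_dirty flag is left out to keep the example short.

```php
<?php
abstract class AbstractModel
{
    private $variableChecksum = null;

    private function computeChecksum()
    {
        // The cast exposes private properties of subclasses too;
        // private keys look like "\0ClassName\0propertyName".
        $state = (array) $this;
        // checksum should not be part of itself
        unset($state["\0" . __CLASS__ . "\0variableChecksum"]);
        return md5(serialize($state));
    }

    public function cleanState()
    {
        $this->variableChecksum = $this->computeChecksum();
    }

    public function isDirty()
    {
        // A model that was never cleaned has a null checksum and counts as dirty
        return $this->variableChecksum !== $this->computeChecksum();
    }
}

// Hypothetical POPO-style model: private attribute, plain setter
class Article extends AbstractModel
{
    private $title;

    public function setTitle($title) { $this->title = $title; }
    public function getTitle() { return $this->title; }
}

$article = new Article();
$article->setTitle('draft');
$article->cleanState();          // what the mapper does after a pull
var_dump($article->isDirty());   // bool(false)
$article->setTitle('final');
var_dump($article->isDirty());   // bool(true)
```

The setters stay as plain assignments, at the price of serializing the whole object on every isDirty() call, and attributes that cannot be serialized (closures, for example) would make the check fail.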


Answer 2:

I would make a proxy for setting attributes, for example:

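class BaseModel {

   protected $is_dirty = false;

   // Concrete models must declare their attributes as protected,
   // otherwise these parent methods cannot read or write them.
   protected function _get($attr) {
      return $this->$attr;
   }

   protected function _set($attr, $value) {
      $current = $this->_get($attr);
      if($value !== $current) {
         $this->is_dirty = true;
      }

      $this->$attr = $value;
   }
}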

Then each child class would implement its setters by calling _set() and never set the property directly. Further, you can always inject more class-specific code into each subclass's _set() and just call parent::_set($attr, $processedValue) if needed. If you also want to use magic methods, make them proxy to the property methods, which in turn proxy to _set(). I suppose this isn't very POPO, though.
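A child class built this way might look like the sketch below. The names Customer, $email and the lowercase normalisation are made up for illustration, and BaseModel is repeated in simplified form so that the example runs on its own.

```php
<?php
class BaseModel
{
    protected $is_dirty = false;

    public function isDirty()
    {
        return $this->is_dirty;
    }

    protected function _set($attr, $value)
    {
        if ($value !== $this->$attr) {
            $this->is_dirty = true;
        }
        $this->$attr = $value;
    }
}

// Hypothetical concrete model
class Customer extends BaseModel
{
    protected $email; // protected so that BaseModel::_set() can reach it

    public function setEmail($email)
    {
        $this->_set('email', $email);
    }

    public function getEmail()
    {
        return $this->email;
    }

    // Class-specific processing first, then delegate to the parent
    protected function _set($attr, $value)
    {
        if ($attr === 'email') {
            $value = strtolower(trim($value));
        }
        parent::_set($attr, $value);
    }
}
```

Each setter still needs one line of code, but that line no longer mentions the dirty flag, so the flag logic lives in a single place.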



Answer 3:

This post is old, but how about using events to notify listeners when the object becomes dirty? I would approach the solution with events.
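As a rough illustration of the event idea, the sketch below lets a model notify registered listeners the first time it becomes dirty. All names (ObservableModel, onDirty(), markDirty(), Note) are hypothetical, and note that each setter still has to call markDirty(), so this does not remove the per-setter code the question wants to avoid.

```php
<?php
abstract class ObservableModel
{
    private $is_dirty = false;
    private $listeners = array();

    // Register a callable that receives the model once it becomes dirty
    public function onDirty($listener)
    {
        $this->listeners[] = $listener;
    }

    public function isDirty()
    {
        return $this->is_dirty;
    }

    protected function markDirty()
    {
        if ($this->is_dirty) {
            return; // notify only on the clean -> dirty transition
        }
        $this->is_dirty = true;
        foreach ($this->listeners as $listener) {
            call_user_func($listener, $this);
        }
    }
}

// Hypothetical concrete model
class Note extends ObservableModel
{
    private $text;

    public function setText($text)
    {
        if ($text !== $this->text) {
            $this->text = $text;
            $this->markDirty();
        }
    }
}

// Usage: collect dirty models so that they can be saved later
$pending = array();
$note = new Note();
$note->onDirty(function ($model) use (&$pending) {
    $pending[] = $model;
});
$note->setText('hello');
$note->setText('world'); // already dirty, listeners are not called again
```

A listener like this is close to what a Unit-of-Work would do: it keeps a list of changed objects to flush in one go.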