Implement IEquatable for POCO

Posted 2019-04-07 23:54

Question:

I noticed that EF's DbSet.Add() is quite slow. A little googling turned up a SO answer that promises up to 180x performance gains:

https://stackoverflow.com/a/7052504/141172

However, I do not understand exactly how to implement IEquatable<T> as suggested in the answer.

According to MSDN, if I implement IEquatable<T>, I should also override Equals() and GetHashCode().

As with many POCOs, my objects are mutable. Before being committed to the database (SaveChanges()), new objects have an Id of 0. After the objects have been saved, the Id serves as an ideal basis for implementing IEquatable, Equals() and GetHashCode().

It is unwise to include any mutable property in a hash code. According to MSDN:

If two objects compare as equal, the GetHashCode method for each object must return the same value

Should I implement IEquatable<T> as a property-by-property comparison (e.g. this.FirstName == other.FirstName) and not override Equals() and GetHashCode()?

Given that my POCOs are used in an EntityFramework context, should any special attention be paid to the Id field?

Answer 1:

I came across your question while searching for a solution to the same problem. Here is a solution that I am trying out; see if it meets your needs:

First, all my POCOs derive from this abstract class:

public abstract class BasePOCO<T> : IEquatable<T> where T : class
{
    private readonly Guid _guid = Guid.NewGuid();

    #region IEquatable<T> Members

    public abstract bool Equals(T other);

    #endregion

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj))
        {
            return false;
        }
        if (ReferenceEquals(this, obj))
        {
            return true;
        }
        if (obj.GetType() != typeof (T))
        {
            return false;
        }
        return Equals((T)obj);
    }

    public override int GetHashCode()
    {
        return _guid.GetHashCode();
    }
}

I created a readonly Guid field that I use in the GetHashCode() override. This ensures that if I put a derived POCO into a Dictionary or anything else that uses the hash, I would not orphan it if I called .SaveChanges() in the interim and the ID field was updated. This is the one part I'm not sure is completely correct, or whether it is any better than just base.GetHashCode().

I made the Equals(T other) method abstract so that implementing classes have to implement it in some meaningful way, most likely with the ID field. I put the Equals(object obj) override in the base class because it would probably be the same for all derived classes too.

This would be an implementation of the abstract class:

public class Species : BasePOCO<Species>
{
    public int ID { get; set; }
    public string LegacyCode { get; set; }
    public string Name { get; set; }

    public override bool Equals(Species other)
    {
        if (ReferenceEquals(null, other))
        {
            return false;
        }
        if (ReferenceEquals(this, other))
        {
            return true;
        }
        return ID != 0 && 
               ID == other.ID && 
               LegacyCode == other.LegacyCode &&
               Name == other.Name;
    }
}

The ID property is set as the primary key in the database and EF knows that. ID is 0 on a newly created object, then gets set to a unique positive integer on .SaveChanges(). So in the overridden Equals(Species other) method, null objects are obviously not equal and identical references obviously are; after that we only need to check whether ID == 0. If it is, we say that two objects of the same type that both have IDs of 0 are not equal. Otherwise, we say they are equal if all their properties are the same.

I think this covers all the relevant situations, but please chime in if I am incorrect. Hope this helps.

=== Edit 1

I was thinking my GetHashCode() wasn't right, and I looked at this https://stackoverflow.com/a/371348/213169 answer regarding the subject. The implementation above would violate the constraint that objects returning Equals() == true must have the same hashcode.
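To make the violation concrete, here is a minimal sketch. The names GuidHashed and GuidHashDemo are hypothetical and only stand in for a POCO built on the first BasePOCO<T>: equality is decided by ID, while the hash comes from a per-instance Guid. Two instances then compare equal but hash differently, so a hash-based collection will almost never find one when given the other.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a POCO that derives from the first BasePOCO<T>:
// equality is by ID, but the hash code comes from a per-instance Guid.
public sealed class GuidHashed : IEquatable<GuidHashed>
{
    private readonly Guid _guid = Guid.NewGuid();

    public int ID { get; set; }

    public bool Equals(GuidHashed other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return ID != 0 && ID == other.ID;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as GuidHashed);
    }

    public override int GetHashCode()
    {
        return _guid.GetHashCode(); // differs for every instance
    }
}

public static class GuidHashDemo
{
    public static void Main()
    {
        var a = new GuidHashed { ID = 5 };
        var b = new GuidHashed { ID = 5 };

        Console.WriteLine(a.Equals(b)); // True

        var set = new HashSet<GuidHashed> { a };
        // Almost certainly False: b's hash code differs from a's,
        // so the set never gets as far as calling Equals.
        Console.WriteLine(set.Contains(b));
    }
}
```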

Here is my second stab at it:

public abstract class BasePOCO<T> : IEquatable<T> where T : class
{
    #region IEquatable<T> Members

    public abstract bool Equals(T other);

    #endregion

    public abstract override bool Equals(object obj);
    public abstract override int GetHashCode();
}

And the implementation:

public class Species : BasePOCO<Species>
{
    public int ID { get; set; }
    public string LegacyCode { get; set; }
    public string Name { get; set; }

    public override bool Equals(Species other)
    {
        if (ReferenceEquals(null, other))
        {
            return false;
        }
        if (ReferenceEquals(this, other))
        {
            return true;
        }
        return ID != 0 &&
               ID == other.ID &&
               LegacyCode == other.LegacyCode &&
               Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj))
        {
            return false;
        }
        if (ReferenceEquals(this, obj))
        {
            return true;
        }
        return Equals(obj as Species);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return ((LegacyCode != null ? LegacyCode.GetHashCode() : 0) * 397) ^ 
                   (Name != null ? Name.GetHashCode() : 0);
        }
    }

    public static bool operator ==(Species left, Species right)
    {
        return Equals(left, right);
    }

    public static bool operator !=(Species left, Species right)
    {
        return !Equals(left, right);
    }
}

So I got rid of the Guid in the base class and moved GetHashCode to the implementation. I used ReSharper's implementation of GetHashCode with all the properties except ID, since ID could change (don't want orphans). This meets the constraint on equality in the linked answer above.
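The "orphan" concern can be sketched as follows. IdHashed and OrphanDemo are hypothetical names, and setting ID by hand only simulates the key assignment that happens on save. The hash code is derived from ID, so once ID changes the set looks in the wrong place and no longer finds the item.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical POCO whose hash code is derived from a mutable ID.
public sealed class IdHashed
{
    public int ID { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as IdHashed;
        return other != null && ID == other.ID;
    }

    public override int GetHashCode()
    {
        return ID; // changes whenever ID changes
    }
}

public static class OrphanDemo
{
    // Returns whether the set can still find the item after its ID changes.
    public static bool FoundAfterIdChange()
    {
        var item = new IdHashed();                 // ID is 0, as for an unsaved entity
        var set = new HashSet<IdHashed> { item };  // stored under hash code 0
        item.ID = 7;                               // simulates the key assigned on save
        return set.Contains(item);                 // looked up under hash code 7
    }

    public static void Main()
    {
        Console.WriteLine(FoundAfterIdChange()); // False
    }
}
```

The same effect would apply to any other property used in GetHashCode, such as Name or LegacyCode, if it is changed while the object sits in a hash-based collection.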



Answer 2:

As with many POCOs, my objects are mutable

But they should NOT be mutable on the fields that make up the primary key. That holds by definition; otherwise you are in a world of pain database-wise later anyway.

Generate the hash code ONLY from the fields of the primary key.

Equals() must return true IFF the participating objects have the same hash code

BZZZ - Error.

Hash codes can be duplicated. It is possible for two objects to have different values and the same hash code. A hash code is an int (32 bits). A string can be 2 GB long. You cannot map every possible string to a separate hash code.

If two objects have the same hash code, they may be different. If two objects are the same, they can NOT have different hash codes.
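A small sketch of that one-way rule, using a hypothetical CollidingKey type whose hash code is deliberately constant: every instance collides with every other, yet a Dictionary still keeps different keys apart because it falls back on Equals.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: every instance returns the same hash code,
// so all keys collide and only Equals tells them apart.
public sealed class CollidingKey : IEquatable<CollidingKey>
{
    public string Value { get; private set; }

    public CollidingKey(string value)
    {
        Value = value;
    }

    public bool Equals(CollidingKey other)
    {
        return other != null && Value == other.Value;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CollidingKey);
    }

    public override int GetHashCode()
    {
        return 1; // legal (equal objects still share a hash), just slow
    }
}

public static class CollisionDemo
{
    public static void Main()
    {
        var map = new Dictionary<CollidingKey, int>();
        map[new CollidingKey("first")] = 1;
        map[new CollidingKey("second")] = 2;

        Console.WriteLine(map.Count);                       // 2
        Console.WriteLine(map[new CollidingKey("second")]); // 2
    }
}
```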

Where do you get the idea that Equals must return true for objects with the same hashcode?

Also, POCO or not, an object mapped to a database and used in a relation MUST have a stable primary key (which can be used to run the hash code calculation). An object that does not have one STILL should have a primary key (per SQL Server requirements); using a sequence / artificial primary key works here. Again, use that to run the hash code calculation.



Answer 3:

First things first: sorry for my lame English :)

As TomTom says, they shouldn't be mutable just because they have not yet received a PK/Id...

In our EF Code First system, we use a generated negative id (assigned in the base class constructor or, if you use ProxyTracking, in the ObjectMaterialized event) for every new POCO. It's a pretty simple idea:

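public static class IdKeeper
{
  // Temporary ids count upwards from int.MinValue.
  private static int m_Current = int.MinValue;

  // Must be public (and declare a return type) so IdBase can call it.
  // Note: ++ is not thread-safe.
  public static int Next()
  {
    return ++m_Current;
  }
}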

Starting at MinValue and incrementing is important, because EF sorts POCOs by their PK before committing changes to the db. If you use "-1, -2, -3", the POCOs are saved in reverse order, which in some cases (depending on what you sort by) may not be ideal.
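A quick sketch of the ordering argument, taking the answer's statement that EF orders pending entities by key as given. OrderingDemo is a hypothetical helper that only sorts integers; it does not touch EF. Ids that count up from int.MinValue keep their creation order when sorted ascending, while -1, -2, -3 come out reversed.

```csharp
using System;
using System.Linq;

public static class OrderingDemo
{
    // Temporary ids handed out by counting up from int.MinValue,
    // listed in creation order and then sorted ascending.
    public static int[] SortedCountingUp()
    {
        var ids = new[] { int.MinValue + 1, int.MinValue + 2, int.MinValue + 3 };
        return ids.OrderBy(id => id).ToArray();
    }

    // Temporary ids handed out as -1, -2, -3,
    // listed in creation order and then sorted ascending.
    public static int[] SortedCountingDown()
    {
        var ids = new[] { -1, -2, -3 };
        return ids.OrderBy(id => id).ToArray();
    }

    public static void Main()
    {
        // -2147483647, -2147483646, -2147483645 (same as creation order)
        Console.WriteLine(string.Join(", ", SortedCountingUp()));
        // -3, -2, -1 (reverse of creation order)
        Console.WriteLine(string.Join(", ", SortedCountingDown()));
    }
}
```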

public abstract class IdBase
{
  public virtual int Id { get; set; }
  protected IdBase()
  {
    Id = IdKeeper.Next();
  }
}

If a POCO is materialized from the DB, its Id will be overwritten with the actual PK; the same happens when you call SaveChanges(). As a bonus, every single "not yet saved" POCO Id will be unique (that should come in handy one day ;) )

Comparing two POCOs with IEquatable (the reason DbSet works so slowly) is then easy:

public class Person
  : IdBase, IEquatable<Person>
{
  public virtual string FirstName { get; set; }

  public bool Equals(Person other)
  {
    // Guard against null so the comparison cannot throw.
    return other != null && Id == other.Id;
  }
}
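Since the question notes that MSDN recommends overriding Equals(object) and GetHashCode() alongside IEquatable<T>, here is one possible way to round this off. It is only a sketch: TempIds and PersonSketch are hypothetical, simplified stand-ins for the IdKeeper, IdBase and Person classes above, and basing the hash code on Id is an assumption rather than something the answer spells out.

```csharp
using System;

// Hypothetical, simplified stand-in for IdKeeper.
public static class TempIds
{
    private static int _current = int.MinValue;

    public static int Next()
    {
        return ++_current;
    }
}

// Hypothetical stand-in for Person with IdBase folded in.
public class PersonSketch : IEquatable<PersonSketch>
{
    public int Id { get; set; }
    public string FirstName { get; set; }

    public PersonSketch()
    {
        Id = TempIds.Next(); // unique temporary id until a real key is assigned
    }

    public bool Equals(PersonSketch other)
    {
        return other != null && Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as PersonSketch);
    }

    public override int GetHashCode()
    {
        return Id; // equal objects share an Id, hence a hash code
    }
}

public static class PersonDemo
{
    public static void Main()
    {
        var a = new PersonSketch { FirstName = "Ann" };
        var b = new PersonSketch { FirstName = "Ann" };

        Console.WriteLine(a.Equals(b)); // False: different temporary ids

        b.Id = a.Id;
        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```

One thing to keep in mind with this sketch: the hash code changes when the temporary Id is replaced by the real key, so an object placed in a hash-based collection before saving may not be found there afterwards.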