I noticed that EF's DbSet.Add() is quite slow. A little googling turned up a SO answer that promises up to 180x performance gains:
https://stackoverflow.com/a/7052504/141172
However, I do not understand exactly how to implement IEquatable<T> as suggested in the answer.
According to MSDN, if I implement IEquatable<T>, I should also override Equals() and GetHashCode().
As with many POCOs, my objects are mutable. Before being committed to the database (SaveChanges()), new objects have an Id of 0. After the objects have been saved, the Id serves as an ideal basis for implementing IEquatable, Equals() and GetHashCode().
It is unwise to include any mutable property in a hash code, and according to MSDN:
"If two objects compare as equal, the GetHashCode method for each object must return the same value."
Given that, should I implement IEquatable<T> as a property-by-property comparison (e.g. this.FirstName == other.FirstName) and not override Equals() and GetHashCode()?
Given that my POCOs are used in an Entity Framework context, should any special attention be paid to the Id field?
First things first: sorry for my lame English :)
As TomTom says, they shouldn't be mutable just because they have not yet received a PK/Id...
In our EF:CF system, we use a generated negative id (assigned in the base class ctor or, if you use ProxyTracking, in the ObjectMaterialized event) for every new POCO. It's a pretty simple idea:
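A minimal sketch of such an id generator, assigned in a base class constructor. The names `EntityBase`, `Customer` and `IsTransient` and the `int` key type are assumptions made for illustration, not taken from the system described here:

```csharp
using System.Threading;

// Hypothetical base class; the name and the int key type are assumptions.
public abstract class EntityBase
{
    // Shared counter. Interlocked.Increment returns the incremented value,
    // so the ids handed out are MinValue + 1, MinValue + 2, ... in ascending order.
    private static int _lastTemporaryId = int.MinValue;

    protected EntityBase()
    {
        // Temporary negative id; expected to be overwritten with the real
        // primary key when the entity is materialized or saved.
        Id = Interlocked.Increment(ref _lastTemporaryId);
    }

    public int Id { get; set; }

    // True while the entity still carries a generated placeholder id.
    public bool IsTransient => Id < 0;
}

// Hypothetical entity used only to show the base class in action.
public class Customer : EntityBase
{
    public string Name { get; set; }
}
```

Because the counter only moves upward from `int.MinValue`, objects created later get larger ids, so sorting by id keeps creation order.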
Starting at MinValue and incrementing is important, because EF sorts POCOs by their PK before committing changes to the db. If you use "-1, -2, -3", the POCOs are saved in reverse order, which in some cases (depending on the sort you rely on) may not be ideal.
If a POCO is materialized from the DB, its Id is overwritten with the actual PK, and the same happens when you call SaveChanges(). As a bonus, every single "not yet saved" POCO id is unique (that should come in handy one day ;) )
Comparing two POCOs with IEquatable (why does dbset work so slow) is then easy:
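A sketch of what an Id-based comparison could look like, assuming every object (saved or not) carries a unique Id as described above. The `Person` class and its properties are hypothetical:

```csharp
using System;

// Hypothetical entity; assumes Id is unique for saved and unsaved objects alike.
public class Person : IEquatable<Person>
{
    public int Id { get; set; }
    public string FirstName { get; set; }

    public bool Equals(Person other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        return Id == other.Id; // identity is decided by the key alone
    }

    public override bool Equals(object obj) => Equals(obj as Person);

    // Derived from Id only, so objects that compare equal share a hash code.
    public override int GetHashCode() => Id.GetHashCode();
}
```

One caveat with this sketch: the hash code changes when the temporary Id is replaced by the real key, so an object placed in a Dictionary or HashSet before saving would need to be re-added afterwards.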
I came across your question in search for a solution to the same question. Here is a solution that I am trying out, see if it meets your needs:
First, all my POCOs derive from this abstract class:
I created a readonly Guid field that I use in the GetHashCode() override. This ensures that if I put the derived POCO into a Dictionary or anything else that uses the hash, I would not orphan it if I called .SaveChanges() in the interim and the ID field was updated by the base class. This is the one part I'm not sure is completely correct, or whether it is any better than just base.GetHashCode(). I made the Equals(T other) method abstract so that the implementing classes have to implement it in some meaningful way, most likely with the ID field. I put the Equals(object obj) override in this base class because it would probably be the same for all the derived classes too.
This would be an implementation of the abstract class:
The ID property is set as the primary key in the database and EF knows that. ID is 0 on newly created objects, then gets set to a unique positive integer on .SaveChanges(). So in the overridden Equals(Species other) method, null objects are obviously not equal, same references obviously are, then we only need to check if the ID == 0. If it is, we will say that two objects of the same type that both have IDs of 0 are not equal. Otherwise, we will say they are equal if their properties are all the same.
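A compact sketch of the design described above, under stated assumptions: the base class name `EntityBase<T>`, the field name `_hashSeed` and the `Name` property on `Species` are hypothetical, and only the overall shape follows the description.

```csharp
using System;

// Hypothetical names throughout; only the structure follows the description.
public abstract class EntityBase<T> : IEquatable<T> where T : EntityBase<T>
{
    // Fixed at construction, so the hash never changes when ID is assigned.
    private readonly Guid _hashSeed = Guid.NewGuid();

    public int ID { get; set; }

    // Derived classes must decide what equality means for them.
    public abstract bool Equals(T other);

    // Same for every derived class: delegate to the typed overload.
    public override bool Equals(object obj) => Equals(obj as T);

    public override int GetHashCode() => _hashSeed.GetHashCode();
}

public class Species : EntityBase<Species>
{
    public string Name { get; set; }

    public override bool Equals(Species other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        // Unsaved objects (ID == 0) are never equal to another instance.
        if (ID == 0 || other.ID == 0) return false;
        return ID == other.ID && Name == other.Name;
    }
}
```

Note that in this sketch two distinct instances that compare equal will almost certainly have different hash codes, since each instance gets its own Guid.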
I think this covers all the relevant situations, but please chime in if I am incorrect. Hope this helps.
=== Edit 1
I was thinking my GetHashCode() wasn't right, and I looked at this https://stackoverflow.com/a/371348/213169 answer regarding the subject. The implementation above would violate the constraint that objects returning Equals() == true must have the same hashcode.
Here is my second stab at it:
And the implementation:
So I got rid of the Guid in the base class and moved GetHashCode to the implementation. I used ReSharper's implementation of GetHashCode with all the properties except ID, since ID could change (don't want orphans). This will meet the constraint on equality in the linked answer above.
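A sketch of this second attempt, again with hypothetical names (`EntityBase<T>`, and `Name` and `Genus` as the properties of `Species`). The hash combination uses the multiply-by-397 pattern that ReSharper commonly generates; treat it as an illustration rather than the exact generated code.

```csharp
using System;

// Hypothetical base class: no Guid, and GetHashCode is left to each entity.
public abstract class EntityBase<T> : IEquatable<T> where T : EntityBase<T>
{
    public int ID { get; set; }

    public abstract bool Equals(T other);

    public override bool Equals(object obj) => Equals(obj as T);

    // Force every entity to supply a hash that is consistent with its Equals.
    public abstract override int GetHashCode();
}

public class Species : EntityBase<Species>
{
    public string Name { get; set; }
    public string Genus { get; set; }

    public override bool Equals(Species other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        if (ID == 0 || other.ID == 0) return false; // unsaved objects are never equal
        return ID == other.ID && Name == other.Name && Genus == other.Genus;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            // Every property except ID, so saving does not change the hash.
            int hash = Name != null ? Name.GetHashCode() : 0;
            hash = (hash * 397) ^ (Genus != null ? Genus.GetHashCode() : 0);
            return hash;
        }
    }
}
```

Objects that compare equal here have identical `Name` and `Genus`, so they share a hash code. The hash still depends on mutable properties, so changing `Name` or `Genus` while the object sits in a hashed collection would orphan it.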
But they should NOT be mutable on the fields that are the primary key. By definition, or you are in a world of pain database-wise later anyway.
Generate the HashCode ONLY on the fields of the primary key.
BZZZ - Error.
Hash codes are not unique. It is possible for two objects to have different values and the same hash code. A hash code is an int (32 bits). A string can be 2 GB long. You cannot map every possible string to a separate hash code.
If two objects have the same hash code, they may be different. If two objects are the same, they can NOT have different hash codes.
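To make the one-way rule concrete, here is a small sketch with a deliberately coarse hash. The `Key` and `CollisionDemo` types are hypothetical and exist only for this illustration:

```csharp
using System.Collections.Generic;

// Hypothetical type with a deliberately coarse hash function.
public sealed class Key
{
    public Key(int id) { Id = id; }

    public int Id { get; }

    public override bool Equals(object obj) => obj is Key other && other.Id == Id;

    // Many different ids share a hash code; equal ids always share one.
    public override int GetHashCode() => Id % 4;
}

public static class CollisionDemo
{
    public static int CountDistinct()
    {
        // Key(1) and Key(5) collide on hash code but are not equal,
        // so both are stored; the second Key(1) is rejected as a duplicate.
        var set = new HashSet<Key> { new Key(1), new Key(5), new Key(1) };
        return set.Count;
    }
}
```

`CollisionDemo.CountDistinct()` returns 2: the collection uses the hash code only to narrow the search and then falls back to Equals, which is why colliding hash codes are allowed while unequal hash codes for equal objects are not.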
Where do you get the idea that Equals must return true for objects with the same hashcode?
Also, POCO or not, an object mapped to a database and used in a relation MUST have a stable primary key (which can be used to run the hashcode calculation). An object not having this STILL should have a primary key (per SQL Server requirements); using a sequence / artificial primary key works here. Again, use that to run the HashCode calculation.