I am trying to load a list of distinct colors from a previously loaded list of products on a page. So to pull in the products I do this:
var products = Products
    .Include(p => p.ProductColor)
    .ToList();
Then I do some processing on the products, and then I want to get a list of all of the distinct colors used by the products, so I do this:
var colors = products
    .Select(p => p.ProductColor)
    .Distinct();
And this works great, however if I add a call to .AsNoTracking() to the original products call, I now get an entry in my color list for each entry in the product list.
Why is there a difference in these two? Is there a way to keep Entity Framework from tracking the objects (they're being used for read only) and to get the desired behavior?
Here is my query after adding the call to AsNoTracking():
var products = Products
    .AsNoTracking()
    .Include(p => p.ProductColor)
    .ToList();
AsNoTracking "breaks" Distinct because AsNoTracking "breaks" identity mapping. Since entities loaded with AsNoTracking() won't get attached to the context cache, EF materializes new entities for every row returned from the query, whereas when tracking is enabled it would check if an entity with the same key value already exists in the context and, if so, it wouldn't create a new object and would just use the attached object instance instead.
For example, if you have 2 products and both are Green:
Without AsNoTracking() your query will materialize 3 objects: 2 Product objects and 1 ProductColor object (Green). Product 1 has a reference to Green (in its ProductColor property) and Product 2 has a reference to the same object instance Green, i.e. both properties point to one object.
With AsNoTracking() your query will materialize 4 objects: 2 product objects and 2 color objects (both represent Green and have the same key value). Product 1 has a reference to Green (in its ProductColor property) and Product 2 has a reference to Green but this is another object instance, i.e. the two properties point to different objects.
Now, if you call Distinct() on a collection in memory (LINQ-to-Objects), the default comparison for Distinct() without parameter is comparing object reference identities (unless the class overrides Equals and GetHashCode). So, in case 1 (without AsNoTracking()) you get only 1 Green object, but in case 2 (with AsNoTracking()) you'll get 2 Green objects.
To get the desired result after you have run the query with AsNoTracking() you need a comparison by the entity key. You can use the second overload of Distinct which takes an IEqualityComparer as parameter; its implementation would use the key property of ProductColor to compare two objects.
Or, which seems easier to me than the tedious IEqualityComparer implementation, you rewrite the Distinct() using a GroupBy (with the ProductColor key property as the grouping key) and then select First() from each group. The First() basically means that you are throwing all duplicates away and just keep the first object instance per key value.
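To make both options concrete, here is a minimal sketch that runs without a database. It assumes the key property of ProductColor is called ProductColorId (the question doesn't show the real name), and the classes, the ProductColorKeyComparer and the Demo helpers are hypothetical stand-ins. The list built in UntrackedProducts() only imitates what a no-tracking query hands back (one Green instance per product); it is not EF itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities; ProductColorId is an assumed key name.
public class ProductColor
{
    public int ProductColorId { get; set; }
    public string Name { get; set; }
}

public class Product
{
    public int ProductId { get; set; }
    public ProductColor ProductColor { get; set; }
}

// Option 1: compare colors by key value instead of by object reference.
public class ProductColorKeyComparer : IEqualityComparer<ProductColor>
{
    public bool Equals(ProductColor x, ProductColor y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.ProductColorId == y.ProductColorId;
    }

    public int GetHashCode(ProductColor obj) => obj.ProductColorId.GetHashCode();
}

public static class Demo
{
    // Imitates a no-tracking result: two products, each with its own Green instance.
    public static List<Product> UntrackedProducts() => new List<Product>
    {
        new Product { ProductId = 1, ProductColor = new ProductColor { ProductColorId = 7, Name = "Green" } },
        new Product { ProductId = 2, ProductColor = new ProductColor { ProductColorId = 7, Name = "Green" } },
    };

    // Plain Distinct(): falls back to reference equality for these classes.
    public static int CountByReference(List<Product> products) =>
        products.Select(p => p.ProductColor).Distinct().Count();

    // Option 1: Distinct with the key-based comparer.
    public static int CountWithComparer(List<Product> products) =>
        products.Select(p => p.ProductColor).Distinct(new ProductColorKeyComparer()).Count();

    // Option 2: GroupBy on the key, keeping the first instance per key value.
    public static int CountWithGroupBy(List<Product> products) =>
        products.Select(p => p.ProductColor)
                .GroupBy(pc => pc.ProductColorId)
                .Select(g => g.First())
                .Count();

    public static void Main()
    {
        var products = UntrackedProducts();
        Console.WriteLine(CountByReference(products));  // 2: two distinct Green instances
        Console.WriteLine(CountWithComparer(products)); // 1
        Console.WriteLine(CountWithGroupBy(products));  // 1
    }
}
```

In the real query you would keep the projection as a list of ProductColor objects rather than a count, e.g. by ending the GroupBy variant with .ToList() instead of .Count().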