Entity Framework AsNoTracking breaks call to Disti

Posted 2019-04-06 12:46

I am trying to load a list of distinct colors from previously loaded list of products on a page. So to pull in the products I do this:

var products = Products
    .Include(p => p.ProductColor)
    .ToList();

Then I do some processing on the products, and then I want to get a list of all the distinct colors used by the products, so I do this:

var colors = products   
    .Select(p => p.ProductColor)
    .Distinct();

This works great. However, if I add a call to .AsNoTracking() to the original products query, I now get an entry in my color list for each entry in the product list.

Why is there a difference between these two? Is there a way to keep Entity Framework from tracking the objects (they are only being read) and still get the desired behavior?

Here is my query after adding the call to AsNoTracking()

var products = Products
    .AsNoTracking()
    .Include(p => p.ProductColor)
    .ToList();

1 Answer
神经病院院长
#2 · 2019-04-06 13:15

AsNoTracking "breaks" Distinct because AsNoTracking "breaks" identity mapping. Entities loaded with AsNoTracking() are not attached to the context cache, so EF materializes a new entity for every row returned from the query. When tracking is enabled, EF instead checks whether an entity with the same key value already exists in the context; if it does, EF does not create a new object and uses the attached object instance instead.

For example, if you have 2 products and both are Green:

  • Without AsNoTracking() your query will materialize 3 objects: 2 Product objects and 1 ProductColor object (Green). Product 1 has a reference to Green (in ProductColor property) and Product 2 has a reference to the same object instance Green, i.e.

    object.ReferenceEquals(product1.ProductColor, product2.ProductColor) == true
    
  • With AsNoTracking() your query will materialize 4 objects: 2 Product objects and 2 ProductColor objects (both represent Green and have the same key value). Product 1 has a reference to Green (in ProductColor property) and Product 2 has a reference to Green, but this is another object instance, i.e.

    object.ReferenceEquals(product1.ProductColor, product2.ProductColor) == false
    

Now, if you call Distinct() on a collection in memory (LINQ-to-Objects), the parameterless overload uses the default equality comparer, which for a class that does not override Equals and GetHashCode compares object references. So in case 1 you get only 1 Green object, but in case 2 you get 2 Green objects.
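The two cases can be reproduced without a database. The following is only a minimal sketch, not how EF actually materializes entities: the Materialize helper, the tuple row shape, and the simplified Product and ProductColor classes are assumptions made for illustration. With useIdentityMap set to true it reuses one color instance per key, roughly as a tracking context would; with false it creates a new instance for every row, as happens with AsNoTracking().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the entities in the question (assumed shapes).
public class ProductColor
{
    public int ProductColorId { get; set; }
    public string Name { get; set; }
}

public class Product
{
    public int ProductId { get; set; }
    public ProductColor ProductColor { get; set; }
}

public static class IdentityMapDemo
{
    // Hypothetical helper that imitates materialization; it is not an EF API.
    public static List<Product> Materialize(
        IEnumerable<(int ProductId, int ColorId, string ColorName)> rows,
        bool useIdentityMap)
    {
        var colorsByKey = new Dictionary<int, ProductColor>();
        var products = new List<Product>();

        foreach (var row in rows)
        {
            ProductColor color = null;
            if (useIdentityMap)
            {
                // Look for an instance already created for this key.
                colorsByKey.TryGetValue(row.ColorId, out color);
            }
            if (color == null)
            {
                // Nothing cached (or caching is off): create a new object.
                color = new ProductColor { ProductColorId = row.ColorId, Name = row.ColorName };
                if (useIdentityMap) colorsByKey[row.ColorId] = color;
            }
            products.Add(new Product { ProductId = row.ProductId, ProductColor = color });
        }
        return products;
    }

    public static int CountDistinctColors(bool useIdentityMap)
    {
        // Two products, both Green (color key 10).
        var rows = new[] { (1, 10, "Green"), (2, 10, "Green") };
        return Materialize(rows, useIdentityMap)
            .Select(p => p.ProductColor)
            .Distinct() // default comparer: reference equality for this class
            .Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctColors(useIdentityMap: true));  // 1
        Console.WriteLine(CountDistinctColors(useIdentityMap: false)); // 2
    }
}
```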

To get the desired result after you have run the query with AsNoTracking(), you need a comparison by the entity key. One option is the second overload of Distinct, which takes an IEqualityComparer<T> as a parameter. An example of its implementation is here; you would use the key property of ProductColor to compare two objects.
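A key-based comparer could look roughly like the sketch below. The class name ProductColorKeyComparer is made up for this example, and ProductColor is reduced to an assumed minimal shape with ProductColorId as its key; adapt both to your real entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal shape of the entity; the real class will have more members.
public class ProductColor
{
    public int ProductColorId { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: two colors are equal when their keys are equal.
public sealed class ProductColorKeyComparer : IEqualityComparer<ProductColor>
{
    public bool Equals(ProductColor x, ProductColor y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.ProductColorId == y.ProductColorId;
    }

    // Must agree with Equals: objects with equal keys get equal hash codes.
    public int GetHashCode(ProductColor obj)
    {
        return obj is null ? 0 : obj.ProductColorId.GetHashCode();
    }
}

public static class ComparerDemo
{
    public static int CountDistinctByKey()
    {
        // Two separate Green instances (same key) plus one Red.
        var colors = new List<ProductColor>
        {
            new ProductColor { ProductColorId = 10, Name = "Green" },
            new ProductColor { ProductColorId = 10, Name = "Green" },
            new ProductColor { ProductColorId = 20, Name = "Red" }
        };
        return colors.Distinct(new ProductColorKeyComparer()).Count();
    }

    public static void Main()
    {
        Console.WriteLine(ComparerDemo.CountDistinctByKey()); // 2
    }
}
```

In the question's code this would be used as products.Select(p => p.ProductColor).Distinct(new ProductColorKeyComparer()).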

Or - which seems easier to me than the tedious IEqualityComparer implementation - you can rewrite the Distinct() using a GroupBy (with the ProductColor key property as the grouping key):

var colors = products   
    .Select(p => p.ProductColor)
    .GroupBy(pc => pc.ProductColorId)
    .Select(g => g.First());

The First() basically means that you throw away all duplicates and keep just the first object instance per key value.
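To see the GroupBy version work on data shaped like a no-tracking result, here is a small in-memory sketch. The entity classes are assumed minimal shapes and DistinctColors is a hypothetical helper name; the sample data has two separate Green instances with the same key, plus one Red.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal entity shapes for illustration.
public class ProductColor
{
    public int ProductColorId { get; set; }
    public string Name { get; set; }
}

public class Product
{
    public int ProductId { get; set; }
    public ProductColor ProductColor { get; set; }
}

public static class GroupByDemo
{
    // Hypothetical helper: one color per key, keeping the first instance seen.
    public static List<ProductColor> DistinctColors(IEnumerable<Product> products)
    {
        return products
            .Select(p => p.ProductColor)
            .GroupBy(pc => pc.ProductColorId)
            .Select(g => g.First())
            .ToList();
    }

    public static List<Product> SampleProducts()
    {
        // Each product gets its own color object, as after AsNoTracking().
        return new List<Product>
        {
            new Product { ProductId = 1, ProductColor = new ProductColor { ProductColorId = 10, Name = "Green" } },
            new Product { ProductId = 2, ProductColor = new ProductColor { ProductColorId = 10, Name = "Green" } },
            new Product { ProductId = 3, ProductColor = new ProductColor { ProductColorId = 20, Name = "Red" } }
        };
    }

    public static void Main()
    {
        foreach (var color in DistinctColors(SampleProducts()))
        {
            Console.WriteLine(color.Name); // Green, then Red
        }
    }
}
```

On .NET 6 or later, Enumerable.DistinctBy(pc => pc.ProductColorId) expresses the same idea in a single call; it is not available in older framework versions.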
