SUM on navigation property as Linq to SQL

2019-07-29 06:21发布

I am trying to find a solution how to execute TotalConversions => Statistics.Sum(sum => sum.Conversions) on database server using Linq to SQL.

The problem with current code is that Statistics is ICollection(IEnumerable) and not IQueryable and SUM function fetch all records from database locally and only when SUM the results. This is not solution for us because Statistics contain thousands of records.

public class User : Entity
{
   public int Id { get; set; }
   public virtual ICollection<Statistic> Statistics { get; set; }
   [NotMapped]
   public int TotalConversions => Statistics.Sum(sum => sum.Conversions);
}

Another solution which we tried is to access DbContext inside Model and execute Linq to SQL query like this

[NotMapped]
public int TotalConversions
{
    get
    {
        if (_totalConversions == null)
        {
            var databaseContext = GetDbContextFromEntity(this);
            _totalConversions = databaseContext.Statistic.Where(s => s.UserId == Id).Sum(s => s.Conversions);
        }
        return (int)_totalConversions;
    }
}

It works fine but another problem comes here that such property can't be used inside linq query like this

_context.User.Where(w=>w.Id==1 && w.Id == 2).OrderBy(o=>o.TotalConversions)

How to perform SUM in calculated property inside Model which would be executed on database server and also could be used inside select queries. Is that possible at all with EF?

1条回答
Ridiculous、
2楼-- · 2019-07-29 07:13

The problem.

[NotMapped]
public int TotalConversions => Statistics.Sum(sum => sum.Conversions);

When you mark a property as [NotMapped], you're telling EF that this property should not exist on the database level. It will only be available in your code (i.e. when the data is in-memory, not when it is in-database).

Marking this property as [NotMapped] inherently precludes you from using a database operation (order by) using the TotalConversions property.

I see two solutions here. One is simple, the other less so.


Simple: Use the calculation in the OrderBy without a property.

You've already pretty much done this in your second example, but you can do it a bit more concisely, avoiding use of a custom property.

What you want to do is this:

_context.User.Where(w=>w.Id==1 && w.Id == 2).OrderBy(o => o.TotalConversions)

You can't do this, as I mentioned. What you can do is:

_context.User.Where(w=>w.Id==1 && w.Id == 2).OrderBy(o => o.Statistics.Sum(sum => sum.Conversions))

Sum() has an SQL equivalent, so Linq2SQL is able to convert it. Essentially, you have changed your request from "order the collection based on this .Net property" (which is impossible) to "order the collection based on this SQL-compatible evaluation".


However, I assume that you're trying to use a custom property in order to avoid having to copy/paste that same calculation all over your codebase. If that is your goal, I fully agree with your intentions, and that's a good enough reason to try the more complex alternative.

More complex: define your needed OrderBy parameter as a custom static User property.

You can parametrize the OrderBy parameter. First things first, The OrderBy method is a generic method with two generic types:

IOrderedQueryable<A> OrderBy<A,B>(Expression<Func<A,B>> expression) { }

Note: This signature is simplified for the sake of example, to show you the basic input and output of an OrderBy method.

A is your entity type, in your current case that is User.

B is the type of your sorting argument. You're sorting by an integer, so B is int.
Note: Almost always, B can be inferred by the compiler (based on the lambda method you use) and doesn't need to be declared explicitly. However, for this solution you will need to know its type.

This means that for your current case, we know that the OrderBy will take a parameter of type Expression<Func<User,int>>. You can confirm this by hovering over your Orderby, IntelliSense will tell you the types of the generic parameters as they are currently defined.

Instead of defining an int property like this:

[NotMapped]
public int TotalConversions => Statistics.Sum(sum => sum.Conversions);

You can define an expression property like this:

[NotMapped]
public static Expression<Func<User,int>> TotalConversionsLambda = (user => user.Statistics.Sum(sum => sum.Conversions));

Notice that I made it static, because this lambda is not tied to a specific User object, we want to access it without relying on an existing User.
This is a key difference. The int property basically says "this is how you get the total conversions for this User", whereas the expression property says "this is the method for retrieving the total conversions for any given User". Instead of defining the value, we are defining the method that will give us the needed value.

This means that you can use this predefined lambda when you call OrderBy:

_context.User.Where(w=>w.Id==1 && w.Id == 2).OrderBy(User.TotalConversionsLambda)

To the compiler (and EF), this is equivalent to the solution from the simpler method and will therefore work the same way. However, this has the added benefit of defining the lambda once (DRY) instead of copy/pasting it everywhere in the codebase.


The explanation.

Parametrizing expressions can be a bit bewildering. That was the case for me when I started using them, at least. So maybe a simpler explanation is in order.

Notice that we can swap a literal value:

DoSomething(5);

for a variable that contains the value:

int myValue = 5;
DoSomething(myValue);

This example uses an integer, but we can do this for any type (string, bool, ... should be obvious). This also works for reference types:

DoSomething(new User() { Name = "John Doe" });

Versus:

User john = new User() { Name = "John Doe" };
DoSomething(john);

An Expression<> is a bit convoluted (due to its complex generic typing and the lambda notation), but it works exactly the same:

DoSomething(foo => foo.BarValue);

Versus:

Expression<Func<Foo,Bar>> myValue = (foo => foo.BarValue);
DoSomething(myValue);

And that's all there is to it. The type is more complex, but the underlying principle is exactly the same.


Edit - Small addition

You can create an additional int property to use in code.

[NotMapped]
public static Expression<Func<User,int>> TotalConversionsLambda = (user => user.Statistics.Sum(sum => sum.Conversions));

[NotMapped]
public int TotalConversions
{
    get
    {
        return TotalConversionsLambda.Compile().Invoke(this);
    }
}

Honestly, I do think that the repeated compilation of the expression may end up hurting you more performance-wise, compared to simply defining a property that works independently of the expression property:

[NotMapped]
public static Expression<Func<User,int>> TotalConversionsLambda = (user => user.Statistics.Sum(sum => sum.Conversions));

[NotMapped]
public int TotalConversions
{
    get
    {
        return this.Statistics.Sum(sum => sum.Conversions);
    }
}

The choice is yours. You can use the first one if you want to (pedantically) adhere to DRY; but the performance cost of doing so may end up hurting you more than adhering to DRY will benefit you.

I don't have the exact numbers on the performance cost of compiling an expression, and I don't know where your priorities lie (performance over DRY? DRY over performance?), so I can't make the decision for you.

查看更多
登录 后发表回答