Linq to entities extension method inner query (EF6

2019-06-18 16:13发布

问题:

Can someone explain to me why the EF Engine is failing in the following scenario?

It works fine with the following expression:

var data = context.Programs
    .Select(d => new MyDataDto
    {
        ProgramId = d.ProgramId,
        ProgramName = d.ProgramName,
        ClientId = d.ClientId,
        Protocols = d.Protocols.Where(p => p.UserProtocols.Any(u => u.UserId == userId))
                .Count(pr => pr.Programs.Any(pg => pg.ProgramId == d.ProgramId))
    })
    .ToList();

But if I encapsulate some into an extension method:

public static IQueryable<Protocol> ForUser(this IQueryable<Protocol> protocols, int userId)
{
    return protocols.Where(p => p.UserProtocols.Any(u => u.UserId == userId));
}

The resulting query:

var data = context.Programs
    .Select(d => new MyDataDto
    {
        ProgramId = d.ProgramId,
        ProgramName = d.ProgramName,
        ClientId = d.ClientId,
        Protocols = d.Protocols.ForUser(userId)
                .Count(pr => pr.Programs.Any(pg => pg.ProgramId == d.ProgramId))
    })
    .ToList();

Fails with the exception: LINQ to Entities does not recognize the method 'System.Linq.IQueryable1[DAL.Protocol] ForUser(System.Linq.IQueryable1[DAL.Protocol], Int32)' method, and this method cannot be translated into a store expression.

I would expect the EF Engine to build the entire expression tree, chaining the necessary expressions and then generate the SQL. Why doesn't it do that?

回答1:

This is happening because the call to ForUser() is being made inside of the expression tree that the C# compiler builds when it sees the lambda you pass into Select. Entity Framework tries to figure out how to convert that function into SQL, but it can't invoke the function for a few reasons (e.g. d.Protocols does not exist at the moment).

The simplest approach that works for a case like this is to have your helper return a criteria lambda expression, and then pass that into the .Where() method yourself:

public static Expression<Func<Protocol, true>> ProtocolIsForUser(int userId)
{
    return p => p.UserProtocols.Any(u => u.UserId == userId);
}

...

var protocolCriteria = Helpers.ProtocolIsForUser(userId);
var data = context.Programs
    .Select(d => new MyDataDto
    {
        ProgramId = d.ProgramId,
        ProgramName = d.ProgramName,
        ClientId = d.ClientId,
        Protocols = d.Protocols.Count(protocolCriteria)
    })
    .ToList();

More information

When you invoke a LINQ method outside of an expression tree (like you do with context.Programs.Select(...)), the Queryable.Select() extension method actually gets invoked, and its implementation returns an IQueryable<> that represents the extension method getting called on the original IQueryable<>. Here's the implementation of Select, for instance:

    public static IQueryable<TResult> Select<TSource,TResult>(this IQueryable<TSource> source, Expression<Func<TSource, TResult>> selector) {
        if (source == null)
            throw Error.ArgumentNull("source");
        if (selector == null)
            throw Error.ArgumentNull("selector");
        return source.Provider.CreateQuery<TResult>( 
            Expression.Call(
                null,
                GetMethodInfo(Queryable.Select, source, selector),
                new Expression[] { source.Expression, Expression.Quote(selector) }
                ));
    }

When the queryable's Provider has to generate actual data from the IQueryable<>, it analyzes the expression tree and tries to figure out how to interpret those method calls. Entity Framework has built-in knowledge of many LINQ-related functions like .Where() and .Select(), so it knows how to translate those method calls into SQL. However, it doesn't know what to do for methods that you write.

So why does this work?

var data = context.Programs.ForUser(userId);

The answer is that your ForUser method is not implemented like the Select method above: you are not adding an expression to the queryable to represent calling ForUser. Instead, you are returning the result of a .Where() call. From the IQueryable<>'s perspective, it's as if Where() was called directly, and the call to ForUser() never happened.

You can prove this by capturing the Expression property on the IQueryable<>:

Console.WriteLine(data.Expression.ToString());

... which will produce something like this:

Programs.Where(u => (u.UserId == value(Helpers<>c__DisplayClass1_0).userId))

There's no call to ForUser() anywhere in that expression.

On the other hand, if you include the ForUser() call inside of an expression tree like this:

var data = context.Programs.Select(d => d.Protocols.ForUser(id));

... then the .ForUser() method never actually gets invoked, so it never returns an IQueryable<> that knows the .Where() method got called. Instead, the expression tree for the queryable shows .ForUser() getting invoked. Outputting its expression tree would look something like this:

Programs.Select(d => d.Protocols.ForUser(value(Repository<>c__DisplayClass1_0).userId))

Entity Framework has no idea what ForUser() is supposed to do. As far as it's concerned, you could have written ForUser() to do something that's impossible to do in SQL. So it tells you that's not a supported method.



回答2:

As I mentioned in my comment above, I can't tell why the EF Engine is working the way it is. Therefore, I've tried to find a way to re-write the query so I'll be able to make use of my extension methods.

The tables are:

Program -> 1..m -> ProgramProtocol -> m..1 -> Protocol

ProgramProtocol is just a join table and is not mapped in the model by Entity Framework. The idea is simple: select "from left", select "from right" and then join the resulted sets for proper filtering:

var data = context.Programs.ForUser(userId)
    .SelectMany(pm => pm.Protocols,
        (pm, pt) => new {pm.ProgramId, pm.ProgramName, pm.ClientId, pt.ProtocolId})
    .Join(context.Protocols.ForUser(userId), pm => pm.ProtocolId,
        pt => pt.ProtocolId, (pm, pt) => pm)
    .GroupBy(pm => new {pm.ProgramId, pm.ProgramName, pm.ClientId})
    .Select(d => new MyDataDto
    {
        ProgramName = d.Key.ProgramName,
        ProgramId = d.Key.ProgramId,
        ClientId = d.Key.ClientId,
        Protocols = d.Count()
    })
    .ToList();