Why would I want to use an ExpressionVisitor?

2020-05-21 12:43发布

I know from the MSDN's article about How to: Modify Expression Trees what an ExpressionVisitor is supposed to do. It should modify expressions.

Their example is however pretty unrealistic so I was wondering why would I need it? Could you name some real-world cases where it would make sense to modify an expression tree? Or, why does it have to be modified at all? From what to what?

It has also many overloads for visiting all kinds of expressions. How do I know when I should use any of them and what should they return? I saw people using VisitParameter and returning base.VisitParameter(node) the other on the other hand were returning Expression.Parameter(..).

4条回答
beautiful°
2楼-- · 2020-05-21 12:50

Could you name some real-world cases where it would make sense to modify an expression tree?

Strictly speaking, we never modify an expression tree, as they are immutable (as seen from the outside, at least, there's no promise that it doesn't internally memoise values or otherwise have mutable private state). It's precisely because they are immutable and hence we can't just change a node that the visitor pattern makes a lot of sense if we want to create a new expression tree that is based on the one we have but different in some particular way (the closest thing we have to modifying an immutable object).

We can find a few within Linq itself.

In many ways the simplest Linq provider is the linq-to-objects provider that works on enumerable objects in memory.

When it receives enumerables directly as IEnumerable<T> objects it's pretty straight-forward in that most programmers could write an unoptimised version of most of the methods pretty quickly. E.g. Where is just:

foreach (T item in source)
  if (pred(item))
    yield return item;

And so on. But what about EnumerableQueryable implementing the IQueryable<T> versions? Since the EnumerableQueryable wraps an IEnumerable<T> we could do the desired operation on the one or more enumerable objects involved, but we have an expression describing that operation in terms of IQueryable<T> and other expressions for selectors, predicates, etc, where what we need is a description of that operation in terms of IEnumerable<T> and delegates for selectors, predicates, etc.

System.Linq.EnumerableRewriter is an implementation of ExpressionVisitor does exactly such a re-write, and the result can then simply be compiled and executed.

Within System.Linq.Expressions itself there are a few implementations of ExpressionVisitor for different purposes. One example is that the interpreter form of compilation can't handle hoisted variables in quoted expressions directly, so it uses a visitor to rewrite it into working on indices into a a dictionary.

As well as producing another expression, an ExpressionVisitor can produce another result. Again System.Linq.Expressions has internal examples itself, with debug strings and ToString() for many expression types working by visiting the expression in question.

This can (though it doesn't have to be) be the approach used by a database-querying linq provider to turn an expression into a SQL query.

How do I know when I should use any of them and what should they return?

The default implementation of these methods will:

  1. If the expression can have no child expressions (e.g. the result of Expression.Constant()) then it will return the node back again.
  2. Otherwise visit all the child expressions, and then call Update on the expression in question, passing the results back. Update in turn will either return a new node of the same type with the new children, or return the same node back again if the children weren't changed.

As such, if you don't know you need to explicitly operate on a node for whatever your purposes are, then you probably don't need to change it. It also means that Update is a convenient way to get a new version of a node for a partial change. But just what "whatever your purposes are" means of course depends on the use case. The most common cases are probably go to one extreme or the other, with either just one or two expression types needing an override, or all or nearly all needing it.

(One caveat is if you are examining the children of those nodes that have children in a ReadOnlyCollection such as BlockExpression for both its steps and variables or TryExpression for its catch-blocks, and you will only sometimes change those children then if you haven't changed you are best to check for this yourself as a flaw [recently fixed, but not in any released version yet] means that if you pass the same children to Update in a different collection to the original ReadOnlyCollection then a new expression is created needlessly which has effects further up the tree. This is normally harmless, but it wastes time and memory).

查看更多
【Aperson】
3楼-- · 2020-05-21 12:51

The ExpressionVisitor enables the visitor pattern for Expression's.

Conceptually, the problem is that when you navigate an Expression tree, all you know is that any given node is an Expression, but you don't know specifically what kind of Expression. This pattern allows you to know what kind of Expression you're working with and specify type-specific handling for different kinds.

When you have an Expression, you can just call .Modify. The Expression knows its own type, so it'll call back the appropriate override.

Looking at the MSDN example you linked:

public class AndAlsoModifier : ExpressionVisitor  
{  
    public Expression Modify(Expression expression)  
    {  
        return Visit(expression);  
    }  

    protected override Expression VisitBinary(BinaryExpression b)  
    {  
        if (b.NodeType == ExpressionType.AndAlso)  
        {  
            Expression left = this.Visit(b.Left);  
            Expression right = this.Visit(b.Right);  

            // Make this binary expression an OrElse operation instead of an AndAlso operation.  
            return Expression.MakeBinary(ExpressionType.OrElse, left, right, b.IsLiftedToNull, b.Method);  
        }  

        return base.VisitBinary(b);  
    }  
}

In this example, if the Expression happens to be a BinaryExpression, it'll call back VisitBinary(BinaryExpression b) given in the example. Now, you can deal with that BinaryExpression knowing that it's a BinaryExpression. You could also specify other override methods that handle other kinds of Expression's.

It's worth noting that, since this is an overloaded resolution trick, visited Expression's will call back the best-fitting method. So, if there're different kinds of BinaryExpression's, then you could write an override for one specific subtype; if another subtype calls back, it'll just use the default BinaryExpression handling.

In short, this pattern allows you to navigate an Expression tree knowing what kind of Expression's you're working with.

查看更多
我想做一个坏孩纸
4楼-- · 2020-05-21 12:55

There was a issue where on the database we had fields which contained 0 or 1 (numeric), and we wanted to use bools on the application.

The solution was to create a "Flag" object, which contained the 0 or 1 and had a conversion to bool. We used it like a bool through all the application, but when we used it in a .Where() clause the EntityFramework complained that it is unable to call the conversion method.

So we used a expression visitor to change all property accesses like .Where(x => x.Property) to .Where(x => x.Property.Value == 1) just before sending the tree to EF.

查看更多
地球回转人心会变
5楼-- · 2020-05-21 13:12

Specific real world example I have just encountered occurred when shifting to EF Core and migrating from Sql Server (MS Specific) to SqlLite (platform independent).

The existing business logic revolved around a middle tier/ service layer interface that assumed Full Text Search (FTS) happened auto-magically in the background which it does with SQL Server. Search related queries were passed into this tier via Expressions and FTS against an Sql Server store required no additional FTS specific entities.

I didn't want to change any of this but with SqlLite you have to target a specific virtual table for a Full Text Search which would in turn have meant changing all the middle tier calls to re-target the FTS tables/entities and then joining them to the business entity tables to get a similar result set.

But by sub-classing ExpressionVisitor I was able to intercept the calls in the DAL layer and simply rewrite the incoming expression (or more precisely some of the BinaryExpressions within the overall search expression) to specifically handle SqlLites FTS requirements.

This meant that specialization of the datalayer to the data store happened within a single class that was called from a single place within a repository base class. No other aspects of the application needed to be altered in order to support FTS via EFCore and any SqlLite FTS related entities could be contained in a single pluggable assembly.

So ExpressionVisitor is really very useful, especially when combined with the whole notion of being able to pass around expression trees as data via various forms of IPC.

查看更多
登录 后发表回答