Suggestions for optimizing passing expressions as

Posted 2019-04-11 04:47

Question:

I'm a great fan of the relatively recent trend of using lambda expressions instead of strings for indicating properties in, for instance, ORM mapping. Strongly typed >>>> Stringly typed.

To be clear, this is what I'm talking about:

builder.Entity<WebserviceAccount>()
    .HasTableName( "webservice_accounts" )
    .HasPrimaryKey( _ => _.Id )
    .Property( _ => _.Id ).HasColumnName( "id" )
    .Property( _ => _.Username ).HasColumnName( "Username" ).HasLength( 255 )
    .Property( _ => _.Password ).HasColumnName( "Password" ).HasLength( 255 )
    .Property( _ => _.Active ).HasColumnName( "Active" );

In some recent work, I needed to cache things based on the expression, and to do that I needed to create a key from the expression, like so:

static string GetExprKey( Expression<Func<Bar,int>> expr )
{
    string key = "";
    Expression e = expr.Body;

    while( e.NodeType == ExpressionType.MemberAccess )
    {
        var me = (MemberExpression)e;
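        // Direct cast (needs using System.Reflection): a non-property member fails with InvalidCastException instead of a NullReferenceException
        key += "<" + ((PropertyInfo)me.Member).Name;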
        e = me.Expression;
    }

    key += ":" + ((ParameterExpression)e).Type.Name;

    return key;
}

Notes: The StringBuilder version performs almost identically. The function is only supposed to work for expressions of the form x => x.A.B.C; anything else is an error and should fail. Yes, I do need to cache: in my case, compiling the expression is much slower than generating and comparing keys.
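To make the "anything else should fail" requirement explicit, here is a minimal sketch of a generic variant of the key function. The names ExprKeys and the generic signature are my own assumptions, not part of the original code; it keeps the same key format and throws an ArgumentException for any expression that is not a plain property chain on the parameter.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Text;

public sealed class Foo { public int Val { get; set; } }
public sealed class Bar { public Foo F { get; set; } }

// Hypothetical helper: same key format as GetExprKey above, but generic
// and with explicit validation of the expression shape.
public static class ExprKeys
{
    public static string GetExprKey<T, TResult>( Expression<Func<T, TResult>> expr )
    {
        var key = new StringBuilder();
        Expression e = expr.Body;

        // Walk the chain from the outermost property inwards: x.A.B.C -> C, B, A
        while( e is MemberExpression me )
        {
            if( !(me.Member is PropertyInfo prop) )
                throw new ArgumentException( "Only property access is supported.", nameof( expr ) );
            key.Append( '<' ).Append( prop.Name );
            e = me.Expression; // null for static members, rejected below
        }

        // The chain must end at the lambda parameter; anything else is an error.
        if( !(e is ParameterExpression p) )
            throw new ArgumentException( "Expected an expression of the form x => x.A.B.C.", nameof( expr ) );

        return key.Append( ':' ).Append( p.Type.Name ).ToString();
    }
}
```

With this sketch, ExprKeys.GetExprKey<Bar, int>( b => b.F.Val ) yields "<Val<F:Bar", while something like b => b.F.Val + 1 throws. Note that Type.Name ignores the namespace, exactly as in the original function, so two identically named types would share keys.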

While benchmarking various keygen functions, I was mystified to discover that they all performed horribly, even the dummy version that just returned "".

After some fiddling, I discovered that it was actually the instantiation of the Expression object that was super expensive.
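For context on why the instantiation costs so much: when a lambda is converted to an Expression<...>, the compiler emits code that builds the whole tree through the Expression factory methods every time that line executes. The following hand-written version is only a rough equivalent (the compiler looks up members via metadata handles rather than by name, and HandBuilt is a name I made up), but it shows the kind of work that happens per call.

```csharp
using System;
using System.Linq.Expressions;

public sealed class Foo { public int Val { get; set; } }
public sealed class Bar { public Foo F { get; set; } }

public static class HandBuilt
{
    // Approximately what evaluating `_ => _.F.Val` as an
    // Expression<Func<Bar, int>> does: allocate a new tree, node by node.
    public static Expression<Func<Bar, int>> Build()
    {
        ParameterExpression p = Expression.Parameter( typeof( Bar ), "_" );
        MemberExpression f = Expression.Property( p, typeof( Bar ).GetProperty( nameof( Bar.F ) ) );
        MemberExpression val = Expression.Property( f, typeof( Foo ).GetProperty( nameof( Foo.Val ) ) );
        return Expression.Lambda<Func<Bar, int>>( val, p );
    }
}
```

Each call to HandBuilt.Build() returns a distinct tree, which is why passing the lambda inline in a loop allocates and validates several nodes per iteration, whereas a stored expression is just a reference read.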

Here is the output of the new benchmark I created to measure this effect:

Dummy( _ => _.F.Val ) 4106,5036 ms, 0,0041065036 ms/iter
Dummy( cachedExpr ) 0,3599 ms, 3,599E-07 ms/iter
Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) 2,3127 ms, 2,3127E-06 ms/iter

And here is the code for the benchmark:

using System;
using System.Diagnostics;
using System.Linq.Expressions;

namespace ExprBench
{
    sealed class Foo
    {
        public int Val { get; set; }
    }

    sealed class Bar
    {
        public Foo F { get; set; }
    }


    public static class ExprBench
    {
        static string Dummy( Expression<Func<Bar, int>> expr )
        {
            return "";
        }

        static Expression<Func<Bar, int>> Bar_Foo_Val;

        static public void Run()
        {
            var sw = Stopwatch.StartNew();
            TimeSpan elapsed;

            int iterationCount = 1000000;

            sw.Restart();
            for( int j = 0; j<iterationCount; ++j )
                Dummy( _ => _.F.Val );
            elapsed = sw.Elapsed;
            Console.WriteLine( $"Dummy( _ => _.F.Val ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );

            Expression<Func<Bar, int>> cachedExpr = _ => _.F.Val;
            sw.Restart();
            for( int j = 0; j<iterationCount; ++j )
                Dummy( cachedExpr );
            elapsed = sw.Elapsed;
            Console.WriteLine( $"Dummy( cachedExpr ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );

            sw.Restart();
            for( int j = 0; j<iterationCount; ++j )
                Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) );
            elapsed = sw.Elapsed;
            Console.WriteLine( $"Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );
        }
    }
}

This clearly demonstrates that simple caching gives a speedup of roughly 1,800 times (the inline ?? cache) to 11,000 times (the pre-built local expression).

The problem is that these workarounds, to varying extents, compromise the beauty and safety of using expressions in this manner.

The second workaround at least keeps the expression inline, but it's far from pretty.

So the question is: are there any other workarounds that I might have missed, which are less ugly?

Thanks in advance

Answer 1:

After thinking on the static caching of properties for a while I came up with this:

In this particular case, all the property expressions I was interested in were on simple POCO DB entities. So I decided to make these classes partial and add the static cached expressions in a second partial declaration of the same class.

Having seen that this worked I decided to try and automate it. I looked at T4, but it didn't seem fit for this purpose. Instead I tried out https://github.com/daveaglick/Scripty, which is pretty awesome.

Here is the script I use to generate my caching classes:

using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Scripty.Core;
using System.Linq;
using System.Threading.Tasks;

bool IsInternalOrPublicSetter( AccessorDeclarationSyntax a )
{
    return a.Kind() == SyntaxKind.SetAccessorDeclaration &&
        a.Modifiers.Any( m => m.Kind() == SyntaxKind.PublicKeyword || m.Kind() == SyntaxKind.InternalKeyword );
}


foreach( var document in Context.Project.Analysis.Documents )
{
    // Get all partial classes that list IIsUpdatable in their base list
    var allClasses = (await document.GetSyntaxRootAsync())
                    .DescendantNodes().OfType<ClassDeclarationSyntax>()
                    .Where( cls => cls.BaseList?.ChildNodes()?.SelectMany( _ => _.ChildNodes()?.OfType<IdentifierNameSyntax>() ).Select( id => id.Identifier.Text ).Contains( "IIsUpdatable" ) ?? false)
                    .Where( cls => cls.Modifiers.Any( m => m.ValueText == "partial" ))
                    .ToList();


    foreach( var cls in allClasses )
    {
        var curFile = $"{cls.Identifier}Exprs.cs";
        Output[curFile].WriteLine( $@"using System;
using System.Linq.Expressions;

namespace SomeNS
{{
    public partial class {cls.Identifier}
    {{" );
        // Get all properties with public or internal setter
        var props = cls.Members.OfType<PropertyDeclarationSyntax>().Where( prop => prop.AccessorList.Accessors.Any( IsInternalOrPublicSetter ) );
        foreach( var prop in props )
        {
            Output[curFile].WriteLine( $"        public static Expression<Func<{cls.Identifier},object>> {prop.Identifier}Expr = _ => _.{prop.Identifier};" );
        }

        Output[curFile].WriteLine( @"    }
}" );
    }

}

An input class could look like this:

public partial class SomeClass
{
    public string Foo { get; internal set; }
}

The script then generates a file named SomeClassExprs.cs, with the following content:

using System;
using System.Linq.Expressions;

namespace SomeNS
{
    public partial class SomeClass
    {
        public static Expression<Func<SomeClass,object>> FooExpr = _ => _.Foo;
    }
}
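A call site can then pass the static field instead of an inline lambda. Here is a minimal sketch of how that might look; the Count property, the ObjectExprKeys helper and its key format are my own illustrative assumptions. One thing to be aware of: because the generated fields are typed Expression<Func<T,object>>, a value-type property gets wrapped in a Convert (boxing) node, so a key function has to unwrap that first.

```csharp
using System;
using System.Linq.Expressions;

public partial class SomeClass
{
    public string Foo { get; internal set; }
    public int Count { get; internal set; } // hypothetical value-type property
}

// Stand-in for the generated file; the script's template emits
// `public partial class {cls.Identifier}`, i.e. the entity class itself.
public partial class SomeClass
{
    public static Expression<Func<SomeClass, object>> FooExpr = _ => _.Foo;
    public static Expression<Func<SomeClass, object>> CountExpr = _ => _.Count;
}

// Hypothetical key helper for object-typed property expressions.
public static class ObjectExprKeys
{
    public static string GetExprKey<T>( Expression<Func<T, object>> expr )
    {
        Expression e = expr.Body;

        // Value-type properties returned as object are wrapped in Convert.
        if( e.NodeType == ExpressionType.Convert )
            e = ((UnaryExpression)e).Operand;

        string key = "";
        while( e is MemberExpression me )
        {
            key += "<" + me.Member.Name;
            e = me.Expression;
        }

        if( !(e is ParameterExpression p) )
            throw new ArgumentException( "Expected x => x.A.B.C.", nameof( expr ) );

        return key + ":" + p.Type.Name;
    }
}
```

ObjectExprKeys.GetExprKey( SomeClass.FooExpr ) gives "<Foo:SomeClass", and every read of SomeClass.FooExpr returns the same instance, so nothing is rebuilt per call.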

The files are generated in a folder called codegen, which I exclude from source control.

Scripty makes sure to include the files during compilation.

All in all I'm very pleased with this approach.

:)