I'm a great fan of the relatively recent trend of using lambda expressions instead of strings for indicating properties in, for instance, ORM mapping. Strongly typed >>>> Stringly typed.
To be clear, this is what I'm talking about:
builder.Entity<WebserviceAccount>()
.HasTableName( "webservice_accounts" )
.HasPrimaryKey( _ => _.Id )
.Property( _ => _.Id ).HasColumnName( "id" )
.Property( _ => _.Username ).HasColumnName( "Username" ).HasLength( 255 )
.Property( _ => _.Password ).HasColumnName( "Password" ).HasLength( 255 )
.Property( _ => _.Active ).HasColumnName( "Active" );
In some recent work, I need to cache things based on an expression, and to do that I need to create a key from the expression. Like so:
static string GetExprKey( Expression<Func<Bar,int>> expr )
{
string key = "";
Expression e = expr.Body;
while( e.NodeType == ExpressionType.MemberAccess )
{
var me = (MemberExpression)e;
key += "<" + ((PropertyInfo)me.Member).Name; // cast (not "as") so a non-property member fails with InvalidCastException
e = me.Expression;
}
key += ":" + ((ParameterExpression)e).Type.Name;
return key;
}
Notes: The StringBuilder version performs almost identically. It is only supposed to work for expressions that have the form x => x.A.B.C; anything else is an error and should fail. Yes, I need to cache. No, compilation is much slower than key generation/comparison in my case.
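To make the key format concrete, here is a minimal, self-contained sketch of the function above applied to the Foo/Bar types used in the benchmark further down. The namespace and the ExprKeyDemo wrapper class are names made up for this illustration only.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

namespace ExprKeySketch
{
    sealed class Foo { public int Val { get; set; } }
    sealed class Bar { public Foo F { get; set; } }

    // Hypothetical wrapper class, only here so the snippet compiles on its own.
    static class ExprKeyDemo
    {
        public static string GetExprKey( Expression<Func<Bar, int>> expr )
        {
            string key = "";
            Expression e = expr.Body;
            // Walk the member chain from the outermost access back to the parameter.
            while( e.NodeType == ExpressionType.MemberAccess )
            {
                var me = (MemberExpression)e;
                key += "<" + ((PropertyInfo)me.Member).Name;
                e = me.Expression;
            }
            // Anything that does not end in the lambda parameter throws here.
            key += ":" + ((ParameterExpression)e).Type.Name;
            return key;
        }

        public static string Example()
        {
            // Members are appended outermost first: "<Val<F:Bar"
            return GetExprKey( _ => _.F.Val );
        }
    }
}
```

An expression of any other shape, for instance `_ => _.F.Val + 1`, skips the loop and fails on the final cast with an InvalidCastException, which is the intended behaviour.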
While benchmarking various keygen functions, I was mystified to discover that they all performed horribly, even the dummy version that just returned "".
After some fiddling, I discovered that it was actually the instantiation of the Expression object that was super expensive.
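As far as I understand it, the cost comes from the fact that a lambda converted to Expression<T> is not a cached constant: every time the statement runs, the compiler-generated code rebuilds the whole tree through the Expression factory methods. The sketch below is only an approximation of that generated code (the real output looks the property up through method handles rather than GetProperty), and the namespace and class names are made up for the illustration.

```csharp
using System;
using System.Linq.Expressions;

namespace ExprAllocSketch
{
    sealed class Foo { public int Val { get; set; } }
    sealed class Bar { public Foo F { get; set; } }

    // Hypothetical helper class, only for this illustration.
    static class ExpressionConstruction
    {
        // Each call evaluates the lambda again and returns a freshly built tree.
        public static Expression<Func<Bar, int>> FromLambda()
        {
            return _ => _.F.Val;
        }

        // Roughly the work hidden behind the lambda above (an approximation,
        // not the exact compiler output): one node per parameter, member
        // access and lambda, each allocated and validated at run time.
        public static Expression<Func<Bar, int>> BuildByHand()
        {
            ParameterExpression p = Expression.Parameter( typeof( Bar ), "_" );
            MemberExpression f = Expression.Property( p, typeof( Bar ).GetProperty( "F" ) );
            MemberExpression val = Expression.Property( f, typeof( Foo ).GetProperty( "Val" ) );
            return Expression.Lambda<Func<Bar, int>>( val, p );
        }
    }
}
```

Two calls to FromLambda() return two distinct objects, which is why passing the lambda inline in a loop pays the construction cost on every iteration.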
Here is the output of the new benchmark I created to measure this effect:
Dummy( _ => _.F.Val ) 4106,5036 ms, 0,0041065036 ms/iter
Dummy( cachedExpr ) 0,3599 ms, 3,599E-07 ms/iter
Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) 2,3127 ms, 2,3127E-06 ms/iter
And here is the code for the benchmark:
using System;
using System.Diagnostics;
using System.Linq.Expressions;
namespace ExprBench
{
sealed class Foo
{
public int Val { get; set; }
}
sealed class Bar
{
public Foo F { get; set; }
}
public static class ExprBench
{
static string Dummy( Expression<Func<Bar, int>> expr )
{
return "";
}
static Expression<Func<Bar, int>> Bar_Foo_Val;
static public void Run()
{
var sw = Stopwatch.StartNew();
TimeSpan elapsed;
int iterationCount = 1000000;
sw.Restart();
for( int j = 0; j<iterationCount; ++j )
Dummy( _ => _.F.Val );
elapsed = sw.Elapsed;
Console.WriteLine( $"Dummy( _ => _.F.Val ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );
Expression<Func<Bar, int>> cachedExpr = _ => _.F.Val;
sw.Restart();
for( int j = 0; j<iterationCount; ++j )
Dummy( cachedExpr );
elapsed = sw.Elapsed;
Console.WriteLine( $"Dummy( cachedExpr ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );
sw.Restart();
for( int j = 0; j<iterationCount; ++j )
Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) );
elapsed = sw.Elapsed;
Console.WriteLine( $"Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );
}
}
}
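This clearly demonstrates that a speedup of roughly 2000-10000 times (about 1800x with the lazily initialized static field, about 11400x with the cached local variable) can be achieved with some simple caching.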
The problem is that these workarounds, to varying extents, compromise the beauty and safety of using expressions in this manner.
The second workaround at least keeps the expression inline, but it's far from pretty.
So the question is: are there any other workarounds that I might have missed, which are less ugly?
Thanks in advance