I'm a great fan of the relatively recent trend of using lambda expressions instead of strings for indicating properties in, for instance, ORM mapping. Strongly typed >>>> Stringly typed.
To be clear, this is what I'm talking about:
    builder.Entity<WebserviceAccount>()
        .HasTableName( "webservice_accounts" )
        .HasPrimaryKey( _ => _.Id )
        .Property( _ => _.Id ).HasColumnName( "id" )
        .Property( _ => _.Username ).HasColumnName( "Username" ).HasLength( 255 )
        .Property( _ => _.Password ).HasColumnName( "Password" ).HasLength( 255 )
        .Property( _ => _.Active ).HasColumnName( "Active" );
In some recent work, I needed to cache things per expression, and to do that I needed to build a key from the expression. Like so:
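    // Needs System.Linq.Expressions and System.Reflection.
    static string GetExprKey( Expression<Func<Bar,int>> expr )
    {
        string key = "";
        Expression e = expr.Body;
        // Walk the property chain from the outermost member inwards.
        while( e.NodeType == ExpressionType.MemberAccess )
        {
            var me = (MemberExpression)e;
            // Direct cast: a non-property member fails with InvalidCastException
            // instead of a NullReferenceException from "as".
            key += "<" + ((PropertyInfo)me.Member).Name;
            e = me.Expression;
        }
        // Anything other than the lambda parameter at the root is an error.
        key += ":" + ((ParameterExpression)e).Type.Name;
        return key;
    }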
Notes: The StringBuilder version performs almost identically. It is only supposed to work for expressions of the form x => x.A.B.C; anything else is an error and should fail. Yes, I need to cache. No, compilation is much slower than key generation/comparison in my case.
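To make the key format concrete, here is a minimal, self-contained sketch of the function in use. The Foo and Bar types mirror the ones in the benchmark further down; the ExprKeyDemo class and the "rejected" message are just names made up for this illustration, and the direct cast to PropertyInfo is an assumption about how the failure case should surface.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public sealed class Foo { public int Val { get; set; } }
public sealed class Bar { public Foo F { get; set; } }

// Hypothetical wrapper class, only here to make the sketch runnable.
public static class ExprKeyDemo
{
    // Same walk as described above: outermost member first, parameter type last.
    public static string GetExprKey( Expression<Func<Bar, int>> expr )
    {
        string key = "";
        Expression e = expr.Body;
        while( e.NodeType == ExpressionType.MemberAccess )
        {
            var me = (MemberExpression)e;
            key += "<" + ((PropertyInfo)me.Member).Name;
            e = me.Expression;
        }
        key += ":" + ((ParameterExpression)e).Type.Name;
        return key;
    }

    public static void Main()
    {
        // Property chain x => x.A.B: members are appended from the outside in.
        Console.WriteLine( GetExprKey( _ => _.F.Val ) );   // prints "<Val<F:Bar"

        try
        {
            // Not a pure property chain: the body is an Add node,
            // so the cast to ParameterExpression throws.
            GetExprKey( _ => _.F.Val + 1 );
        }
        catch( InvalidCastException )
        {
            Console.WriteLine( "rejected" );
        }
    }
}
```

Note that the call sites in Main still build a fresh expression tree on every call, so this only illustrates the key format, not the caching.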
While benchmarking various keygen functions, I was mystified to discover that they all performed horribly, even the dummy version that just returned "".
After some fiddling, I discovered that it was actually the instantiation of the Expression object that was super expensive.
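The reason becomes easier to see when the lambda is written out by hand. A lambda converted to Expression<T> is not a constant: each time the line runs, the compiler-generated code builds a new tree through the Expression factory methods. The sketch below is a hand-written approximation of that, not the exact code the compiler emits (as far as I know it loads the property getters through method handles rather than calling GetProperty); ExprAllocDemo, Build and Inline are names invented for the example.

```csharp
using System;
using System.Linq.Expressions;

public sealed class Foo { public int Val { get; set; } }
public sealed class Bar { public Foo F { get; set; } }

// Hypothetical demo class, only here to make the sketch runnable.
public static class ExprAllocDemo
{
    // Hand-built approximation of the tree for `_ => _.F.Val`:
    // reflection lookups plus four object allocations per call.
    public static Expression<Func<Bar, int>> Build()
    {
        ParameterExpression p = Expression.Parameter( typeof( Bar ), "_" );
        MemberExpression f = Expression.Property( p, typeof( Bar ).GetProperty( "F" ) );
        MemberExpression val = Expression.Property( f, typeof( Foo ).GetProperty( "Val" ) );
        return Expression.Lambda<Func<Bar, int>>( val, p );
    }

    // The inline form: looks like a literal, but is rebuilt on every evaluation.
    public static Expression<Func<Bar, int>> Inline()
    {
        return _ => _.F.Val;
    }

    public static void Main()
    {
        // Two evaluations of the same lambda give two distinct tree objects.
        Console.WriteLine( ReferenceEquals( Inline(), Inline() ) );   // prints "False"

        // The hand-built tree has the same textual shape as the inline one.
        Console.WriteLine( Inline().ToString() == Build().ToString() );
    }
}
```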
Here is the output of the new benchmark I created to measure this effect:
Dummy( _ => _.F.Val ) 4106,5036 ms, 0,0041065036 ms/iter
Dummy( cachedExpr ) 0,3599 ms, 3,599E-07 ms/iter
Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) 2,3127 ms, 2,3127E-06 ms/iter
And here is the code for the benchmark:
    using System;
    using System.Diagnostics;
    using System.Linq.Expressions;

    namespace ExprBench
    {
        sealed class Foo
        {
            public int Val { get; set; }
        }

        sealed class Bar
        {
            public Foo F { get; set; }
        }

        public static class ExprBench
        {
            static string Dummy( Expression<Func<Bar, int>> expr )
            {
                return "";
            }

            static Expression<Func<Bar, int>> Bar_Foo_Val;

            static public void Run()
            {
                var sw = Stopwatch.StartNew();
                TimeSpan elapsed;
                int iterationCount = 1000000;

                sw.Restart();
                for( int j = 0; j<iterationCount; ++j )
                    Dummy( _ => _.F.Val );
                elapsed = sw.Elapsed;
                Console.WriteLine( $"Dummy( _ => _.F.Val ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );

                Expression<Func<Bar, int>> cachedExpr = _ => _.F.Val;
                sw.Restart();
                for( int j = 0; j<iterationCount; ++j )
                    Dummy( cachedExpr );
                elapsed = sw.Elapsed;
                Console.WriteLine( $"Dummy( cachedExpr ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );

                sw.Restart();
                for( int j = 0; j<iterationCount; ++j )
                    Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) );
                elapsed = sw.Elapsed;
                Console.WriteLine( $"Dummy( Bar_Foo_Val ?? (Bar_Foo_Val = _ => _.F.Val) ) {elapsed.TotalMilliseconds} ms, {elapsed.TotalMilliseconds/iterationCount} ms/iter" );
            }
        }
    }
This clearly demonstrates that simple caching gives a huge speedup: going by the numbers above, roughly 1,800 times for the inline ?? version and roughly 11,000 times for the cached local.
The problem is that these workarounds, to varying extents, compromise the beauty and safety of using expressions in this manner.
The second workaround at least keeps the expression inline, but it's far from pretty.
So the question is: are there any other workarounds that I might have missed, which are less ugly?
Thanks in advance
After thinking on the static caching of properties for a while I came up with this:
In this particular case, all the property expressions I was interested in were on simple POCO DB entities. So I decided to make these classes partial and add the static cache properties in a second partial class paired with each entity.
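As a rough illustration of the partial-class idea (a hand-written sketch, not the output of my generator): the entity lives in one partial declaration and the cached expressions in another. The name SomeClass, its properties, and the nested Exprs holder are all hypothetical choices for this example; the real layout may differ.

```csharp
using System;
using System.Linq.Expressions;

// Hand-written half: a plain POCO entity (hypothetical example type).
public partial class SomeClass
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Second half of the pair: one expression per property, built once
// when the nested holder class is initialized.
public partial class SomeClass
{
    public static class Exprs
    {
        public static readonly Expression<Func<SomeClass, int>> Id = _ => _.Id;
        public static readonly Expression<Func<SomeClass, string>> Name = _ => _.Name;
    }
}

public static class PartialCacheDemo
{
    public static void Main()
    {
        // Every use hands out the same tree instance, so nothing is rebuilt per call.
        Console.WriteLine( ReferenceEquals( SomeClass.Exprs.Id, SomeClass.Exprs.Id ) );   // prints "True"

        // The cached tree is still an ordinary member-access expression.
        var body = (MemberExpression)SomeClass.Exprs.Name.Body;
        Console.WriteLine( body.Member.Name );   // prints "Name"
    }
}
```

A call site then passes SomeClass.Exprs.Id where it previously passed _ => _.Id, which stays strongly typed and survives renames as long as the second half is regenerated.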
Having seen that this worked I decided to try and automate it. I looked at T4, but it didn't seem fit for this purpose. Instead I tried out https://github.com/daveaglick/Scripty, which is pretty awesome.
Here is the script I use to generate my caching classes:
An input class could look like this:
The script then generates a file named SomeClassExprs.cs, with the following content:
The files are generated in a folder called codegen, which I exclude from source control.
Scripty makes sure to include the files during compilation.
All in all I'm very pleased with this approach.
:)