What's the best way to cache expensive data obtained from reflection? For example most fast serializers cache such information so they don't need to reflect every time they encounter the same type again. They might even generate a dynamic method which they look up from the type.
Before .net 4
Traditionally I've used a normal static dictionary for that. For example:
private static ConcurrentDictionary<Type, Action<object>> cache =
    new ConcurrentDictionary<Type, Action<object>>();

public static void DoSomething(object o)
{
    Action<object> action;
    if (cache.TryGetValue(o.GetType(), out action)) // Simple lookup, fast!
    {
        action(o);
    }
    else
    {
        // Do reflection to get the action and add it to the cache
        // slow
    }
}
This leaks a bit of memory, but since it leaks only once per Type, and types lived as long as the AppDomain, I didn't consider that a problem.
Since .net 4
But .net 4 introduced Collectible Assemblies for Dynamic Type Generation. If I ever used DoSomething on an object whose type is declared in a collectible assembly, that assembly will never get unloaded. Ouch.
So what's the best way to cache per type information in .net 4 that doesn't suffer from this problem? The easiest solution I can think of is a:
private static ConcurrentDictionary<WeakReference, TCachedData> cache;
But the IEqualityComparer<T>
I'd have to use with that would behave very strangely and would probably violate the contract too. I'm not sure how fast the lookup would be either.
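To illustrate why such a comparer strains the contract, here is a hypothetical sketch (the class name is mine): GetHashCode must stay stable for the lifetime of the key, but once the Target dies, both the hash and Equals lose the information they need.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer for WeakReference keys, shown only to demonstrate
// the problem: once a Target is collected, GetHashCode can no longer return
// the value it returned earlier, and Equals can no longer distinguish two
// dead references -- both violate the usual IEqualityComparer expectations.
class WeakReferenceComparer : IEqualityComparer<WeakReference>
{
    public bool Equals(WeakReference x, WeakReference y)
    {
        object tx = x.Target, ty = y.Target;
        // If either target has died, identity can no longer be compared.
        return tx != null && ty != null && ReferenceEquals(tx, ty);
    }

    public int GetHashCode(WeakReference obj)
    {
        object t = obj.Target;
        // The hash silently changes to 0 once the target is gone.
        return t != null ? t.GetHashCode() : 0;
    }
}
```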
Another idea is to use an expiration timeout. Might be the simplest solution, but feels a bit inelegant.
In the cases where the type is supplied as a generic parameter I can use a nested generic class, which should not suffer from this problem. But this doesn't work if the type is supplied in a variable.
class MyReflection
{
    internal static class Cache<T>
    {
        internal static TData data;
    }

    void DoSomething<T>()
    {
        DoSomethingWithData(Cache<T>.data);
        // Obviously simplified; should have similar creation logic to the previous code.
    }
}
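A fleshed-out sketch of the nested-generic idea, with hypothetical creation logic (BuildData and the string payload stand in for TData and the real reflection work):

```csharp
using System;

// Sketch of the nested generic cache. The static field initializer runs at
// most once per closed generic type (once per T), so the expensive work
// happens once; every later call is a plain static field read.
static class TypeNameCache
{
    static class Cache<T>
    {
        internal static readonly string Data = BuildData(typeof(T));
    }

    static string BuildData(Type t)
    {
        // Expensive reflection would go here; runs once per T.
        return t.FullName;
    }

    public static string DoSomething<T>()
    {
        return Cache<T>.Data;
    }
}
```

Because the cache lives in a static field of a generic type instantiated over T, the runtime ties its lifetime to T's assembly, which is why this variant does not pin collectible assemblies.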
Update: One idea I've just had is using Type.AssemblyQualifiedName as the key. That should uniquely identify the type without keeping it in memory. I might even get away with using referential identity on this string.
One problem that remains with this solution is that the cached value might keep a reference to the type too. And if I use a weak reference for that, it will most likely expire long before the assembly gets unloaded. And I'm not sure how cheap it is to get a normal reference out of a weak reference. Looks like I need to do some testing and benchmarking.
ConcurrentDictionary<WeakReference, CachedData> is incorrect in this case. Suppose we are trying to cache info for type T, so WeakReference.Target == typeof(T). CachedData will most likely contain a reference to typeof(T) as well. Since ConcurrentDictionary<TKey, TValue> stores its items in an internal collection of Node<TKey, TValue> objects, you get a chain of strong references: ConcurrentDictionary instance -> Node instance -> Value property (the CachedData instance) -> typeof(T). In general it is impossible to avoid a memory leak with WeakReference when the values can hold references to their keys. It was necessary to add support for ephemerons to make such a scenario possible without memory leaks. Fortunately, .NET 4.0 supports them, and we have the ConditionalWeakTable<TKey, TValue> class. It seems the reasons to introduce it are close to your task. This approach also solves the problem mentioned in your update, since the reference to the Type will live exactly as long as the Assembly is loaded.
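A minimal sketch of that approach (CachedData's fields and the factory logic here are illustrative, not from the answer): the table holds its keys weakly, and its ephemeron semantics keep each value alive only as long as its key, so the value may safely reference the Type.

```csharp
using System;
using System.Runtime.CompilerServices;

// Illustrative cached payload; it may hold the Type without pinning it.
class CachedData
{
    public Type Type;
    public string Description;
}

static class TypeInfoCache
{
    static readonly ConditionalWeakTable<Type, CachedData> Table =
        new ConditionalWeakTable<Type, CachedData>();

    public static CachedData Get(Type t)
    {
        // GetValue is thread-safe and creates the value on first lookup.
        return Table.GetValue(t, type => new CachedData
        {
            Type = type,
            Description = type.FullName // stand-in for real reflection work
        });
    }
}
```

Once the collectible assembly (and thus the Type) becomes unreachable, the table entry and its CachedData become collectible too.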
You should check out the fasterflect library on CodePlex: http://fasterflect.codeplex.com/
You could use normal reflection to dynamically generate new code, then emit/compile it and cache the compiled version. I think the collectible assembly idea is promising for avoiding the memory leak without having to load/unload a separate AppDomain. However, the memory leak should be negligible unless you're compiling hundreds of methods.
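As a hedged sketch of that reflect-once/compile-once pattern (the class name and the ToString example are mine, not from the answer), an Expression tree can be compiled into a delegate and cached per type; using ConditionalWeakTable for the cache keeps it compatible with collectible assemblies:

```csharp
using System;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;

// Reflect once per type, compile an Expression tree to a fast delegate,
// and cache it keyed by Type. The "expensive" work here is building a
// delegate that unboxes/casts the argument and calls its ToString().
static class ToStringCache
{
    static readonly ConditionalWeakTable<Type, Func<object, string>> Cache =
        new ConditionalWeakTable<Type, Func<object, string>>();

    public static string Describe(object o)
    {
        var fn = Cache.GetValue(o.GetType(), Build); // compiled at most once per type
        return fn(o);
    }

    static Func<object, string> Build(Type t)
    {
        var p = Expression.Parameter(typeof(object), "o");
        var body = Expression.Call(
            Expression.Convert(p, t), // cast/unbox to the concrete type
            t.GetMethod("ToString", Type.EmptyTypes));
        return Expression.Lambda<Func<object, string>>(body, p).Compile();
    }
}
```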
Here's a blogpost on dynamically compiling code at runtime: http://introspectingcode.blogspot.com/2011/06/dynamically-compile-code-at-runtime.html
Below is a similar concurrent dictionary approach I've used in the past to store MethodInfo/PropertyInfo objects, and it did seem to be faster, but I think that was in an old version of Silverlight. I believe .NET has its own internal reflection cache that makes it unnecessary.
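The snippet itself isn't included above; a minimal sketch of that kind of cache (names are illustrative) might look like the following. Note that, as discussed earlier, keying strongly on Type is only safe for non-collectible assemblies.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Memoize the PropertyInfo array per type so reflection runs once.
// A strong Type key keeps the type alive, so this variant is not
// suitable for collectible assemblies.
static class PropertyCache
{
    static readonly ConcurrentDictionary<Type, PropertyInfo[]> Cache =
        new ConcurrentDictionary<Type, PropertyInfo[]>();

    public static PropertyInfo[] GetProperties(Type t)
    {
        return Cache.GetOrAdd(t, type =>
            type.GetProperties(BindingFlags.Public | BindingFlags.Instance));
    }
}
```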
I might be stating the obvious here, but:
Don't cache providers typically serialise data to a store?
So surely the deserialisation process is going to be more costly than simply reflecting out a new instance?
Or did I miss something?
And there's the whole argument around boxing and unboxing time costs... not sure if that really counts, though.
Edit:
How about this (hopefully this explains the problem a bit better)...
are you looking to cache "thingy"?