
Simple Method Caching

12 May 2011
Shows how to create a simple method caching mechanism.

Introduction

As a developer, chances are that you have looked through your code or a profiler report and thought, "Ahh, this should be cached. That ought to fix it." And before you know it, you have your typical dictionary with a TryGetValue call cluttering up your code.

To me, that is just noise in my code. Cross-cutting concerns such as caching should be a one-liner. Preferably it should be a zero-liner, but that is a little harder to achieve, although it most certainly is possible. We will take a look at this later on.

This article demonstrates how to create a simple caching mechanism for deterministic methods.

Background

First of all, what makes a good candidate for caching? I would say that it depends on how static our data is. In this case, we are talking about deterministic methods, meaning methods that, given the same input, will always produce the same result.

Common sense should tell us that these methods only need to be executed once. It's like asking the same question twice; we don't really need to, since we already know the answer. Makes sense, doesn't it? The only problem is, methods don't have memory. Yet!

Example

In order to better show how this all comes together, we need an example.

public class SampleClass
{
    public int Calculate(int input)
    {
        //Imagine some complex code here 
        //that takes forever to execute.
        return input*2;
    }
}

Given the input of 2, the result from this method will always be 4. Pretty naive, but still proves the point.

Now, how could we optimize this method after reading the devastating numbers in the profiler report?

This can be done in a number of ways, but I often see code similar to this:

public class SampleClass
{
    private readonly IDictionary<int,int> _cache = 
                     new Dictionary<int, int>();
                
    public int Calculate(int input)
    {
        if (!_cache.ContainsKey(input))
        {
            int result = input*2;
            _cache.Add(input,result);
        }
        return _cache[input];
    }
}

Not very clean, but it takes care of business. That is, until the project manager comes up to you and tells you that you can forget about that raise unless you fix that "specified key already exists in the Dictionary" exception. Being an experienced developer, you know a threading bug when you see one, so maybe you go ahead and add a lock statement around the dictionary access. That provides a decent level of thread safety, at the cost of some contention.

The only problem now is that most of the code is all about performance and very little about its actual intention. Noise, is what it is.
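The lock statement mentioned above would turn the example into something like this (a minimal sketch; the _syncRoot field name is my own invention):

```csharp
using System.Collections.Generic;

public class SampleClass
{
    private readonly IDictionary<int, int> _cache = new Dictionary<int, int>();
    private readonly object _syncRoot = new object();

    public int Calculate(int input)
    {
        // Serialize all access to the dictionary so that two threads
        // cannot add the same key at the same time.
        lock (_syncRoot)
        {
            int result;
            if (!_cache.TryGetValue(input, out result))
            {
                result = input * 2;
                _cache.Add(input, result);
            }
            return result;
        }
    }
}
```

Which is even more plumbing than before, and that is exactly the problem.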

Adding memory to methods

It would be nice if the Calculate method had a little memory of its own so that it could just return the correct value given that we provide some known input. (Still, we have to imagine here that input * 2 is a very expensive operation.)

To be honest, I have never thought of this until I read this blog post by Denis Markelov.

His idea is to decorate a method in such a way that it returns a Func that points directly back to code not very different from our previous example. Actually, here is the whole class that Denis wrote:

public static class CacheProvider
{ 
    public static Func<T, TResult> Decorate<T, 
                  TResult>(Func<T, TResult> function)
    {
        var cache = new Dictionary<T, TResult>();
        return (x) =>
            {                    
                if (!cache.ContainsKey(x))
                {
                    cache.Add(x, function(x));
                }
                return cache[x];
            };
    }
}

So simple and so brilliant, it almost blew my mind.

All I did with this code was to make it thread safe using the features in .NET 4.0:

public static class CacheProvider
{        
    public static Func<T,TResult> Decorate<T,TResult>(Func<T,TResult> function)
    {
        var cache = new ConcurrentDictionary<T, TResult>();
        return (arg) => cache.GetOrAdd(arg, function);
    }
}

A fully functional caching mechanism with just two lines of code.

Up until this point, I have done nothing but improve Mr. Markelov's code so that it is thread safe.

Let's just quickly see how we can use this code.

public class SampleClass
{        
    private Func<int, int> _cachedCalculate;

    public SampleClass()
    {
        _cachedCalculate = CacheProvider.Decorate<int,int>(DoCalculate);
    }

    public int Calculate(int input)
    {
        return _cachedCalculate(input);
    }

    private int DoCalculate(int input)
    {            
        return input*2;                        
    }
}

In the constructor, we create a Func<int,int> that points back to the code inside the CacheProvider itself.

Works great, but we are still not down to a one-liner. Further improvements are required.

The MethodCache

As we have seen, we still need to deal with some plumbing code to set this up and this can be quite messy if we have a lot of cached methods in our class.

We are going to take a look at the code before trying to explain what it does.

Here is the code for the MethodCache class.

public class MethodCache
{
    private readonly ConcurrentDictionary<Delegate,Delegate> _delegates
        = new ConcurrentDictionary<Delegate, Delegate>();

    public TResult Invoke<T,TResult>(Func<T,TResult> function, T arg)
    {
        return ((Func<T, TResult>) _delegates.GetOrAdd(
                 function, CacheProvider.Decorate(function)))(arg);
    }
}

This is actually a sort of cache itself, keeping the decorated version of each method indexed by the original method delegate.

You might wonder why this class is not static. That is because we are storing function delegates, which means we are also keeping a strong reference to the target of each delegate; a static cache would keep those targets alive forever. We could get around this by implementing a WeakFunction or something in that direction. I actually tried that, but things got messy really fast, and I wanted this to be as simple as possible.

If we go back to our SampleClass, we can see this class put to good use.

public class SampleClass
{
    private readonly MethodCache _methodCache = new MethodCache();
        
    public int Calculate(int input)
    {
        return _methodCache.Invoke(DoCalculate, input);
    }

    private int DoCalculate(int input)
    {            
        return input*2;                        
    }
}

As we can see, the type inference mechanism infers the generic arguments from our target method and we are left with a pretty clean one-liner. Since the MethodCache is not static, we still need to declare it, but we use the same MethodCache regardless of the number of cached methods in the class.

Multiple arguments

If we go back to the CacheProvider class, we can observe that it stores the return value keyed by the input value. Quite simple, but this also keeps us from having more than one method argument. I like to keep the number of arguments in my methods as low as possible, but I occasionally create one or two methods that break my principle :)

Anyhow, since we are now faced with two input values and only one possible key for the dictionary, we need to somehow make them appear as one. For this, we can use the new Tuple class introduced in .NET 4.0. Pretty straightforward, and it has served me well on a number of occasions. The CacheProvider's Decorate overload now looks like this for methods accepting two arguments:

public static Func<T1,T2, TResult> Decorate<T1,T2, TResult>(Func<T1,T2, TResult> function)
{
    var cache = new ConcurrentDictionary<Tuple<T1,T2>, TResult>();
    return (arg1,arg2) => cache.GetOrAdd(Tuple.Create(arg1,arg2),ck => function(arg1,arg2));
}

We draw the line here at a maximum of four arguments. Feel free to extend this, although it might be wise to consider whether more than four arguments is really needed for any method.
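A matching Invoke overload on the MethodCache would then simply forward both arguments (a sketch; it repeats the two-argument Decorate overload from above so that the snippet is self-contained):

```csharp
using System;
using System.Collections.Concurrent;

public static class CacheProvider
{
    public static Func<T1, T2, TResult> Decorate<T1, T2, TResult>(Func<T1, T2, TResult> function)
    {
        // The two arguments are combined into a single Tuple key.
        var cache = new ConcurrentDictionary<Tuple<T1, T2>, TResult>();
        return (arg1, arg2) => cache.GetOrAdd(Tuple.Create(arg1, arg2), key => function(arg1, arg2));
    }
}

public class MethodCache
{
    private readonly ConcurrentDictionary<Delegate, Delegate> _delegates
        = new ConcurrentDictionary<Delegate, Delegate>();

    public TResult Invoke<T1, T2, TResult>(Func<T1, T2, TResult> function, T1 arg1, T2 arg2)
    {
        // Look up (or create) the decorated version of the method
        // and invoke it with both arguments.
        return ((Func<T1, T2, TResult>)_delegates.GetOrAdd(
            function, d => CacheProvider.Decorate(function)))(arg1, arg2);
    }
}
```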

To use this, you actually just need to copy the CacheProvider and the MethodCache classes into your own project and you are good to go.

You might want to stop reading here, or you could continue as we shift gears into some really cool stuff.

Transparent caching

Previously, I mentioned how much I would like to implement caching with as little effort as possible. We are now down to a one-liner, but would it be possible to cache-enable our methods without writing any code at all? And what if we wanted to add caching to a library for which we don't have the source code? Is this actually possible?

Let us step back for a moment and take a look at what we are actually doing to enable caching for any given method. This is what we have in our SampleClass.

public int Calculate(int input)
{
    return input*2;
}

And this is what we need:

public int Calculate(int input)
{
    return MethodCache.Invoke(DoCalculate, input);
}

private int DoCalculate(int input)
{
    return input * 2;
}

private MethodCache MethodCache
{
    get { return _methodCache ?? (_methodCache = new MethodCache()); }
}

Let's see what we have done here. First, we have created a private property that gives us access to the MethodCache itself. Next, we have extracted the code from the Calculate method into the DoCalculate method. Finally, we have rewritten the Calculate method so that it now invokes the MethodCache.

Note that the first step is only needed once regardless of the number of cached methods.

Sure, we all have those fancy refactoring tools that make these kinds of operations a breeze, but it is a simple and repetitive task, so maybe we could delegate all this tedious method extraction and rewriting to someone else. Didn't we hire some greenhorn fresh out of school the other day? Well, that is not who I'm talking about.

Mono.Cecil

There are essentially just two tools that let you emit IL (Intermediate Language) instructions into an assembly: Reflection.Emit and Mono.Cecil.

So why choose one over the other? The answer to that is really quite simple.

Reflection.Emit can only be used to create new assemblies, while Mono.Cecil has the capability to modify an existing assembly and save it back to disk (or load the modified assembly directly into the app domain).

Since we are dealing with an existing assembly here, we need to use Mono.Cecil to accomplish the task. The process of modifying the IL of an assembly is also known as assembly weaving.

The AssemblyWeaver

The AssemblyWeaver class is a simple class that loads an assembly, weaves the target types, and saves the modified assembly back to disk.

The constructor of AssemblyWeaver takes two parameters:

public AssemblyWeaver(ITypeWeaver typeWeaver, ITypeSelector typeSelector)
{
    Guard.IsNotNull(typeWeaver, "typeWeaver");
    Guard.IsNotNull(typeSelector, "typeSelector");
    _typeWeaver = typeWeaver;
    _typeSelector = typeSelector;
}

The first parameter is an implementation of the ITypeWeaver interface that does the actual weaving of each type. The second parameter is an implementation of the ITypeSelector interface that is responsible for selecting which types to send to the ITypeWeaver. You might recognize this style of coding as Dependency Injection. There are default implementations of both interfaces available and we will take a look at both of them.
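The Weave method itself is not shown here, but based on this description, it presumably does something along these lines (a sketch against the Mono.Cecil API; the ITypeWeaver.Weave method name is an assumption):

```csharp
public void Weave(string assemblyPath)
{
    // Load the existing assembly from disk.
    AssemblyDefinition assembly = AssemblyDefinition.ReadAssembly(assemblyPath);

    // Let the type selector pick the eligible types and
    // hand each one over to the type weaver.
    foreach (TypeDefinition type in _typeSelector.Select(assembly.MainModule))
    {
        _typeWeaver.Weave(type);
    }

    // Save the modified assembly back to disk, overwriting the original.
    assembly.Write(assemblyPath);
}
```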

The ITypeSelector

The ITypeSelector interface represents a class that is responsible for selecting the types that should be weaved. The interface itself is quite simple.

public interface ITypeSelector
{
    /// <summary>
    /// Returns a list of types from the target <paramref name="moduleDefinition"/>. 
    /// </summary>
    /// <param name="moduleDefinition">The module that contains the target types.</param>
    /// <returns>The list of selected types.</returns>
    IEnumerable<TypeDefinition> Select(ModuleDefinition moduleDefinition);
}

You can then choose to implement the interface yourself or use the provided implementation.

Given that the target assembly is our own and we have the source code for it, we could decorate the target types with an attribute that makes the type eligible for weaving.

To help identify the target types and methods, there is an EnableMethodCachingAttribute class that is used to decorate types.

So we have an AttributeTypeSelector that selects only the types decorated with this attribute.

/// <summary>
/// An <see cref="ITypeSelector"/> implementation that selects 
/// types decorated with the <see cref="EnableMethodCachingAttribute"/>.
/// </summary>
public class AttributeTypeSelector : LambdaTypeSelector
{        
    /// <summary>
    /// Initializes a new instance of the <see cref="AttributeTypeSelector"/> class.
    /// </summary>
    public AttributeTypeSelector()
        : base(t =>
            {
                TypeReference attributeType = 
                    t.Module.Import(typeof(EnableMethodCachingAttribute));
                return t.IsDefined(attributeType);
            }){}                
}

As we can see, this class inherits from the LambdaTypeSelector class, which selects the target types based on a function delegate.

public class LambdaTypeSelector : ITypeSelector
{
    private readonly Func<TypeDefinition, bool> _selector;

    /// <summary>
    /// Initializes a new instance of the <see cref="LambdaTypeSelector"/> class.
    /// </summary>
    /// <param name="selector">The function delegate
    ///     used to select the target types.</param>
    public LambdaTypeSelector(Func<TypeDefinition,bool> selector)
    {
        _selector = selector;
    }

    /// <summary>
    /// Returns a list of types from the target <paramref name="moduleDefinition"/>. 
    /// </summary>
    /// <param name="moduleDefinition">The module that contains the target types.</param>
    /// <returns>The list of selected types.</returns>
    public IEnumerable<TypeDefinition> Select(ModuleDefinition moduleDefinition)
    {
        return moduleDefinition.Types.Where(_selector);
    }
}

This class can come in handy if we are weaving assemblies for which we don't have the source code, or when we for some reason don't want to "clutter" our code with attributes.
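For example, to weave every type whose name ends with "Service" (a hypothetical naming convention) without adding any attributes, we could write:

```csharp
var typeSelector = new LambdaTypeSelector(t => t.Name.EndsWith("Service"));
var assemblyWeaver = new AssemblyWeaver(
    new TypeWeaver(new AttributeMethodSelector()), typeSelector);
```

Note that the method selector here still looks for the CachedAttribute; for a truly attribute-free setup, we would need a similar lambda-based IMethodSelector implementation as well.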

The TypeWeaver

This is the class that actually performs the rewriting of the target types. The TypeWeaver class takes one constructor argument and that is an implementation of the IMethodSelector interface.

public TypeWeaver(IMethodSelector methodSelector)        
{
    _methodSelector = methodSelector;
}

The IMethodSelector implementation is responsible for selecting the methods that are eligible for weaving.

There is a default implementation of this interface as well, called the AttributeMethodSelector class, which works almost the same way as the AttributeTypeSelector class, except that it looks for the CachedAttribute class.

So what we are looking for now are types like this:

[EnableMethodCaching]
public class SampleClass
{                
    [Cached]
    public int Calculate(int input)
    {
        return input*2;
    }                
}

Note that the AttributeTypeSelector could be changed to look for all types that contain at least one method decorated with the CachedAttribute, but that might be a time-consuming operation, and that is actually the reason for the EnableMethodCachingAttribute in the first place.

So what does this TypeWeaver actually do? Well, it should do exactly the same thing as the greenhorn now sitting next to you asking for network credentials.

  • Create a property that represents the MethodCache instance
  • Extract the code in the target method into a new private method
  • Rewrite the target method so that it calls the MethodCache.Invoke method

Now, how hard could it be? Surprisingly, it is not that difficult; however, working with Cecil can be challenging at times. Mono.Cecil is probably one of the most powerful libraries available on the .NET platform, but on the other hand, I will put this kindly and say that its documentation is scarce.

When trying to figure out how to emit code, it is best to implement what you need in plain C# and then investigate the IL using ildasm.exe. Personally, I use Reflector or LINQPad to do this, but ildasm will do just fine.

So given this code:

private MethodCache MethodCache
{
    get
    {
        if (_methodCache == null)
            _methodCache = new MethodCache();
        return _methodCache;
    }
}

This is the IL we need to emit:

.property instance class [MethodCaching]MethodCaching.MethodCache Method
{
    .get instance class [MethodCaching]MethodCaching.MethodCache 
                   SampleLibrary.TemplateClass::get_Method()
}

The TemplateClass is a class written to have something to reflect over to get the IL. Anyway, not much code here and that is because standard CLR properties are just wrappers around getter and setter methods, so we need to go to the get_Method to get the actual implementation.

.method private hidebysig specialname instance class 
          [MethodCaching]MethodCaching.MethodCache get_Method() cil managed
{
    .maxstack 2
    .locals init (
        [0] class [MethodCaching]MethodCaching.MethodCache cache,
        [1] bool flag)
    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: ldfld class [MethodCaching]MethodCaching.MethodCache 
                   SampleLibrary.TemplateClass::_methodCache
    L_0007: ldnull 
    L_0008: ceq 
    L_000a: ldc.i4.0 
    L_000b: ceq 
    L_000d: stloc.1 
    L_000e: ldloc.1 
    L_000f: brtrue.s L_001c
    L_0011: ldarg.0 
    L_0012: newobj instance void [MethodCaching]MethodCaching.MethodCache::.ctor()
    L_0017: stfld class [MethodCaching]MethodCaching.MethodCache 
                   SampleLibrary.TemplateClass::_methodCache
    L_001c: ldarg.0 
    L_001d: ldfld class [MethodCaching]MethodCaching.MethodCache 
                   SampleLibrary.TemplateClass::_methodCache
    L_0022: stloc.0 
    L_0023: br.s L_0025
    L_0025: ldloc.0 
    L_0026: ret 
}

That's more like it. This is exactly the code that we need to emit in order to have a property that returns a MethodCache instance. You can see that there is a lot of IL just to express something that in C# seemed quite simple. As a matter of fact, it can be optimized by, for instance, removing some of the stloc and ldloc instructions that the compiler emits in debug builds without any real purpose.

But be careful; if in doubt, emit exactly what ildasm outputs or else you run the risk that the evaluation stack becomes imbalanced and that is definitely not a good thing.
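For orientation, emitting instructions with Cecil revolves around the ILProcessor class. The general pattern looks something like this (a fragment only; getMethod and cacheField stand in for the real MethodDefinition and FieldReference):

```csharp
// Append instructions to a method body via its ILProcessor.
ILProcessor il = getMethod.Body.GetILProcessor();
il.Append(il.Create(OpCodes.Ldarg_0));           // load 'this'
il.Append(il.Create(OpCodes.Ldfld, cacheField)); // load the _methodCache field
il.Append(il.Create(OpCodes.Ret));               // return it
```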

I am not going to elaborate on the details of how to emit this code using Cecil. It's pretty straightforward and you will find all the IL rewriting code inside the TypeWeaver class.

PostWeaving

PostWeaving is a term used for weaving assemblies right after they have been compiled. So we need something that we can execute from an MSBuild file. Let us create a simple console application (PostWeaver.exe) to serve this purpose.

namespace PostWeaver
{
    class Program
    {
        static void Main(string[] args)
        {
            var assemblyWeaver = new AssemblyWeaver(
                  new TypeWeaver(new AttributeMethodSelector()),
                  new AttributeTypeSelector());
            assemblyWeaver.Weave(args[0]);
        }
    }
}

Next, we need to make sure that PostWeaver.exe gets executed after compilation.

In our case, we are going to post-weave the SampleLibrary, so we enter the following command for the post-build event command line (Project -> Properties -> Build Events):

"$(SolutionDir)\PostWeaver\bin\$(ConfigurationName)\PostWeaver.exe" "$(TargetPath)"
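If you prefer editing the project file directly, the same post-build event can be expressed in the .csproj like this (the macros are expanded by MSBuild at build time):

```xml
<PropertyGroup>
  <PostBuildEvent>"$(SolutionDir)\PostWeaver\bin\$(ConfigurationName)\PostWeaver.exe" "$(TargetPath)"</PostBuildEvent>
</PropertyGroup>
```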

Conclusion

Using this tiny library, we should be able to take care of a lot of our caching needs, and whether we choose to weave our assemblies or just use the MethodCache directly, it is still pretty simple to use.

We have demonstrated how IL rewriting can be used to implement cross-cutting concerns. And remember, the fastest way to execute a method is not to execute it at all!

Points of interest

NDC Talk Preview: “Ten Simple Rules for Rewriting IL on the Common Language Runtime” by Philip Laureano.

A special thanks goes out to Gábor Kozár for teaching me how to create an instance of a generic type with Cecil. And of course, Denis Markelov who got me thinking about this in the first place.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

seesharper
Software Developer
Norway
I'm a 39-year-old software developer living in Norway, currently working for a software company making software for the retail industry.

Article Copyright 2011 by seesharper
Everything else Copyright © CodeProject, 1999-2014