
Demystifying concurrent lazy load pattern

5 Aug 2012 · CPOL · 9 min read
Shows common mistakes in lazy load implementations and how to implement a fast concurrent lazy load cache.

Introduction

The purpose of this article is to show common mistakes in lazy load implementations and how to implement a fast concurrent lazy load cache. I’m writing this after seeing many incorrect implementations. Hopefully, the detailed explanations will help some readers better understand concurrent execution of code and how locking fits in.

The article also includes an app for simulating different lazy load implementations and measuring their performance.

If you are sure you know this stuff, check out Code sample 5: good lazy load – just to be sure.


Lazy load pattern 

According to Wikipedia:

“Lazy loading is a design pattern commonly used in computer programming to defer initialization of an object until the point at which it is needed. It can contribute to efficiency in the program's operation if properly and appropriately used.” 

I think of the lazy load pattern as a simple code convention: when a property or method is asked to return some data, it first checks whether the data is already in a local cache; if not, it loads the data into the cache, then returns it to the caller. Cached data is loaded only once, and besides the performance gain there is simplicity and a delayed performance impact from the data load. Below is a simple example of a lazy load implementation for products.

Code sample 1: simplest lazy load

C#
private IList<Product> localCache;

        
public override IList<Product> GetProducts()
{
    if (this.localCache == null) // LL1
    {
        this.localCache = this.LoadProductsFromDatabase(); // LL2
    }

    return this.localCache; // LL3
}

        
public override void Reset()
{
    this.localCache = null; // R1
}  

Why is your lazy load wrong?

Above is the simplest possible implementation of lazy load, but it has two problems:

  1. It might load data more than once (multi loading)
  2. It is not thread safe

Multi loading 

The table below demonstrates a case when the above code will load data more than once. For simplicity, only two threads are shown, but there could be more simultaneous threads resulting in more than two data loads. Our goal is to load the data only once.

Table 1: Multi loading

  T    Thread 1            Thread 2
  t1   LL1
  t2                       LL1
  t3   LL2 (loads data)
  t4                       LL2 (second data load)


Loading data more than once might not be an issue for some apps, but for others it can be fatal. It depends on the data loading speed and the size of the data. The longer the data load takes, the higher the probability of this occurring and the bigger the performance impact. The above example hides the fact that line LL2 is not atomic, and the longer it takes to complete, the higher the chance of a multi load occurring. Here is another example of the problem.

Table 2: Second scenario for multi loading

  T    Thread 1                                Thread 2
  t1   LL1
  t2   LL2 (started, cache not initialized)
  t3                                           LL1
  t4                                           LL2 (loads data)
  t5   LL2 (finished – second data load)


If your application is likely to receive many simultaneous requests at startup, you have a higher chance of experiencing terribly slow (freezing) cold starts. Luckily, all these problems can easily be solved.
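To make the race concrete, here is a minimal Java sketch of the unsafe pattern from Code sample 1 (class and field names are mine, not from the attached sample): twenty threads released at once race through the null check, and a counter records how many of them end up loading.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class UnsafeLazyCache {
    private List<String> localCache;                 // products, simplified to strings
    final AtomicInteger loadCount = new AtomicInteger();

    List<String> getProducts() {
        if (localCache == null) {                    // LL1: unsynchronized check
            localCache = loadFromDatabase();         // LL2: may run on several threads at once
        }
        return localCache;                           // LL3
    }

    private List<String> loadFromDatabase() {
        loadCount.incrementAndGet();                 // count every simulated database load
        try { Thread.sleep(50); } catch (InterruptedException ignored) {}
        return List.of("keyboard", "mouse");
    }
}

class MultiLoadDemo {
    public static void main(String[] args) throws InterruptedException {
        UnsafeLazyCache cache = new UnsafeLazyCache();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[20];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                try { start.await(); } catch (InterruptedException ignored) {}
                cache.getProducts();
            });
            threads[i].start();
        }
        start.countDown();                           // release all readers at the same time
        for (Thread t : threads) t.join();
        // With 20 simultaneous first readers, the load almost always runs more than once.
        System.out.println("data loads: " + cache.loadCount.get());
    }
}
```

Running main typically prints a load count well above 1, matching Table 1; a single-threaded caller, by contrast, loads exactly once.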

Thread safety

Another problem with the above example is that it is not thread safe, and that is rarely acceptable. Multithreading bugs are nasty because:

  • They are hard to reproduce
  • They cannot be unit tested (at least not easily with deterministic unit tests)
  • Most developers are not good at multithreading

Because of all the above, my team does not try to unit test multithreading issues. We fight them with simple, predictable code and detailed code reviews.

The above example is not thread safe because another thread might reset the cache after it was loaded but before it was returned to the caller. In that case, the code will break or return invalid data, depending on the concrete implementation. See the following table for a detailed example of the problematic case.

Table 3: Compromised thread safety

  T    Thread 1   Thread 2   State after
  t1   LL1                   null
  t2   LL2                   Products
  t3              R1         null
  t4   LL3                   null


Naive locking

When people become aware that their code should be concurrent, they know there should be a lock somewhere, so they might end up with something like the following.

Code sample 2: naive locking

C#
public IList<Product> GetProducts()
{
    if (this.localCache == null) // LL1
    {
        lock (this.syncLock) // LL2
        {
            this.localCache = this.LoadProductsFromDatabase(); // LL3
        }
    }
 
    return this.localCache; // LL4
}
 
public void Reset()
{
    this.localCache = null;// R1
}  

This code has all the same problems as the previous example, but for now we will just try to solve the multi loading issue. Here is an execution sample showing how multi loading can still occur.

Table 4: Multi load in spite of locking

  T    Thread 1          Thread 2   Description
  t1   LL1                          Thread 1 detects that the cache is not loaded
  t2   LL2                          Thread 1 acquires the lock
  t3   LL3 (started)                Thread 1 starts loading the cache but has not yet initialized it
  t4                     LL1        Thread 2 detects that the cache is not loaded
  t5                     LL2        Thread 2 waits on the lock (held by Thread 1)
  t6   LL3 (finished)               Thread 1 initializes the cache and releases the lock
  t7                     LL2        Thread 2 acquires the lock
  t8                     LL3        Thread 2 loads data (MULTI LOAD)


It should be obvious from the above table where the problem is: while a thread is waiting on a lock because of some condition, that condition may have changed. Here is a corrected example using the IF-LOCK-IF approach.

Code sample 3: IF-LOCK-IF

C#
public IList<Product> GetProducts()
{
    // check if cache is loaded
    if (this.localCache == null)
    { 
        // acquire lock … this could take a while
        lock (syncLock)
        { 
            // important! recheck if cache is loaded
            // while we were waiting for the lock
            if (this.localCache == null)
            { 
                this.localCache = this.LoadProductsFromDatabase();
            } 
        } 
    } 
 
    return this.localCache;
} 
 
public void Reset()
{
    this.localCache = null; // R1
}

The above code solves the problem at moment t8: Thread 2 detects that the data is initialized and will not load it again.

Remember that when you see IF-LOCK-IF in your code, it is not just repeated code, as it may seem; there is a deeper meaning behind it.

With this sample we have solved the multi loading issue, but our code is still not thread safe because of cache resetting. If you never reset your lazy loaded cache, or you are sure no one will access the data while it is being refreshed, then you are good to go.

Locking on data read 

Here is an example I have seen many times in code.

Code sample 4: locking on every data read

C#
public IList<Product> GetProducts()
{
    lock (this.syncLock) 
    {
        if (this.localCache == null) 
        {
            this.localCache = this.LoadProductsFromDatabase();
        }
 
        return this.localCache;
    }
}
 
public void Reset()
{
    lock (this.syncLock)
    {
        this.localCache = null; 
    } 
} 

The irony is that this example is thread safe and will not multi load. However, concurrent code is not the same as thread safe code. In the above example, all cache access is sequential: while Thread 1 reads the cache, Thread 2 waits. The result is that the data is already loaded in memory, just sitting there, and you still have to wait to read it.

Of course, depending on the cache usage, all of this might not be a problem. For instance, in our example data reading is very fast because GetProducts() only returns a reference to a list of products, so waiting time is very low. But if the cache had some product search logic integrated into it, reads could take a while, which would further lower performance for simultaneous users.

This performance issue can be addressed by using a reader lock for data reads and a writer lock for cache resetting. Data can then be read concurrently, but while the cache is being reset, all readers are blocked. The consequence is that performance is not compromised and the implementation is thread safe. On the other hand, we still have multi loading issues. Also, implementations relying on a reader/writer lock tend to be more complex and harder for the average developer to understand.

Another thing I do not like in this example is that the Reset method now competes for the lock with the threads reading the data. This is very unlikely to be an issue, but I still do not like code whose behavior is not easy to predict.
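The reader/writer idea described above can be sketched in Java with ReentrantReadWriteLock (the class below is illustrative, not the article's attached code). Reads run concurrently under the read lock; loading and resetting take the exclusive write lock:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RwLockCache {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private List<String> localCache;
    final AtomicInteger loadCount = new AtomicInteger();

    List<String> getProducts() {
        rw.readLock().lock();
        try {
            if (localCache != null) {
                return localCache;                   // common case: many concurrent readers
            }
        } finally {
            rw.readLock().unlock();                  // must release before taking the write lock
        }
        rw.writeLock().lock();                       // cache empty: take the exclusive lock
        try {
            if (localCache == null) {                // recheck: another writer may have loaded it
                localCache = loadFromDatabase();
            }
            return localCache;
        } finally {
            rw.writeLock().unlock();
        }
    }

    void reset() {
        rw.writeLock().lock();                       // resets wait for in-flight readers
        try {
            localCache = null;
        } finally {
            rw.writeLock().unlock();
        }
    }

    private List<String> loadFromDatabase() {
        loadCount.incrementAndGet();                 // count simulated database loads
        return List.of("keyboard", "mouse");
    }
}
```

Note that this particular sketch rechecks under the write lock, so it avoids multi loading; the multi loading issue mentioned above applies to simpler variants that skip that recheck. The extra complexity the article warns about is visible either way.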

Good example 

It is finally time to show an example of a good implementation. See the comments in the code for a detailed explanation.

Code sample 5: good lazy load 

C#
public IList<Product> GetProducts()
{
    // declare a local variable for cache access
    // and use it to be safe against sudden cache resetting
    IList<Product> result = this.localCache;
    if (result == null)
    {
        lock (this.syncLock)
        {
            // important! recheck the condition after acquiring the lock
            // to be sure the cache was not loaded while waiting for the lock
            result = this.localCache;
            if (result == null)
            {
                // it is important to first store the result in the local variable
                // and only then in the cache, because in between the cache
                // might be reset
                result = this.LoadProductsFromDatabase();
                this.localCache = result;
            }
        }
    }

    // important – return the local variable, not this.localCache;
    // the code above guarantees that result will have a value,
    // while this.localCache might be reset at any time
    return result;
}

public void Reset()
{
    // no need for any kind of locking,
    // because our lazy load can handle resetting at any time
    this.localCache = null;
}

This implementation is thread safe. It supports concurrent reading and will not multi load.

The downside is an increase in complexity, but it is still fewer than 10 lines of code, and if it confuses your coworkers you can always paste the above comments into your code.

A very important fact is that resetting is very simple and it can occur at any time. This could be very important if you have a cache implementation that will reset on some event or after some time interval that your code does not explicitly control (reset just happens).
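For readers coming from Java, here is a sketch of the same pattern (all names are mine, not from the attached sample). One detail that matters in Java: the field should be volatile so that a fully constructed list is safely published to other threads under double-checked locking.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class GoodLazyCache {
    private volatile List<String> localCache;   // volatile: safe publication for double-checked locking
    private final Object syncLock = new Object();
    final AtomicInteger loadCount = new AtomicInteger();

    List<String> getProducts() {
        List<String> result = this.localCache;  // read the field once into a local variable
        if (result == null) {
            synchronized (syncLock) {
                result = this.localCache;       // recheck after acquiring the lock
                if (result == null) {
                    result = loadFromDatabase();
                    this.localCache = result;   // publish only the fully loaded list
                }
            }
        }
        return result;                          // never null, even if reset() ran meanwhile
    }

    void reset() {
        this.localCache = null;                 // no lock needed; readers keep their local copy
    }

    private List<String> loadFromDatabase() {
        loadCount.incrementAndGet();            // count simulated database loads
        return List.of("keyboard", "mouse");
    }
}
```

A single-threaded caller sees exactly one load across repeated reads, and one more load after each reset; concurrent readers never observe null.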

Good example using Lazy<T> 

While writing this article I found out about the Lazy&lt;T&gt; class introduced in .NET Framework 4.0, so I was eager to implement the previous code sample using it. Here is the code:

Code sample 6: good lazy load using Lazy<T>

C#
private Lazy<IList<Product>> lazyCache;
 
public Constructor()
{ 
    this.Reset(); 
}
 
public IList<Product> GetProducts() 
{ 
    return this.lazyCache.Value; 
}
 
public void Reset() 
{ 
    this.lazyCache = new Lazy<IList<Product>>(this.LoadProductsFromDatabase); 
} 

Thread safe, supports concurrent reading, will not multi load, and all that without explicit conditions or locks. Perfect! (By default, a Lazy&lt;T&gt; created with just a value factory is thread safe: it uses LazyThreadSafetyMode.ExecutionAndPublication, which guarantees the factory runs only once per instance.)
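The JDK has no Lazy&lt;T&gt;, but the same behavior can be sketched as a small memoizing wrapper, much in the spirit of the "Custom Lazy&lt;T&gt;" used in the simulation below (all names here are illustrative): resetting swaps in a fresh instance, and each instance guarantees a single load.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// A minimal Lazy<T> analogue: computes the value once, on first get().
class Lazy<T> {
    private final Supplier<T> supplier;
    private volatile T value;

    Lazy(Supplier<T> supplier) { this.supplier = supplier; }

    T get() {
        T result = value;                       // double-checked locking, as in Code sample 5
        if (result == null) {
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = supplier.get();
                    value = result;
                }
            }
        }
        return result;
    }
}

class LazyProductCache {
    private volatile Lazy<List<String>> lazyCache;
    final AtomicInteger loadCount = new AtomicInteger();

    LazyProductCache() { reset(); }

    List<String> getProducts() { return lazyCache.get(); }

    void reset() {
        // resetting = replacing the whole Lazy instance, exactly as in Code sample 6
        lazyCache = new Lazy<>(this::loadFromDatabase);
    }

    private List<String> loadFromDatabase() {
        loadCount.incrementAndGet();            // count simulated database loads
        return List.of("keyboard", "mouse");
    }
}
```

As with the C# version, the caller sees no conditions or locks; each reset produces at most one further load.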

Numbers

I have implemented a simple simulator (attached to this article) of all the cache implementations mentioned here (the good ones and the bad ones) so I can back up what I have stated in this article. See the table below for interesting simulation results obtained on my PC.

Note that:

  • Multi loads don’t realistically affect the speed of the simulator, because Thread.Sleep() is used to simulate DB access. This is why the simulator scales indefinitely.
  • The only purpose of 5a (Good with 1 ms delay) is comparison with 4a (Locking On Data Read with 1 ms delay).
  • The only difference between 5c (Custom Lazy<T>) and 5b (Lazy<T>) is that 5c uses my own implementation of Lazy<T>. The purpose of 5c is to demonstrate that Lazy<T> doesn’t use any “magic”.



 

Table 5: Simulation results

  #    Cache implementation              Threads   Reads/thread   Cache load time (ms)   Reset after reads   Result/Comment
  1    Simple Lazy Load                  100       100            100                    50                  Application breaks.
  2    Naive Locking                     100       100            100                    -                   Elapsed: 3.9 s, data loads: 39. Data should have been loaded only once.
  3    If-Lock-If                        100       100            100                    50                  Application breaks.
  4    Locking On Data Read              100       100            100                    50                  Elapsed: 25.8 s, data loads: 200. Seems OK, just a bit slower than the good implementation (5). But compare 4a and 5a.
  4a   Locking On Data Read, 1 ms delay  100       100            100                    50                  Elapsed: 120 s, data loads: 200. Very bad performance.
  5    Good                              100       100            100                    50                  Elapsed: 19.9 s, data loads: 200.
       Good                              10,000    10,000         100                    -                   Elapsed: 42.6 s, data loads: 1.
  5a   Good, 1 ms delay                  100       100            100                    50                  Elapsed: 10.4 s, data loads: 102.
  5b   Lazy<T>                           100       100            100                    50                  Elapsed: 11.5 s, data loads: 200.
       Lazy<T>                           10,000    10,000         100                    -                   Elapsed: 35.2 s, data loads: 1.
  5c   Custom Lazy<T>                    100       100            100                    50                  Elapsed: 10.4 s, data loads: 200.
       Custom Lazy<T>                    10,000    10,000         100                    -                   Elapsed: 35.3 s, data loads: 1.

Conclusion

Caching is something you will probably need in your applications. I hope I have managed to clarify some edge cases and save you from some nasty production bugs.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior), CallidusCloud
Serbia
I am a software developer at CallidusCloud, currently working on software for quoting and product configuration.

In the past few years I have been working on the development of a multi-portal CMS, where I was responsible for defining the coding standard and the code review process. For three years I led a team of programmers that developed the Soprex framework for enterprise application development, and I also built the Soprex Quotation Tool on that framework. My main interests are enterprise app architecture, Scrum, and TDD.

I blog about software development at www.Vukoje.NET.
