Click here to Skip to main content
Click here to Skip to main content

Double-Checked Locking Optimization

, 26 Oct 2009
Rate this:
Please Sign up or sign in to vote.
The Double-Checked Locking Optimization Design Pattern reduces contention and synchronization overheads whenever critical sections of code must acquire locks in a thread-safe manner just once during program execution.

Introduction

This article is related to my previous article: The Strategized Locking Design Pattern. The Double-Checked Locking Optimization Design Pattern reduces contention and synchronization overhead whenever critical sections of code must acquire locks in a thread-safe manner just once during program execution. The Singleton pattern is an excellent example to explain this pattern. However, this pattern is not tied to the Singleton pattern.

Detailed Information

Detailed information can be found in this book: Pattern-Oriented Software Architecture (Volume 2).

There are three major articles related to this pattern. These articles should be read very carefully, and should be considered as a must to read before start implementing this pattern in a multithreaded environment. These references explain the weaknesses of the constructions used within this pattern in further detail.

Detailed Information

The Double-Checked Locking Optimization Pattern (DCLP) is clarified in a multithreaded environment in combination with the Singleton pattern (GoF) and the volatile keyword. I will start with the problem. The problem is very simple. However, the solution with the DCLP which is proposed by Douglas Schmidt is still not safe in a multithreaded environment. Despite the weak spots, the DCLP idea is very interesting.

The DCLP created conversations on the top shelf in the C++ community. Scott Meyers took the pattern of Douglas Schmidt and clarified the weak spots. He not only points out the problems but also explains why. I will try to explain the ideas of Mr. Meyers in this article.

My intention is not to exceed any of these great people, including Mr. Schmidt. My intention is to summarize the DCLP advantages and disadvantages. If you like to read more about the basic Singleton pattern, please stop reading this article. This article is about the Double-Checked Locking Optimization Pattern.

Problem

Here is how we get the instance of the Singleton object:

The header .h file:

static SingletonObject& GetInstance( );

The implementation .cpp file:

SingletonObject& SingletonObject::GetInstance( )
{  
    if( m_instance == NULL )
    {
        m_instance = new SingletonObject( );
    }    
    return *m_instance;
}

The Singleton's constructor shown above can be called multiple times in a multi-threaded application. It will lead to memory leaks. This could be disastrous since you have created multiple instances of the object.

Challenge 1

You can solve this with the help of the Strategized Locking Pattern described here: The Strategized Locking Design Pattern.

The problem is solved by using the Guard before the conditional check. The Singleton is now thread safe. However, you may have created a locking overhead.

The header .cpp file:

static SingletonObject& GetInstance( );

private:
     Lock m_lock;

The implementation .cpp file:

SingletonObject& SingletonObject::GetInstance( )
{  
    Guard< Lock > lock( m_lock );

    if( m_instance == NULL )
    {
        m_instance = new SingletonObject( );
    }    
    return *m_instance;
}

Challenge 2

The locking overhead can be solved by placing the Guard inside the conditional check. The implementation .cpp file:

SingletonObject& SingletonObject::GetInstance( )
{
    if( m_instance == NULL )
    {
        Guard< Lock > lock( m_lock );
         m_instance = new SingletonObject( );
    }    
    return *m_instance;
}

Unfortunately, this solution does not provide thread safe initialization because a race condition in the multi-threaded application can cause multiple initializations of the Singleton. Consider two threads that simultaneously check for instance == NULL. Both will succeed; one will acquire the lock via the Guard, and the other will block. After the first thread initializes the Singleton and releases the lock, the blocked thread will obtain the lock and erroneously initialize the Singleton a second time.

Challenge 3 (Double-Checked Locking Optimization)

The DCLP starts at this point of the problem. As already mentioned before, the DCLP proposed by Douglas Schmidt is still not thread safe. The weakness of this pattern is it uses volatile in order to realize multi-threading safety. Hans Boehm and Nick Maclaren explain in their article when to use and when not to use the volatile keyword. In the case of DCLP, it should not be used. Therefore, this part will wipe away the "thread-safe manner during program execution" of the DCLP. This means the description of DCLP is no longer valid and can be considered as not safe.

To realize the DCLP, the conditional check should be placed after the Guard, and the pointer to the instance should be declared as volatile.

The header .h file:

Class SingletonObject
{
public:
    static SingletonObject* GetInstance( );
    …
private:
    static SingletonObject* volatile m_instance;
}

The implementation .cpp file:

SingletonObject& SingletonObject::GetInstance( )
{
    if( m_instance == NULL )
    {
        Guard< Lock > lock( m_lock );

        if( m_instance == NULL )
        {
             m_instance = new SingletonObject( );
        }
    }    
    return *m_instance;
}

What is volatile?

"The volatile specifier is a hint to a compiler that an object may change its value in ways not specified by the language so that aggressive optimizations must be avoided."

-Bjarne Stroustrup, 3rd edition

Try to Avoid the DCLP

There is a small and tiny drawback using this pattern. However, this tiny drawback can have disastrous consequences for your application. The weakness of this pattern is when the Singleton is just being created in the virtual memory and simultaneously multiple threads are calling the GetInstance method of the Singleton.

Detailed information can be found in the article by Scott Meyers (the C++ master): Scott Meyers and Andrei Alexandrescu.

Example:

The Singleton object is not created yet and the instance is NULL.

Thread 1: is creating the Singleton object (the Singleton is not created in the virtual memory yet) due to optimizing actions taken by the compiler.

m_instance = new SingletonObject( )

However, the pointer to the Singleton is already created and is not NULL.

Thread 2: this thread gets focus, and will not fall through the first conditional check since the pointer is valid. As already mentioned before, the Singleton object is not created in the memory yet and the instance is returned.

Thread 2: will crash using this pointer, since it’s pointing to a memory which is not allocated yet.

Extended DCLP

At this point, Scott Meyers jumps in. He explains that DCLP must be extended with one more volatile. However, this is only valid if the volatile is thread safe. This means even though the extra volatile is added, the problem is still not solved. On the contrary, it can introduce performance issues.

The header .h file:

Class SingletonObject
{
public:
    static SingletonObject* GetInstance( );
    …
private:
    static SingletonObject* volatile m_instance;
}

The implementation .cpp file:

SingletonObject* SingletonObject::GetInstance( )
{
    if( m_instance == NULL )
    {
        Guarder< Lock > lock( m_lock );
       
        if( m_instance == NULL )
        {
            SingletonObject* volatile temp = 
              static_cast< SingletonObject* >(operator new (sizeof(SingletonObject)));
            m_instance = temp;
        }
    }
    return m_instance;
}

In this part, the instance of the Singleton is created in a different way to reduce the aggressive optimization of the compiler.

SingletonObject* volatile temp = 
  static_cast< sSingletonObject* >(operator new (sizeof(SingletonObject)));
m_instance = temp;

Conclusion

The Double-Checked Locking Pattern is not thread safe in combination with the Singleton pattern. Moreover, extending the DCLP with one more volatile will cause more harm than good. In other cases, the implementation of the DCLP should be analyzed very carefully. It is better to avoid it than playing the hero. In the case of the Singleton pattern, make sure the Singleton object is created before it can be accessed by multiple threads. According to Scott Meyers' article, it can only be fixed with memory barriers.

By the way, Scott Meyers is the man! Smile | :)

Thread Safe Singleton Pattern

Codeplug demonstrates this memory barrier. He creates a thread safe Singleton pattern with help of the DCLP.

Special Thanks To

  • Codeplug (Member 4490530).
  • Riced (David R).

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Hamed Ebrahimmalek
Software Developer
Netherlands Netherlands
I'm a C++ and C# .Net software engineer.

Comments and Discussions

 
Answervolatile has nothing to do with multi-threading, at least in the C++03 PinmemberMember 449053021-Oct-09 8:25 
GeneralRe: volatile has nothing to do with multi-threading, at least in the C++03 PinmemberHamed Ebrahimmalek22-Oct-09 2:32 
GeneralRe: volatile has nothing to do with multi-threading, at least in the C++03 PinmemberMember 449053022-Oct-09 4:32 
GeneralRe: volatile has nothing to do with multi-threading, at least in the C++03 PinmemberHamed Ebrahimmalek22-Oct-09 20:42 
GeneralRe: volatile has nothing to do with multi-threading, at least in the C++03 PinmemberCodeplug-gg23-Oct-09 4:27 
[edit]
Note - changed my user name from "Member 4490530" to "Codeplug-gg"
[/edit]
To avoid any confusion, here is the thread-safe implementation of DCL (used for singleton construction) in the code that I linked:
...
    T& Instance()
    {
        // atomic read with acquire membar semantics
        T *p = InterlockedReadAcquire(&m_p);
        if (!p)
        {
            CriticalSection::scoped_lock lock(m_cs);
            p = m_p; 
            if (!p)
            {
                p = new T;
                // atomic write with release membar semantics
                InterlockedWriteRelease(&m_p, p);
            }//if
        }//if

        return *p;
    }//Instance
...
That should look familiar to readers of the Meyers/Alexandrescu paper, because it's an implementation of their only working solution from page 12. The key to a working solution is guaranteeing sequential consistency/visibility of m_p. In other words, any thread that sees a non-zero m_p is guaranteed that m_p is fully assigned and constructed.
 
Also for reference, here are the Interlocked-Read/Write functions for Win32 that I used for that (older) implementation:
/*------------------------------------------------------------------------------
InterlockedReadAcquire -
    Read a "stable" value of *psrc in an Interlocked-Acquire fashion. Returns 
    the true value of the pointer "at the time" this function returns. T should 
    be a POD type no bigger than a pointer.
------------------------------------------------------------------------------*/
template<typename T>
T InterlockedReadAcquire(T *psrc)
{
    T val;
    for (;;)
    {
        val = *psrc;
        void *p = InterlockedCompareExchangePointerAcquire(
                        reinterpret_cast<void**>(psrc), 
                        reinterpret_cast<void*>(val), 
                        reinterpret_cast<void*>(val));
 
        if (reinterpret_cast<T>(p) == val)
            break;
    }//for

    return val;
}//InterlockedReadAcquire

/*------------------------------------------------------------------------------
InterlockedWriteRelease -
    Write a "stable" value of val into *pdst in an Interlocked-Release fashion. 
------------------------------------------------------------------------------*/
template<typename T>
void InterlockedWriteRelease(T *pdst, const T& val)
{
    for (;;)
    {
        *pdst = val;
        void *p = InterlockedCompareExchangePointerRelease(
                        reinterpret_cast<void**>(pdst),
                        reinterpret_cast<void**>(val),
                        reinterpret_cast<void**>(val));
        if (reinterpret_cast<T>(p) == val)
            break;
    }//for
}//InterlockedWriteRelease
===================================================================
 
I'm also happy to show what I use now. Here is my current, generic, Win32 DCL implementation called "Once". As well as my current "SingletonConstruct" class that utilizes it. Since (as you keep pointing out) DCL and Singletons are two separate patterns, I have two separate classes for each:
#include <iostream>
#include <stdexcept>
#include <windows.h>
 
/*------------------------------------------------------------------------------
no_copy - Disallow copy semantics
------------------------------------------------------------------------------*/
class no_copy
{
protected:
    no_copy() {}
    ~no_copy() {}
private:
    no_copy(const no_copy&);
    const no_copy& operator=(const no_copy&);
};//no_copy

/*------------------------------------------------------------------------------
scoped_lock - 
    Generalized template for Acquire()/Release() operations in the constructor 
    and destructor respectively
------------------------------------------------------------------------------*/
template<class sync_t>
class scoped_lock : public no_copy
{
    sync_t &m_sync;
public:
    explicit scoped_lock(sync_t &s) : m_sync(s) {m_sync.Acquire();}
    ~scoped_lock() {m_sync.Release();}
};//scoped_lock

/*------------------------------------------------------------------------------
CriticalSection - C++ version of a Win32 CS
------------------------------------------------------------------------------*/
class CriticalSection : public no_copy
{
    CRITICAL_SECTION m_cs;
 
public:
    typedef scoped_lock<CriticalSection> scoped_lock;
 
    CriticalSection(bool bUseSpinCount = true, DWORD SpinCount = 4000) 
    {
        if (bUseSpinCount && 
            !InitializeCriticalSectionAndSpinCount(&m_cs, SpinCount))
            throw std::runtime_error("InitCSAndSpinCount failure");
        else
            InitializeCriticalSection(&m_cs);
    }//constructor

    ~CriticalSection() {::DeleteCriticalSection(&m_cs);}
    void Acquire() {::EnterCriticalSection(&m_cs);}
    void Release() {::LeaveCriticalSection(&m_cs);}
};//CriticalSection

//------------------------------------------------------------------------------
// Interlocked*Acquire/Release() functions only for Itanium Processor Family,
// (IPF), which we map to non-IPF Interlocked functions for other platforms
#if !defined(_M_IA64) && !defined(__IA64__)
#   define InterlockedExchangeAddAcquire  InterlockedExchangeAdd
#   define InterlockedExchangeAddRelease  InterlockedExchangeAdd
#endif
 
/*------------------------------------------------------------------------------
Once - 
    C++ implementation of "one-time initialization" amount multiple threads. You
    must ensure that construction of a Once object is thread-safe. For example,
    Once definitions at global scope will ensures m_once will be zero'd out
    during global object construction (which is single threaded). You can then
    call one of the DoOnce() methods, or construct a Sentry object on the stack
    to determine if one-time initialization should occur.
 
    This is basically an implementation of double-check locking (DCL), which
    can be tricky to get right - see the following:
 
        "C++ and the Perils of Double-Checked Locking"
        http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf
 
    The proper solution is to ensure proper ordering/visibility of changes to 
    m_once. This is achieved via acquire/release memory barriers, via
    InterlockedExchangeAddAcquire and InterlockedExchangeAddRelease.
------------------------------------------------------------------------------*/
class Once : public no_copy
{
    LONG m_once;
    CriticalSection m_cs;
    friend class Sentry;
 
public:
    class Sentry : public no_copy
    {
        Once &m_o;
        bool m_bOwner;
        bool m_bAcquiredCS;
 
    public:
        explicit Sentry(Once &o) : m_o(o), m_bOwner(false), m_bAcquiredCS(false)
        {
            if (InterlockedExchangeAddAcquire(&m_o.m_once, 0))
                return; // m_once is non-zero, "once" has occurred already

            // "once" hasn't happened yet
            m_o.m_cs.Acquire();
            m_bAcquiredCS = true;
            m_bOwner = (m_o.m_once == 0);
        }//constructor

        ~Sentry()
        {
            if (m_bOwner)
                InterlockedExchangeAddRelease(&m_o.m_once, 1);
            if (m_bAcquiredCS)
                m_o.m_cs.Release();
        }//destructor

        bool OnceOwner() const {return m_bOwner;}
    };//Sentry

    Once() : m_once(0) {}
 
    template <typename Obj_t>
    bool DoOnce(Obj_t *obj, void (Obj_t::*mfn)())
    {
        Sentry s(*this);
        if (s.OnceOwner())
            (obj->*mfn)();
        return s.OnceOwner();
    }//DoOnce

    template<class Functor_t>
    bool DoOnce(const Functor_t &f)
    {
        Sentry s(*this);
        if (s.OnceOwner())
            f();
        return s.OnceOwner();
    }//DoOnce
};//Once

/*------------------------------------------------------------------------------
SingletonConstruct - 
    Template class for constructing a singleton in an efficient and thread safe 
    manner. This object must be a static member of the class that you want to be 
    a singleton. This ensures m_once will be constructed during global object
    construction (which is single threaded).
------------------------------------------------------------------------------*/
template <class T>
class SingletonConstruct : public no_copy
{
    Once m_once;
    T *m_p;
 
public:
    SingletonConstruct() : m_p(0) {}
    ~SingletonConstruct() {Unload();}
 
    // NOTE: caller must guarantee that no references are being held
    void Unload()
    {
        delete m_p;
    }//Unload

    T& Instance()
    {
        Once::Sentry s(m_once);
        if (s.OnceOwner())
            m_p = new T;
        return *m_p;
    }//Instance
};//SingletonConstruct

 
//------------------------------------------------------------------------------
// Example usage of SingletonConstruct
class foo : public no_copy
{
    static SingletonConstruct<foo> m_foo_sc;
    friend class SingletonConstruct<foo>;
 
    foo() {}
    ~foo() {}
 
public:
    static foo& Instance() {return m_foo_sc.Instance();}
    void display() {std::cout << "I'm the foo instance!" << std::endl;}
};//foo

// m_foo_sc is global which ensures it's construction prior to threading
SingletonConstruct<foo> foo::m_foo_sc;
 
//------------------------------------------------------------------------------
int main()
{
    foo &f = foo::Instance();
    f.display();
    return 0;
}//main
The Once::Sentry object constructor and destructor handle the DCL work. It guarantees sequential consistency/visibility of m_once, such that when m_once is non-zero, we are guaranteed that the "once task" has completed. The functionality provided by "Once" is similar to the new Win32 API InitOnceExecuteOnce(), available on Vista and up.
 
I hope that these thread-safe DCL examples will help with your article editing.
 
My opinions on future content of this article:
Remove all the volatile "trickery". The Meyers/Alexandrescu paper covers all that, as well as why it doesn't work.
 
You've come a long way since version 1 of your article. Perhaps you could incorporate into your article all the "lessons learned along the way". I'm hopeful that one of those lessons learned is that (C++) volatile has nothing to do with multi-threaded synchronization. Wink | ;) Unfortunately, that is a very common mistake even among experienced programmers (thus the subject of this thread).
 
gg
GeneralRe: volatile has nothing to do with multi-threading, at least in the C++03 PinmemberHamed Ebrahimmalek23-Oct-09 4:30 
GeneralRe: volatile has nothing to do with multi-threading, at least in the C++03 PinmemberHamed Ebrahimmalek23-Oct-09 4:30 
GeneralRe: volatile has nothing to do with multi-threading, at least in the C++03 PinmemberCodeplug-gg23-Oct-09 6:05 
QuestionWhy wouldn't this work... Pinmemberrmf5212-May-09 7:27 
AnswerRe: Why wouldn't this work... PinmemberHamed Ebrahimmalek20-May-09 1:54 
AnswerRe: Why wouldn't this work... PinmemberRicky Lung26-Oct-09 16:46 
GeneralRe: Why wouldn't this work... Pinmembersupercat927-Oct-09 6:08 
GeneralNot the best option or even a good option. Pinmemberfresi16-Feb-09 21:19 
GeneralRe: Not the best option or even a good option. PinmemberHamed Ebrahimmalek16-Feb-09 22:16 
GeneralRe: Not the best option or even a good option. PinmemberDonsw4-Mar-09 10:17 
GeneralRe: Not the best option or even a good option. PinmemberHamed Ebrahimmalek5-Mar-09 4:45 
GeneralDCL question Pinmemberweaponx2007a16-Feb-09 11:29 
GeneralRe: DCL question Pinmemberaxelriet16-Feb-09 13:19 
GeneralRe: DCL question Pinmemberweaponx2007a17-Feb-09 3:42 
GeneralDouble checked locking considered harmful Pinmemberriced11-Feb-09 5:59 
GeneralRe: Double checked locking considered harmful [modified] PinmemberHamed Ebrahimmalek11-Feb-09 7:30 
GeneralRe: Double checked locking considered harmful Pinmembersupercat913-Feb-09 17:30 
GeneralRe: Double checked locking considered harmful Pinmemberriced14-Feb-09 6:36 
GeneralRe: Double checked locking considered harmful PinmemberHamed Ebrahimmalek16-Feb-09 1:07 
GeneralRe: Double checked locking considered harmful Pinmemberriced16-Feb-09 2:04 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web03 | 2.8.140721.1 | Last Updated 26 Oct 2009
Article Copyright 2009 by Hamed Ebrahimmalek
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid