Click here to Skip to main content
Click here to Skip to main content

A simple Win32 readers/writers lock with reentrance

By , 23 Jan 2006
 

Introduction

The code is a simple implementation of a readers/writers lock that supports reentrance and lock escalation, i.e. a thread holding a read lock can request and is granted write access provided that no other thread is holding the read lock and a thread holding a write lock is granted read lock access.

Background

The Windows synchronization primitives do not include support for locking readers and writers. Sometimes, it is useful to allow read access to multiple threads without the threads needing to read data, not having to block just because other threads are reading too. The risk of data corruption arises only when the data is altered. Write access must be exclusive (to other writers and to any reader) but read access can be shared between readers. Allowing multiple reader threads to share the lock allows for greater concurrency and reduces the risk of deadlocks. The existing implementations that I could find didn't support reentrancy which was key to avoiding deadlocks in the application I was working on.

Using the code

The code is straightforward to use, either call ClaimReader/ClaimWriter and later ReleaseReader/ReleaseWriter, or use the auto-lock classes AutoLockReader and AutoLockWriter:

class ReadWriteLockTLSRelease {
public:
      class TimeoutExpiredException : std::exception {};
      class ImplicitEscalationException : std::exception {};
private:
      volatile int numReaders;
      int numWriters;
      CRITICAL_SECTION atomizer;
      HANDLE spinEvent;
      // If the timeout exipres it's not a big deal, 
      // as the ClaimXX() function rechecks if it can 
      // claim lock
      static const int WaitSpinTimeout=1000; 
      static const int MaxSpinIterations=4000;
      static const int MaxSpinIterationsWrite=4000;
      int writerThreadId;
      int tlsSlot;
      int anyThreadWaiting;
      
      class EscalatingPolicyAllow
      {
      public:
            static bool AllowImplicitEscalation()
            {
                  return true;
            }
      };
      class EscalatingPolicyDeny
      {
      public:
            static bool AllowImplicitEscalation()
            {
                  return false;
            }
      };
      __forceinline int GetTLSReaderCount()
      {
            return (int)(INT_PTR)TlsGetValue(tlsSlot);
      }
      __forceinline void SetTLSReaderCount(int count)
      {
            TlsSetValue(tlsSlot, (LPVOID)(INT_PTR)count);
      }
      __forceinline void SpinThreads()
      {
            // Unreliable but in case of failure the 
            // timeout rescues us
            if(anyThreadWaiting>0) PulseEvent(spinEvent); 
      }
      template<class EscalatingPolicy> 
           inline bool CheckUpgradingFromReaderToWriter()
      {
            if(numReaders==0) return false;
            int readerCount=GetTLSReaderCount();
            if(readerCount>0) {
               // exit read lock
               if(!EscalatingPolicy::AllowImplicitEscalation()) 
                     throw ImplicitEscalationException();
               SetTLSReaderCount(-readerCount);
               while(true) {
                  int old=numReaders;
                  if(old==InterlockedCompareExchange((LONG*)&numReaders, 
                                            numReaders-readerCount, old))
                      break;
               }
               return true;
            }
            return false;
      }
      inline void CheckRestorePreviousReaderLock()
      {
            int previous=-GetTLSReaderCount();
            if(previous>0) {
                  SetTLSReaderCount(previous);
                  numReaders=previous;
            }
      }
      inline void IncrementReaderCount()
      {
            SetTLSReaderCount(GetTLSReaderCount()+1);
      }
      __forceinline void Spin()
      {
            Sleep(0);
      }
      __forceinline void AcquireReader()
      {
         _ASSERT(numReaders>=0);
         InterlockedIncrement((LONG*)&numReaders);
         IncrementReaderCount();
         _ASSERT(numWriters==0);
                //||(numWriters>0&&writerThreadId==myThreadId));
         LeaveCriticalSection(&atomizer);
      }
      __forceinline void AcquireWriter(int myThreadId)
      {
            numWriters++;
            _ASSERT(numReaders==0);
            writerThreadId=myThreadId;
            LeaveCriticalSection(&atomizer);
      }
      class TimeoutIgnore
      {
      public:
            __forceinline void CheckExpired(int timeout) const {}
      };
      class TimeoutChecker
      {
            unsigned long ticksAtStart;
      public:
            TimeoutChecker() 
            {
                  ticksAtStart=timeGetTime();
            }
            void CheckExpired(int timeout)
            {
                  if(int(timeGetTime()-ticksAtStart)>timeout) 
                      throw TimeoutExpiredException();
            }
      };
      template<class TimeoutPolicy> 
           __forceinline void ClaimReaderInternal(int timeout)
      {
            _ASSERT(numReaders>=0);
            int myThreadId=GetCurrentThreadId();
            // Grant read access if thread already 
            // has write access
            if(myThreadId==writerThreadId) return; 
            int old=numReaders;
            if(old>0) {
              if(old==InterlockedCompareExchange((LONG*)&numReaders, 
                                                         old+1, old)) 
              {
                  _ASSERT(numReaders>=0);
                  IncrementReaderCount();
                  return;
              }
            }
            TimeoutPolicy t;
            for(int i=0; i<MaxSpinIterations; ++i)
            {
                  if(numWriters==0) {
                        EnterCriticalSection(&atomizer);
                        if(numWriters==0) {
                              AcquireReader();
                              return;
                        }
                        LeaveCriticalSection(&atomizer);
                  }
                  t.CheckExpired(timeout);
                  Spin();
            }
            while(true) {
                  EnterCriticalSection(&atomizer);
                  if(numWriters==0) break;            
                  InterlockedIncrement((LONG*)&anyThreadWaiting);
                  LeaveCriticalSection(&atomizer);
                  t.CheckExpired(timeout);
                  WaitForSingleObject(spinEvent, WaitSpinTimeout);
                  InterlockedDecrement((LONG*)&anyThreadWaiting);
            }
            AcquireReader();
      }
      template<class TimeoutPolicy, class EscalatingPolicy> 
              __forceinline void ClaimWriterInternal(int timeout)
      {
            _ASSERT(numReaders>=0);
            int myThreadId=GetCurrentThreadId();
            TimeoutPolicy t;
            for(int i=0; i<MaxSpinIterationsWrite; ++i) {
                  EnterCriticalSection(&atomizer);
                  
                  if(numWriters==1&&myThreadId==writerThreadId) {
                        // Reentering write lock
                        AcquireWriter(myThreadId);
                        return;
                  }
                  
                  CheckUpgradingFromReaderToWriter<EscalatingPolicy>();
                  if(numReaders==0&&numWriters==0) {
                        AcquireWriter(myThreadId);
                        return;
                  }
                  LeaveCriticalSection(&atomizer);
                  t.CheckExpired(timeout);
                  Spin();
            }
            while(true) {
                  EnterCriticalSection(&atomizer);
                  
                  CheckUpgradingFromReaderToWriter<EscalatingPolicy>();
                  if(numReaders==0&&numWriters==0) {
                        AcquireWriter(myThreadId);
                        return;
                  }
                  t.CheckExpired(timeout);
                  InterlockedIncrement((LONG*)&anyThreadWaiting);
                  LeaveCriticalSection(&atomizer);
                  WaitForSingleObject(spinEvent, WaitSpinTimeout);
                  InterlockedDecrement((LONG*)&anyThreadWaiting);
            }
      }
public:
      ~ReadWriteLockTLSRelease()
      {
            anyThreadWaiting=0;
            DeleteCriticalSection(&atomizer);
            CloseHandle(spinEvent);
            TlsFree(tlsSlot);
      }
      ReadWriteLockTLSRelease()
      {
            InitializeCriticalSection(&atomizer);
            numReaders=numWriters=0;
            spinEvent=CreateEvent(NULL,TRUE,FALSE,NULL);
            // The slot default value is 0 (NULL) for each 
            // thread (not clearly documented)
            tlsSlot=TlsAlloc(); 
            if(tlsSlot==TLS_OUT_OF_INDEXES) 
                throw std::exception("Out of TLS slots");
      }
      void ClaimWriterAllowEscalating()
      {
            return ClaimWriterInternal<TimeoutIgnore, 
                        EscalatingPolicyAllow>(INFINITE);
      }
      void ClaimWriterAllowEscalating(int timeout)
      {
            return ClaimWriterInternal<TimeoutChecker, 
                         EscalatingPolicyAllow>(timeout);
      }
      void ClaimWriterNoEscalating()
      {
            return ClaimWriterInternal<TimeoutIgnore, 
                         EscalatingPolicyDeny>(INFINITE);
      }
      void ClaimWriterNoEscalating(int timeout)
      {
            return ClaimWriterInternal<TimeoutChecker, 
                          EscalatingPolicyDeny>(timeout);
      }
      void ClaimReader()
      {
            return ClaimReaderInternal<TimeoutIgnore>(INFINITE);
      }
      void ClaimReader(int timeout)
      {
            return ClaimReaderInternal<TimeoutChecker>(timeout);
      }
      void ReleaseWriter()
      {
            _ASSERT(numReaders>=0);
            EnterCriticalSection(&atomizer);
            _ASSERT(numWriters>0);
            numWriters--;
            if(0==numWriters) writerThreadId=0;
            CheckRestorePreviousReaderLock();
            SpinThreads();
            LeaveCriticalSection(&atomizer);
      }
      
      void ReleaseReader()
      {
            _ASSERT(numReaders>=0);
            EnterCriticalSection(&atomizer);
            int myThreadId=GetCurrentThreadId();
            // if numWriters>0 I am also have the writer lock
            if(numWriters==0) 
            {
                  InterlockedDecrement((LONG*)&numReaders);
                  SetTLSReaderCount(GetTLSReaderCount()-1);
                  _ASSERT(numReaders>=0);
                  _ASSERT(numWriters==0);
                  SpinThreads();
            }
            LeaveCriticalSection(&atomizer);
      }
};

Points of interest

When a thread requests for a write lock after having obtained a read lock, the reader lock is released and the thread waits until there are no readers and the write lock is granted. This is needed to avoid deadlocks caused by two threads holding the read lock and simultaneously requesting for a write lock. The release and reacquire upon lock escalation can possibly lead to data corruption as the info that the thread may have gotten while holding the reader lock may not be consistent with the state of the data protected by the lock after the write lock is granted, as other writer threads might have gotten hold of the lock while the thread was upgrading his lock from read to write. To avoid this, the upgraded thread must be conscious of the fact that after having requested for the write lock it has to reacquire info about the data protected by the lock it needs for its write operation. If this behavior is not desired, ClaimWriterNoEscalating() can be used, in which case the lock throws an ImplicitEscalationException exception if it finds that the thread already holds the read lock.

Performance

I've run a simple benchmark program (included in the source example code) and the result is shown in the following graph:

The fully reentrant lock is about 3 times slower than a simple Windows critical section, which is not too bad. It's faster than .NET ReaderWriterLock. The biggest performance gain comes from avoiding the use of heavy-weight OS synchronization primitives and instead using the active wait plus Sleep(0).

History

  • 1-10-06
    • First release.
  • 1-12-06
    • Measured .NET ReaderWriterLock.
  • 1-20-06
    • Use of spin lock and thread local storage for 8x speedup, timeouts, release reader lock when escalating with option to allow or disallow implicit escalation.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

About the Author

Flavio.Antonioli
Italy Italy
Member
No Biography provided

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
Generaldowngrade from writer to readermemberalexander k9 Feb '06 - 6:28 
since upgrading can produce a deadlock (or its not atomic) why not try following. First aquire write lock, then check if we need write lock. If not downgrade to read lock and read data. Or don't downgrade and write data.
 

GeneralRe: downgrade from writer to readermemberFlavio.Antonioli14 Feb '06 - 1:24 
I believe that acquiring the write lock (even if only temporarily) when the caller wants only to read would mean that only one reader at a time would be able to acquire the lock (in order to acquire the write lock no thread must have a reader lock). That would effectively render the lock a simple critical section.
GeneralSorry to burst your bubble, but...memberJames Ivie19 Jan '06 - 2:04 
Sorry to burst your bubble, but there is a reason you can't just upgrade a reader-writer lock. If two threads call ClaimReader and then call ClaimWriter at the same time they will deadlock EVERY time (both have a reader lock and attempt to get a writer lock without releasing their reader lock--both have to wait for each other to finish before they can proceed). The .NET documentation on this is misleading (actually the name of the function itself is misleading), probably because they didn't realize this when they initially wrote it. When "upgrading", the .NET reader-write lock actually RELEASES the read lock before acquiring the writer lock--meaning that it wasn't really "upgraded" at all, it is EXACTLY the same as a call to ReleaseReaderLock followed by AcquireWriterLock, except that everyone thinks that it switched to a writer lock without releasing the reader lock first. When you call UpgradeToWriterLock in .NET, it is entirely possible that another writer acquired thew lock (and CHANGED the data you were protecting) BEFORE UpgradeToWriterLock returns. If you google for it, you'll see where the MONO guys ran into this and had to fix their implementation to match Microsoft's.

GeneralRe: Sorry to burst your bubble, but...memberStephen Muccione19 Jan '06 - 4:01 
umm...
 
looking at his code, it only allows for an upgrade should there be a single reader which would eliminate the deadlock issue...
GeneralRe: Sorry to burst your bubble, but...memberJames Ivie19 Jan '06 - 5:38 
Yes, but look at what happens when the calling thread ISN'T the only reader. It keeps going around the loop until it is, or until it acquires the lock (which it NEVER can until it is the only reader).
GeneralRe: Sorry to burst your bubble, but...memberFlavio Antonioli19 Jan '06 - 7:03 
Thanks for the comments. I'm testing an implementation that uses TLS and releases the reader lock when escalating from read to write lock. I though about TLS but never having used it before I was worried to open another can of worms.
My system uses so far just one reader/writer lock so the limitation on the number of TLS slots isn't an issue for me.
 
The execution time is approx. 4800 ms, so there's a 20% gain compared to the non-TLS version. The biggest drain on performance is probably the SetEvent()/WaitForSingleObject() calls that trap into the kernel. During the execution task manager reports that most CPU time is spent in kernel mode (approx 60% kernel and 30-35% user).
 

class ReadWriteLockTLSRelease {
CRITICAL_SECTION atomizer;
int numReaders;
int numWriters;
HANDLE spinEvent;
static const int WaitSpinTimeout=1000; // If the timeout exipres it's not a big deal, as the ClaimXX() function rechecks if it can claim lock
int writerThreadId;
int tlsSlot;
__forceinline int GetTLSReaderCount()
{
return (int)(INT_PTR)TlsGetValue(tlsSlot);
}
__forceinline void SetTLSReaderCount(int count)
{
TlsSetValue(tlsSlot, (LPVOID)(INT_PTR)count);
}
__forceinline void SpinThreads()
{
PulseEvent(spinEvent); // Unreliable but in case of failure there's always the wait timeout
}
inline bool CheckUpgradingFromReaderToWriter()
{
if(numReaders==0) return false;
int readerCount=GetTLSReaderCount();
if(readerCount>0) {
// exit read lock
SetTLSReaderCount(-readerCount);
numReaders-=readerCount;
return true;
}
return false;
}
inline void CheckRestorePreviousReaderLock()
{
int previous=-GetTLSReaderCount();
if(previous>0) {
SetTLSReaderCount(previous);
numReaders=previous;
}
}
public:
~ReadWriteLockTLSRelease()
{
DeleteCriticalSection(&atomizer);
CloseHandle(spinEvent);
TlsFree(tlsSlot);
}
ReadWriteLockTLSRelease()
{
InitializeCriticalSection(&atomizer);
numReaders=numWriters=0;
spinEvent=CreateEvent(NULL,TRUE,FALSE,NULL);
tlsSlot=TlsAlloc(); // The slot default value is 0 (NULL) for each thread (not clearly documented)
if(tlsSlot==TLS_OUT_OF_INDEXES) throw std::exception("Out of TLS slots");
}
void ClaimReader()
{
int myThreadId=GetCurrentThreadId();
while(true) {
EnterCriticalSection(&atomizer);
if(numWriters>0) {
// Grant read access if thread already has write access
if(myThreadId==writerThreadId) {
LeaveCriticalSection(&atomizer);
return;
}
} else //if(numWriters==0)
break;
LeaveCriticalSection(&atomizer);
WaitForSingleObject(spinEvent, WaitSpinTimeout);
}
_ASSERT(numReaders>=0);
numReaders++;
SetTLSReaderCount(GetTLSReaderCount()+1);
_ASSERT(numWriters==0||(numWriters>0&&writerThreadId==myThreadId));
LeaveCriticalSection(&atomizer);
}
void ReleaseWriter()
{
EnterCriticalSection(&atomizer);
_ASSERT(numWriters>0);
numWriters--;
CheckRestorePreviousReaderLock();
SpinThreads();
LeaveCriticalSection(&atomizer);
}
void ClaimWriter()
{
int myThreadId=GetCurrentThreadId();
while(true) {
EnterCriticalSection(&atomizer);

if(numWriters==1&&myThreadId==writerThreadId)
break; // Reentering write lock

CheckUpgradingFromReaderToWriter();
 
if(numReaders==0&&numWriters==0)
break;
LeaveCriticalSection(&atomizer);
WaitForSingleObject(spinEvent, WaitSpinTimeout);
}
numWriters++;
_ASSERT(numReaders==0);
writerThreadId=myThreadId;
LeaveCriticalSection(&atomizer);
return;
}
void ReleaseReader()
{
EnterCriticalSection(&atomizer);
int myThreadId=GetCurrentThreadId();
if(numWriters==0) // if numWriters>0 I am also have the writer lock
{
numReaders--;
SetTLSReaderCount(GetTLSReaderCount()-1);
_ASSERT(numReaders>=0);
_ASSERT(numWriters==0);
SpinThreads();
}
LeaveCriticalSection(&atomizer);
}
};

GeneralRe: Sorry to burst your bubble, but...memberStephen Muccione19 Jan '06 - 7:13 
ok, I still don't see what the problem is... that is infact the desired behaviour. It's a single write, multiple read lock system and as such should wait until all readers have released their locks so the write can complete. This would be the same case in the .NET situation irregardless if it is checking that it is the only reader or no readers.
 
The check for a single reader simply allows you to not have to release the writers read-lock before it claims the write lock. Irregardles of this optimzation it would still have to claim the writelock which, by definition requires all other read-locks to have been released.
 
On a side note: The automatic upgrade to a readlock does indeed cause all the difficulties with his multiset and my TLS optimization to the origional code. There really is no need to do auto-upgrades. All the code need do is release the read-lock, claim the write lock and set a flag to indicate that a read-lock should be reaquired upon release of the writelock. This is the optimal solution as it would then allow others in the write-lock queue to get their turn.
GeneralRe: Sorry to burst your bubble, but...memberJames Ivie19 Jan '06 - 7:49 
This would work, but is setting a trap for users of the class. The trap is that users are still "holding" a read lock, but the data the lock is protecting can get changed anyway. The code below is an illustration of just such a problem (though it gets FAR more complicated when you have multiple nested function calls). I've used my own reader-writer locks for 10 years now, and the only solution I've come up with is to make it an error to call AcquireWriterLock when a read lock is active on the current thread (ie. throw an exception if this occurs). See the following psuedo-code:
 

RWLock lock;
int protectedValue;
int protectedValue2;
 
int function1()
{
    ReadLock rl (lock);
    if (protectedValue != 10)
    {
        return 1;
    }
    function2();
    return protectedValue;
}
 
void function2()
{
    WriteLock w1 (lock);
    if (++protectedValue2 == 0)
    {
        ++protectedValue;
    }
}

 
The return value from function1 is not necessarily 1 or 10, but you could never know that without looking at the details of function2, which makes the ReadLock totally deceptive--it is only a "lock" as long as no other functions that might get a write lock are called. When you have a system with 15+ levels of function calls in it, that's effectively impossible to determine.
 
While this illustration isn't particularly useful code, I have run into real scenarios where this is a problem on several occasions.

GeneralRe: Sorry to burst your bubble, but...memberStephen Muccione19 Jan '06 - 9:29 
Granted.
 
You could never guarantee that any bit of code is self consistant. Not allowing upgrades would certainly solve the stated problem. However, there are ceratinly a class of programs (state driven rather for example, rather then deeply nested) which may benefit from an upgrade ability (granted, however it shouldn't be any different, and probably more paintainable if an explicit unlock/lock was done, or at a minimum use a function with "upgrade" in it's name).
GeneralRe: Sorry to burst your bubble, but...memberJames Ivie19 Jan '06 - 11:34 
I agree, except that "upgrade" is a bad name because it is misleading. That's the term the .NET function uses and the even the internal documentation writer didn't understand it well enough to point out the trap. (The documentation doesn't mention anything about unlocking first, though the sample code would deadlock if it didn't unlock first--not that anyone who didn't fully understand the problem would notice that). The MONO guys also misunderstood what the function did and implemented it incorrectly until a tester pointed out the problem. I suspect that the MS guy who designed the API at first didn't understand the issue and by the time they figured it out, the API had already been released so they couldn't change it.
GeneralRe: Sorry to burst your bubble, but...memberFlavio Antonioli20 Jan '06 - 6:57 
I've posted an update to the article that adds a ClaimWriterNoEscalation() method that throws an exception if the thread already has the read lock. ClaimWriterAllowEscalating() instead lets the "implicit" escalation happen and releases the read lock before waiting for the write lock. I'll try using the latter method in my application as it should be fine provided that the thread that gets the write lock knows that some other thread may have gotten hold of the lock while it was waiting for the lock escalation to take place.
GeneralRe: Sorry to burst your bubble, but...memberJames Ivie19 Jan '06 - 8:19 
The problem isn't the optimization per se, it's the entire concept of switching a reader lock to a writer lock without first releasing the reader lock. It is theoretically impossible to do such a thing without causing deadlock. Let me illustrate with a code walkthrough:
 
1. Let's start with threads 1 and 2 both having called ClaimReader.
2. Next, both threads call ClaimWriter.
3. Both threads get their thread id.
4. One of the threads (the winner) wins the race to get the critical section (it doesn't matter since both are exactly the same).
5. The winner thread checks to see if it is the only reader. It isn't, the loser thread is also a reader.
6. The winner thread checks to see if it is already a writer. It isn't.
7. The winner thread checks to see if there are no other active readers or writers. There are. The loser thread is an active reader.
8. The winner thread unlocks the critical section (without having changed the state in any way!).
9. The winner thread waits for the spin event.
10. At this point it is likely that the loser thread acquires the critical section, although it doesn't matter.
11. The new winner goes back to step 5.
 
There is no way out of the loop. Not only are the threads deadlocked, but they are consuming some CPU going around the loop over and over, which to many developers would make it appear as if deadlock has NOT occured.

GeneralRe: Sorry to burst your bubble, but...memberArne Vogel20 Jun '08 - 3:59 
Implementation issues aside there is no general solution that works in all cases and there never will be. The deadlock is avoidable but the irreconcilable contention is not. When I implemented a read/write lock some time ago, I introduced a third type of lock which was an upgradable (I called it "escalatable") read lock. The idea is that the following types of concurrency are supported
 
1 thread holds write lock
 
or
 
0/1 thread holds upgradable read lock, any number of threads hold plain read lock
 
This is incidentally exactly the way boost::shared_mutex works. The advantage is expensive read/modify/write operations need not block plain readers until they are actually performing the write. At the same time, it is guaranteed that an upgradable read lock can be converted to a write lock atomically (barring starvation), so there is no risk that the expensive operation suddenly fails half-way through because of contention.
 
My implementation additionally offered upgrade operations for plain read locks, but those could fail. Instead of deadlocking, however, the code would detect this condition and throw an exception in the losing (IOW late) thread. There are some corner cases where the best approach isn't obvious. For example, if a reader (thread A) tries to upgrade its lock, but there is already another thread B with a guaranteed upgradable lock, the function could fail immediately. But maybe thread B is actually going to release the lock without upgrading. In this case, it would have been better to just block A until B releases its lock. The optimum strategy depends on how probable it is for thread B to actually upgrade.
GeneralHaving to keep track of reader thread IDs is expensive but I couldn't think of a better solutionmemberStephen Muccione18 Jan '06 - 2:05 
Ok... if I understand your code, what you really need to know is if the thread requesting the write lock also has a read lock. This will be requested WITHIN the thread requesting the write lock.
 
So the answer to your problem is actually quite simple and extremly fast.
 
During object creation just allocate a Thread Local Storage slot (TlsAlloc)and during ClaimReader simply do a TlsSetValue ( m_TlsReadSlot, TlsGetValue( m_TlsReadSlot) + 1 ).
 
During ReleaseReader do a TlsSetValue ( m_TlsReadSlot, TlsGetValue ( m_TlsReadSlot) - 1 );
 
Now during the ClaimWriter method you simply need to do an
 
if ( TlsGetValue( m_TlsReadSlot ) == numReaders )
 
in place of the ThereIsASingleReaderAndItIsMe() call.
 
That should be much, much faster then using a multiset (The TLS functions end up being only a few instructions).
 
steve
GeneralRe: Having to keep track of reader thread IDs is expensive but I couldn't think of a better solutionmemberJames Ivie19 Jan '06 - 2:12 
That works if you know you will never have more than a few reader/writer locks, but you can't count on an arbitrary number of TLS slots being available--there is a limit, it isn't very large, and if you are writing this in a DLL, you have no way of knowing how many TLS slots other DLLs in the process have used until you run out.
GeneralRe: Having to keep track of reader thread IDs is expensive but I couldn't think of a better solutionmemberStephen Muccione19 Jan '06 - 3:25 
Alright sure, it's over 1K on windows NT systems but if you think you need more there's always other ways...
 
It's certainly possible to have a single TLS value and simply allocate an array per thread containing the maximum number of r/w locks you'll need.... (manage them in any number of ways)... heck, make the array a few million big if you want and just use the vm functions and don't commit the memory.
 
The speed increase in switching to using TLS data to indicate the threads read state rather then keeping track using a multiset with calls to new/malloc and it's OWN calls to the TLS functions is probably deserving of another rev, even if it's just a special case for people needing only a small number of r/w locks.
GeneralRe: Having to keep track of reader thread IDs is expensive but I couldn't think of a better solutionmemberJames Ivie19 Jan '06 - 6:33 
Yes. I believe a proper implementation could rely on getting *one* TLS slot, because that would fail when the DLL is loaded instead of at some random point later. I wouldn't advise making the arrays "a few million big" because you would very quickly run out of virtual address space. Physical memory would be fine, but there IS a limit of 2GB of virtual address space and many systems in use today are already pushing up against that boundary. If you have 50 threads using R/W locks (this is reasonable in many systems--you're using R/W locks so you must have more than 2 threads!), and used 2 million 4-byte values for each thread to track R/W lock access, that's 400MB of virtual address space which is a very significant chunk. A proper implementation would definitely have to be more complicated. If you're not worried about an implementation for general use, this type of simplified system is fine (except for the deadlock problem mentioned in my other thread).
QuestionBenchmark?memberamdopteron16 Jan '06 - 9:34 
Hello,
 
one question:
How did you create the benchmark results? Does there exist any tool for benchmarking unmanaged C++ programs?
 
Amd Opteron

AnswerRe: Benchmark?memberFlavio Antonioli17 Jan '06 - 3:43 
The results in the graph are just the execution times of the demo project (click on "Download demo project" at the top of the page). To test the different locks change the typedef at the bottom of ReaderWriterLock.h
 
Flavio.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web02 | 2.6.130516.1 | Last Updated 23 Jan 2006
Article Copyright 2006 by Flavio.Antonioli
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid