Lock Manager for .NET

Deadlock resolver for multi-threading applications.
Introduction
Multithreaded applications need to lock shared resources to prevent other threads from accessing them in an inconsistent state. Locking means allowing only one thread at a time into a critical program section. Other threads that need to access the locked section must wait until the first thread releases the lock. The .NET framework provides a wide range of locking primitives, like Monitor, Mutex, ReaderWriterLock, etc. C# even has a language construct, lock(…), which is actually a wrapper around Monitor.Enter/Monitor.Exit.
However, Microsoft doesn't provide any solution for resolving deadlocks. Generally, we can describe a deadlock as a state of a thread in which it is alive but cannot run due to an outstanding resource request that is never granted. Here we are going to discuss only one special (but the most frequent) class of deadlocks: threads blocking each other by locking the needed resources in a cyclic manner. For example: thread 1 locks object A and tries to lock object B; but object B is locked by thread 2, which is waiting to lock object A. Neither thread can go on. The cycle may be longer, like: thread 1 locks A and waits for B; thread 2 locks B and waits for C; thread 3 locks C and waits for A.
LockManager is an attempt to provide a runtime solution for resolving deadlocks. It is a thread synchronization class that behaves pretty much like System.Threading.Monitor. It has a pair of static functions, Lock/Unlock, that work like Monitor.Enter/Monitor.Exit. Moreover, they are compatible with Monitor's functions: LockManager.Lock blocks Monitor.Enter in another thread and, vice versa, Monitor.Enter blocks LockManager.Lock if they are in separate threads.
The difference is that LockManager.Lock throws DeadlockException in case of a deadlock, while Monitor.Enter simply hangs. Additionally, LockManager can write state info about a deadlock to log files (or the console). The state info includes a deadlock history and stack traces for all involved threads. For example:
---------------------------------------------------------------------
Deadlock history:
Thread 'X' locks the object 'A'
Thread 'Y' locks the object 'B'
Thread 'X' waits for the object 'B'
Thread 'Y' waits for the object 'A'
Deadlock collector interrupts thread 'X' to resolve the deadlock
---------------------------------------------------------------------
Stack trace for the thread 'X':
at System.Environment.GetStackTrace(Exception e)
at System.Environment.GetStackTrace(Exception e)
at System.Environment.get_StackTrace()
at ZEN.Threading.LockManager.Lock(Object obj) in
...\lockmanager.cs:line 694
at B.set_X(Int32 value) in ...\Step3.cs:line 182
at A.set_X(Int32 value) in ...\Step3.cs:line 85
at Step3.RunX() in ...\Step3.cs:line 254
---------------------------------------------------------------------
Stack trace for the thread 'Y':
at System.Environment.GetStackTrace(Exception e)
at System.Environment.GetStackTrace(Exception e)
at System.Environment.get_StackTrace()
at ZEN.Threading.LockManager.addEvent(EventType type,
Object obj) in ...\lockmanager.cs:line 273
at ZEN.Threading.LockManager.Lock(Object obj)
in ...\lockmanager.cs:line 712
at A.set_Y(Int32 value) in ...\Step3.cs:line 118
at B.set_Y(Int32 value) in ...\Step3.cs:line 220
at Step3.RunY() in ...\Step3.cs:line 264
LockManager is quite useful if you have an application experiencing problems with deadlocks; LockManager can help you find and analyze them. All you have to do is replace all lock(…){} or Monitor.Enter/Monitor.Exit occurrences with LockManager.Lock/LockManager.Unlock and attach your log file to LockManager.
Another usage is to design your program without caring too much about cascading locks and their order. Using LockManager, you can catch DeadlockException, restore the state of your thread after a deadlock and continue execution. This is not an easy task, and I'm going to discuss it in detail later in this article.
Implementation
Every time Lock/Unlock is called, LockManager saves state info into its internal structures. A special thread called the Deadlock Collector analyzes these structures to find deadlocks and interrupts one of the involved threads to resolve the deadlock.
The interrupted thread catches the ThreadInterruptedException (inside the Lock function) and checks whether it was interrupted by the Deadlock Collector. If yes, it throws DeadlockException; if not, it re-throws ThreadInterruptedException.
Timeouts and thread-grouping techniques are not used. Timeouts are slow and cause some hard-to-fix problems (which I would like to leave out of the scope of this discussion). Thread-grouping techniques look much more promising, and I'm thinking of implementing them in future versions (I don't want to discuss them here either).
Some thoughts
Before continuing with the usage of LockManager, I would like to make two points:
- Before adding multithreading and synchronization to your application, first ask yourself: do you really need it? If the answer is yes, ask yourself again. If you have even the slightest chance to avoid multithreading in your application, don't miss it. If not, never say I didn't warn you, and welcome to hell.
- Theoretically, I think, deadlocks should not always (and sometimes cannot) be fixed at programming time. The best example is database engines. They cannot predict the future usage of data and thus cannot prevent deadlocks under concurrent usage. In big, complicated real-life projects, nobody can be smart enough to analyze all possible deadlock situations and prevent them. Moreover, projects use external libraries, which are not guaranteed to be deadlock-free. I believe that the deadlock problem should be fixed at the CLR (or Java VM) level. The CLR has a garbage collector. Why not have a deadlock collector (or deadlock monitor, as it is called in MS SQL Server)? The deadlock problem is no simpler than memory leaks.
Since neither the CLR nor the Java VM does this, I decided to design my own class for resolving deadlocks (I actually needed it for my implementation of data access objects). However, it is helpless against deadlocks in library code, and against deadlocks other than the cyclic type.
Limitations
LockManager resolves only the simplest kind of deadlock: thread1 locks object1 and waits for object2; thread2 locks object2 and waits for object3; …; threadN locks objectN and waits for object1.
All other deadlocks, such as those caused by thread priorities, waits that are never pulsed, communication deadlocks, etc., are out of the scope of LockManager.
However, I want to warn you that this version of LockManager has a significant limitation: sometimes it may cause a deadlock itself. This situation is quite rare, and the odds that you'll ever face it are very low. Anyway, be warned. The problem in its simplest incarnation is:
Suppose we have three threads 1, 2 and 3.
- Thread 1 locks object A and waits for object B.
- Thread 2 locks object B and waits for object A.
- Thread 3 waits for object B.
- Deadlock Collector finds the deadlock between 1 and 2 and resolves it by interrupting 2. Thread 2 catches the deadlock exception, releases all locks and attempts to retry.
- Thread 3 gets control, locks B and waits for A.
- Deadlock Collector finds the deadlock between 1 and 3 and resolves it by interrupting 3.
- Thread 2 gets control, locks B and waits for A.
- And so on, infinitely…
The latest version of LockManager doesn't have this limitation (I don't want to call it a 'bug'), but I don't distribute it freely (although I haven't decided how to distribute it yet). You can contact me at szotin@shaw.ca (Sergei Zotin) if you are interested in it. Additionally, the newest version can provide statistics about locks: how much time on average your threads spend waiting on locks, how often your application locks objects, etc.
Interface
static void Lock ( object obj )
Locks the object obj. Lock is fully compatible with Monitor.Enter; it actually uses Monitor.Enter inside.

static void Unlock ( object obj )
Unlocks the object obj. Unlock is fully compatible with Monitor.Exit; it actually uses Monitor.Exit inside.

static int CollectorSleepTimeout
Normally the Deadlock Collector is sleeping. When Lock finds out that there is a risk of a deadlock, it wakes the Deadlock Collector up. If there is no risk of a deadlock for a long time, the Deadlock Collector wakes itself up using a timeout. CollectorSleepTimeout can be used to get or set this timeout (in milliseconds). It is 10000 (10 sec) by default.

static TextWriter Log
You can assign a TextWriter to LockManager to receive deadlock traces.

static int LockCount
How many locks the current thread has taken so far. Note that only LockManager locks are counted. LockManager doesn't have any info about Monitor.Enter calls, Mutex.WaitOne, lock(…), etc.
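Putting the interface together, a typical call site looks something like this (a minimal sketch based only on the members above; sharedObject is a placeholder):
// Send deadlock traces to the console (any TextWriter will do).
LockManager.Log = System.Console.Out;

// Let the Deadlock Collector wake itself up at least every 5 seconds.
LockManager.CollectorSleepTimeout = 5000;

LockManager.Lock ( sharedObject );
try
{
    // work with sharedObject here
}
finally
{
    LockManager.Unlock ( sharedObject );
}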
Using LockManager
The usage is not as simple as it may seem. The main question is: what to do with DeadlockException if it is thrown? The simplest answer is: finish the thread. That is good enough if you use LockManager only to analyze deadlocks in your application. However, if you want your application to be stable and reliable, you should think of redesigning it into a deadlock-free state, or find a way for the thread to survive a deadlock. Depending on the task, 'survive' may mean many things, like repeating the request after releasing all locks, rolling back the current transaction, run-time analysis of the state of the thread, etc.
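For the 'finish the thread' strategy, a sketch of a thread procedure is enough (RunWorker is a placeholder, not part of LockManager):
void RunWorker ( )
{
    try
    {
        // the thread body; all locking goes through LockManager.Lock/Unlock
    }
    catch ( DeadlockException )
    {
        // This thread was chosen as the deadlock victim. The trace has
        // already been written to LockManager.Log; just let the thread end.
    }
}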
Now let's walk through some steps of creating a deadlock-free application. I hope these steps will give you some ideas of how to apply LockManager to your own applications.
Step 1: Deadlocking application
Suppose we have two classes, A and B. They have references to each other, and these references are not null. Both of them have private integer members x and y, and both provide access to these members through the properties X and Y. Suppose there are integrity rules:
- When we set A.X, the referenced B.X should be set to the same value.
- When we set B.X, the referenced A.X should be left 'as is'.
- When we set B.Y, the referenced A.Y should be set to the same value.
- When we set A.Y, the referenced B.Y should be left 'as is'.
In other words:
class A
{
private B b;
private int x;
private int y;
public int X
{
get { return x; }
set { b.X = x = value; }
}
public int Y
{
get { return y; }
set { y = value; }
}
}
class B
{
private A a;
private int x;
private int y;
public int X
{
get { return x; }
set { x = value; }
}
public int Y
{
get { return y; }
set { a.Y = y = value; }
}
}
Now we want to make classes A and B thread-safe:
class A
{
private B b;
private int x;
private int y;
public int X
{
get { lock ( this ) { return x; } }
set { lock ( this ) { b.X = x = value; } }
}
public int Y
{
get { lock ( this ) { return y; } }
set { lock ( this ) { y = value; } }
}
}
class B
{
private A a;
private int x;
private int y;
public int X
{
get { lock ( this ) { return x; } }
set { lock ( this ) { x = value; } }
}
public int Y
{
get { lock ( this ) { return y; } }
set { lock ( this ) { a.Y = y = value; } }
}
}
When we access them from two different threads, we can get a deadlock. For example:
Thread 1 | Thread 2
---|---
a.X = 0; | b.Y = 0;
lock ( a ) | lock ( b )
a.x = 0; | b.y = 0;
b.X = 0; | a.Y = 0;
lock ( b ) (waits because b is locked by thread 2) | lock ( a ) (waits because a is locked by thread 1)
deadlock | deadlock
See 'steps' in the downloaded package for the complete implementation.
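Just as an illustration of how the two columns above come to life (this is a sketch, not the code from the package; the cross-references between a and b are assumed to be wired up elsewhere):
using System.Threading;

static A a;
static B b;

static void RunX ( ) { for ( int i = 0; i < 100000; i++ ) a.X = i; }
static void RunY ( ) { for ( int i = 0; i < 100000; i++ ) b.Y = i; }

static void Main ( )
{
    a = new A ( );
    b = new B ( );
    // ... wire up the cross-references between a and b here

    Thread t1 = new Thread ( new ThreadStart ( RunX ) );
    Thread t2 = new Thread ( new ThreadStart ( RunY ) );
    t1.Start ( );
    t2.Start ( );
    t1.Join ( );    // with plain lock(this) this will eventually hang in a deadlock
    t2.Join ( );
}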
Step 2: Adding LockManager
The application at Step 1 simply hangs in case of a deadlock. It doesn't give you any idea of how that could happen, or which classes and threads were involved. There is no chance to revive the application.
As a first step towards making it a little friendlier, we can simply change all occurrences of lock(…) to LockManager.Lock()/LockManager.Unlock():
class A
{
…
public int X
{
get
{
LockManager.Lock ( this );
try
{
return x;
}
finally
{
LockManager.Unlock ( this );
}
}
set
{
LockManager.Lock ( this );
try
{
b.X = x = value;
}
finally
{
LockManager.Unlock ( this );
}
}
}
… // the same for Y
}
// and the same for B
Now, instead of hanging, one of the threads throws DeadlockException and probably crashes (we don't catch the exception here), but the other thread survives and continues execution.
Additionally, you can assign a log file to LockManager (or the console, like: LockManager.Log = System.Console.Out) and see deadlock traces. Analyzing these traces may significantly help you to redesign your application into a deadlock-free one.
Step 3: Resolving deadlocks
The Step 2 application doesn't hang, and it traces deadlock info, but one of the threads still fails. Is it possible to restore this thread and continue execution? The answer is yes. All we need is to catch the exception inside set_X, release all locks and retry the operation:
class A
{
…
public int X
{
…
set
{
bool retry;
do
{
retry = false;
LockManager.Lock ( this );
try
{
b.X = x = value;
}
catch ( DeadlockException exc )
{
retry = true;
}
finally
{
LockManager.Unlock ( this );
}
} while ( retry );
}
}
…
}
// the same for B
set_X will keep trying to assign b.X until it succeeds. Both threads continue execution. Both work properly, as if there were no such thing as a deadlock. Bingo.
Now I want to discuss a question: wouldn't it have been better to redesign Step 2 so that it doesn't have deadlocks at all? Theoretically, yes, but… I see two possible ways of redesigning it, and both have issues.
1. Changing cascading locks to sequential locking.
class A
{
…
public int X
{
…
set
{
lock ( this )
{
x = value;
}
b.X = value;
}
}
…
}
The problem is that there is a moment when we have already unlocked A but haven't locked B yet. Another thread may gain control at this moment and see our objects in an inconsistent state: A.x is assigned, while B.x is not. This is a serious integrity issue, which may cause unpredictable results in some cases.
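To make the window visible, assume for a moment that the only writer currently running is a thread assigning a.X through the 'sequential' setter above; a hypothetical reader (DumpPair is not part of the sample) can then observe the half-applied update even though it takes both locks:
void DumpPair ( A a, B b )
{
    lock ( a )
    {
        lock ( b )
        {
            // Possible output with the sequential setter: "A.X=42, B.X=<old value>",
            // because the writer released lock(a) before it assigned b.X.
            Console.WriteLine ( "A.X={0}, B.X={1}", a.X, b.X );
        }
    }
}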
2. Locking in the same order.
class A
{
…
public int X
{
…
set
{
lock ( this )
{
b.X = x = value;
}
}
}
…
}
class B
{
…
public int Y
{
…
set
{
lock ( a )
{
a.Y = value;
lock ( this )
{
y = value;
}
}
}
}
…
}
This code doesn't have any real problems, but it has some disadvantages:
- B uses internal knowledge of A's implementation: it actually knows which object A uses for synchronization. Every time you change A, you have to decide whether B requires changes. The internal behavior of A is not its own business any more.
- The order of locks should be well documented and considered every time you make changes to A and B.
- It becomes extremely hard to keep under control when you have more than two lock objects, or even arrays of lock objects that are locked in a loop. The model we discuss is quite simple: A references B; B references A. What if A references B and C; B references C and D; C references D and E; D references E and A? Are you ready to design it so that all locks are always taken in the same order (one common way is sketched below)? If not, consider LockManager. It's not easy either, but at least you can keep implementations isolated from each other.
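For completeness, the usual way to keep a global lock order when objects are picked at runtime is to sort them by some stable key before locking. This is not part of LockManager; the sketch below uses RuntimeHelpers.GetHashCode purely as an identity-based ordering key (a real implementation would also need a tie-breaker for equal hash codes):
// Always lock the pair in the same global order, regardless of who calls whom.
object first = a, second = b;
if ( System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode ( first ) >
     System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode ( second ) )
{
    object tmp = first;
    first = second;
    second = tmp;
}
lock ( first )
{
    lock ( second )
    {
        // update both objects consistently here
    }
}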
Step 4: Bug fixes
Step 3 does what it is supposed to do, but there are two small bugs. We are going to fix them in this step.
1. Integrity issue.
Let's trace through A.set_X:
bool retry;
do
{
retry = false;
LockManager.Lock ( this );
try
{
b.X = x = value;
The equivalent of the last line is:
x = value;
b.X = x;
Suppose b.X = x causes DeadlockException.
}
catch ( DeadlockException exc )
{
retry = true;
We set retry to true and…
}
finally
{
LockManager.Unlock ( this );
}
Release the lock.
Now x is assigned to value, while b.X is not. At the same time, all locks are released. Another thread may get control and find our objects inconsistent.
To prevent this, we need to undo the assignment x = value before releasing the lock:
set
{
int oldX = x;
bool retry;
do
{
retry = false;
LockManager.Lock ( this );
try
{
b.X = x = value;
}
catch ( DeadlockException exc )
{
x = oldX;
retry = true;
}
finally
{
LockManager.Unlock ( this );
}
} while ( retry );
}
2. External locking issue
What if one of the threads contains the following code:
…
LockManager.Lock ( a );
try
{
a.X = whatever;
}
finally
{
LockManager.Unlock ( a );
}
Suppose a deadlock happens inside a.X = whatever. The function A.set_X catches the exception, releases all its locks and retries. However, it doesn't know about the external lock taken by the calling thread. That lock is still held, preventing other threads from waking up. The deadlock is still there.
Here is the fix:
set
{
int oldX = x;
bool retry;
do
{
retry = false;
LockManager.Lock ( this );
try
{
b.X = x = value;
}
catch ( DeadlockException exc )
{
x = oldX;
// if this is the last lock - retry;
// else - rethrow - it should be retried on
// an upper level
if ( LockManager.LockCount == 1 )
retry = true;
else
throw exc;
}
finally
{
LockManager.Unlock ( this );
}
} while ( retry );
}
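On the other side of that re-throw, the external caller that holds the outer lock is the one that retries; a sketch of what it could look like (assuming the lock on a is the outermost LockManager lock in this thread):
bool retry;
do
{
    retry = false;
    LockManager.Lock ( a );
    try
    {
        a.X = whatever;
    }
    catch ( DeadlockException )
    {
        // set_X re-threw because we were holding the outer lock on 'a';
        // the finally block releases it, then we retry the whole operation.
        retry = true;
    }
    finally
    {
        LockManager.Unlock ( a );
    }
} while ( retry );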
Step 5: Optimizations
This is pretty much it. All that remains is to make some performance notes. What can be improved in Step 4?
1. LockManager is needed only for cascading locks
If your class has some simple properties that are never involved in cascading locks, you can use common locks instead of LockManager. Common locks work much faster. However, be aware that if you make a mistake, nobody will resolve the deadlock for you.
class A
{
private B b;
…
public B TheB
{
// It is never involved in cascade locks,
// so deadlock cannot be a problem.
// We can use lock(this) instead of
// LockManager.Lock(this),
// which works much faster.
get
{
lock ( this )
{
return b;
}
}
set
{
lock ( this )
{
b = value;
}
}
}
…
}
2. Prevent undoing by using the correct order
If we assign b.X = value first and x = value next (the opposite order to what we did in set_X), there is no reason to undo the x = value operation.
set
{
bool retry;
do
{
retry = false;
LockManager.Lock ( this );
try
{
// if b.X = value throws an exception
// we don’t need to undo x = value
// because we didn’t do it yet
b.X = value;
x = value;
}
catch ( DeadlockException exc )
{
if ( LockManager.LockCount == 1 )
retry = true;
else
throw exc;
}
finally
{
LockManager.Unlock ( this );
}
} while ( retry );
}
3. Private lock object
You can make many assumptions that help with performance if you use a private lock object instead of this. We'll consider the actual assumptions in the next items; for now, let's just make the point that a private lock object is, in most cases, better than this.
class A
{
private B b;
// By using private lock object instead of this,
// we guarantee that users of A
// will not take part in our synchronization game.
// That allows us to make some assumptions
// about a state of locks.
// (A plain new object is used rather than a string literal:
// string literals are interned and could be shared with unrelated code.)
private readonly object lockObject = new object ( );
…
public B TheB
{
// It is never involved in cascade locks,
// so deadlock cannot be a problem.
// We can use lock(lockObject) instead of LockManager.Lock(lockObject),
// which works much faster.
get
{
lock ( lockObject )
{
return b;
}
}
set
{
lock ( lockObject )
{
b = value;
}
}
}
…
}
4. Excluding LockCount
Now we use a private lock object, so we can be sure that external components never lock A or B themselves. At the same time, we never use X internally, so when it is called, neither A nor B is already locked by the calling thread. This means that even if LockManager.LockCount is not 1, we can fix the deadlock right here; the external locks take no part in it.
LockManager.LockCount is a time-consuming operation (it loops through internal LockManager arrays, which are themselves synchronized). Avoiding it definitely improves performance.
set
{
bool retry;
do
{
retry = false;
LockManager.Lock ( lockObject );
try
{
b.X = value;
x = value;
}
catch ( DeadlockException )
{
// LockCount is not checked here; it is not necessary any more
retry = true;
}
finally
{
LockManager.Unlock ( lockObject );
}
} while ( retry );
}
Mixing Monitor and LockManager code
Finally, I want to discuss how to mix common locks with LockManager in the same scope. Sometimes that can be very useful for improving performance. However, you always have to remember that LockManager has no idea about locks made by other means. OK, I'm tired of writing so many words. Just read the code:
bool retry;
do
{
retry = false;
lock ( lockObject )
{
... // do something internal.
if ( need_cascading_locks )
{
// oh, we need to lock some other objects
// tell lock manager that lockObject is locked
LockManager.Lock ( lockObject );
try
{
// lock other objects (using LockManager)
// and do whatever you need to do.
…
}
catch ( DeadlockException )
{
retry = true;
}
finally
{
LockManager.Unlock ( lockObject );
}
}
}
} while ( retry );
The trick is that we don't call Lock unless it's really needed. A common lock works much faster; therefore, if it turns out that cascading locks are not required, we simply don't pay the price.
Acknowledgments
I would like to thank Andrew Fedoniouk and Boris Kropivnitsky for substantial discussions.
In addition, special thanks to Ilya Solnyshkin for giving me the idea to publish this stuff and constant support in all my undertakings.