Click here to Skip to main content
Click here to Skip to main content

Debugging multithreaded code in real time!

, 3 Jul 2014 Public Domain
Rate this:
Please Sign up or sign in to vote.
None blocking, multithreading safe tracing with only microseconds delays.

Introduction

Writing a multithreaded application is much more difficult than a single threaded program. Even worse, one needs to debug in real time to find these bugs, which is nearly impossible. But fear not, this article explains how just that can be simply done !

Challenge

We all love Visual Studio, using its breakpoints and single stepping through code to find out why a program behaves differently than expected. Alas, setting a break point or single stepping will completely change the behaviour of a multithreaded application, where it matters which thread executes which instruction in which sequence, measured in microseconds or less. Stop or delay anything in the multithreaded system and it behaves completely differently.

So obviously, we cannot stop a single thread when debugging. Which means we should use tracing, looking something like this:

Console.WriteLine("Thread A requests someLock");
lock (someLock) {
  Console.WriteLine("Thread A got someLock");
  //do some work
  Console.WriteLine("Thread A releases someLock");
}
Console.WriteLine("Thread A released someLock");

Do this wherever you guess the error could happen and you will see in the output window which thread caused it. The only problem is that you cannot use Console.WriteLine(), because:

  • It is way too slow, taking about 500 microseconds on my PC. The traced program would behave rather differently than the not traced program.
  • Console.WriteLine locks the threads if several try to write at the same time, a big No No in real time multi-threading debugging.

Solution

Don't use the Console for tracing, but write your trace into memory. Once the problem occurs, check in the memory what had happened. Writing to the memory must be done with as little overhead as possible, something like this:

public const int MaxMessages = 0x100;
string[] messages = new string[MaxMessages];
int messagesIndex = -1;

public void Trace(string message) {
  int thisIndex = Interlocked.Increment(ref messagesIndex);
  messages[thisIndex] = message;
}

Basically, Trace() writes the message in a string array. Notice that Interlocked.Increment() is multithreading safe and none blocking, as opposed to many of the thread safe methods in the .NET framework which are blocking. This absolute minimum approach needs about 2 microseconds, a delay that should be acceptable. After all, there will always be some time variation, even if the exact same code is executed twice, depending what else the microprocessor is doing, if the code and data is in the processor's cache, etc.

Of course, the above code has also the problem that we get an exception when the array is full, which we can easily solve like this:

const int indexMask = MaxMessages-1;

public void Trace(string message) {
  int thisIndex = Interlocked.Increment(ref messagesIndex) & indexMask;
  messages[thisIndex] = message;
}

This forces the index back to 0, once it has reached the end of the messages array. The buffer can store 0x100 messages, in decimal notion 256. If you need more history, increase MaxMessages. But be careful, 0x1000000 % MaxMessages must be equal 0. Meaning you can use figures like 0x100, 0x200, 0x400, 0x800, 0x1000 but not 0x300.

What happens when messagesIndex reaches int.Max (0x7FFFFFFF or 2,147,483,647) ? Basically the integer will overflow to int.Min (0x8000000 or -2,147,483,648, but luckily no exception gets thrown. The bitwise "And" operator & cuts out the leading 1s and also the very negative number gets mapped into the range 0 - 0x100.

Tracing more information

It would be nice to know which thread was writing the message when. We can achieve it like this:

static string[] threadNames = new string[MaxMessages];
static long[] ticks = new long[MaxMessages];
static Stopwatch stopWatch;

static RealTimeTracer() {
  stopWatch = new Stopwatch();
  stopWatch.Start();
}

public static void Trace(string message) {
  int thisIndex = Interlocked.Increment(ref messagesIndex) & indexMask;
  ticks[thisIndex] = stopWatch.ElapsedTicks;
  messages[thisIndex] = message;
  threadNames[thisIndex] = Thread.CurrentThread.Name;
}

This will increase the time needed by Trace() to about 3 microseconds. Of course, you can add any other information you wish, but try to minimise the creation of new objects, because memory management could add some unwanted delay. 

Retrieving the trace

You need to write some code which detects in real time when the problem occurs. At this time just call RealTimeTracer.GetTrace(), which returns all messages in one string. GetTrace() will stop all tracing for a short time to return a consistent set of messages. You can also just set a breackpoint in Visual Studio when the error occurs and display RealTimeTracer.GetTrace() in the Quick Watch Window. Finally, you can use the property RealTimeTracer.IsStopped to control if tracing should be performed or not.

Another approach is just to let trace running for a while, then stop everything and examine the trace if there is an error.

Interpretation of times measured

I measured the time the Tracer needs like this:

string[] testResults = new string[maxTestCycles];
string testString = "abcdefgh";
for (int testIndex = 0; testIndex < maxTestCycles; testIndex++) {
  Stopwatch stopWatch = new Stopwatch();
  stopWatch.Start();
  for (int i = 0; i < RealTimeTracer.MaxMessages/10; i++) {
    RealTimeTracer.Trace(testString);
  }
  stopWatch.Stop();
  Console.WriteLine("Test " + testIndex + ": " + 
                    stopWatch.Elapsed.TotalMilliseconds/RealTimeTracer.MaxMessages*1000 + 
                    " microseconds.");
}

When I run the very shortest Tracer version, I got the following measurement results:

Test 0: 1.52734375 microseconds.
Test 1: 0.005859375 microseconds
Test 2: 0.00390625 microseconds.
Test 3: 0.00390625 microseconds.
Test 4: 0.00390625 microseconds.

Why does the first test take so much longer? It might be that .NET needs to prepare some stuff, but for sure the processor needs to load everything into cache before execution. In the next cycle, everything is already in the cache and executes therefore faster.

Which time should you use ? One could argue that loading the cache is normal and that will be the time in reality. However, if you try to make you code faster and want to compare different versions, I would recommend taking the fastest times ONLY. The reason is that the results are repeatable, meaning you get the same numbers if you repeat the test. 

Be careful about advice like this:

Quote:

As the general rule, when testing the extremely fast algorithms, it's recommended to implement the loop running 10,000...100,000 iterations in order to increase the accuracy of the measurement.

This is actually not true! Because the longer you let a code run, the higher is the chance that random events of the operating system occur. Imagine if during one test you suddenly have a memory garbage collection running. That time will have nothing to do with your code. Therefore, to measure execution time of code, keep the time short and eliminate all slow results. If your test runs 10 times in 100 microseconds, but takes once 10 milliseconds, it is not your code which is the problem, but something which is not under your control kept the microprocessor busy.

Test approach for multithreaded programs

Even the RealTimeTracer has only few lines of code, it must be thoroughly tested. I recommend testing first sequentially, using only 1 thread. Once the logic works fine for 1 thread, then run another test using several threads.

The solution comes with the console project TestConsole. Program.Main() can run 2 tests:

  1. testSingleThread() writes a running number into the trace using only 1 thread. Then the test verifies that each number is written properly and in the correct sequence into the trace.
  2. testMultiThread() starts writers on several threads writing running numbers. After some time, the main thread stops the writer threads and verifies in the trace that each thread wrote its numbers in sequence and that no number is missing. 

My PC has 8 processor cores, so I use 1 main thread, 6 test threads and leave 1 core for the operating system. Just for fun I run it also with 100 threads, but then the test runs much slower because of all the thread switching. 

Using the code

RealTimeTracer is a class in my TracerLib.dll. I made RealTimeTracer static so that you can start tracing right from the first line of code without the need to setup anything:

RealTimeTracer.Trace("Some Trace Message");

If you need more history, change the constant RealTimeTracer.MaxMessages.

Calling RealTimeTracer.GetTrace() returns the trace as a string. This is convenient for looking at the trace in Visual Studio. It ensures that no new information is traced while the trace is constructed.

With RealTimeTracer.IsStopped, you can control if tracing is done or not.

History

Initial version.

License

This article, along with any associated source code and files, is licensed under A Public Domain dedication

Share

About the Author

Peter Huber SG
Software Developer (Senior)
Singapore Singapore
SW Dev Manager from Switzerland living in Singapore
 
I would be very interested to do SW development in Singapore for an international .NET project.

Comments and Discussions

 
QuestionMinor problem PinmemberKP Lee29-Aug-14 13:57 
AnswerRe: Minor problem PinmemberPeter Huber SG30-Aug-14 18:14 
GeneralRe: Minor problem PinmemberKP Lee31-Aug-14 10:18 
SuggestionExcellent Article - Debugging Techniques are Essential! Pinmemberwlburgess28-Aug-14 7:24 
GeneralRe: Excellent Article - Debugging Techniques are Essential! PinmemberPeter Huber SG28-Aug-14 15:31 
GeneralRe: Excellent Article - Debugging Techniques are Essential! [modified] PinmemberKP Lee29-Aug-14 16:15 
GeneralRe: Excellent Article - Debugging Techniques are Essential! PinmemberPeter Huber SG30-Aug-14 17:59 
GeneralRe: Excellent Article - Debugging Techniques are Essential! PinmemberKP Lee31-Aug-14 11:10 
GeneralMy vote of 5 PinmemberMark_Shield28-Aug-14 5:32 
GeneralRe: My vote of 5 PinmemberPeter Huber SG28-Aug-14 14:14 
Questionarray vs list PinmemberQuintonn27-Aug-14 23:29 
AnswerRe: array vs list PinmemberPeter Huber SG28-Aug-14 4:13 
GeneralMy vote of 5 PinprofessionalCatchExAs14-Jul-14 8:55 
QuestionSystem.Diagnostics.Trace Pinmember_kb_4-Jul-14 20:05 
AnswerRe: System.Diagnostics.Trace PinmemberPeter Huber SG4-Jul-14 23:50 
GeneralRe: System.Diagnostics.Trace Pinmember_kb_6-Jul-14 5:54 
GeneralRe: System.Diagnostics.Trace PinmemberPeter Huber SG6-Jul-14 20:52 
QuestionZip file for project is empty PinmemberLee Ludden3-Jul-14 6:15 
AnswerRe: Zip file for project is empty PinmemberPeter Huber SG3-Jul-14 14:55 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web02 | 2.8.141022.2 | Last Updated 3 Jul 2014
Article Copyright 2014 by Peter Huber SG
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid