Introduction
This article contains the third and fourth (and last) parts of the series on building a thread deadlock detector. Please see the first and second articles to understand what is going on, at A (working) implementation of API hooking (Part II).
In addition, I've added a small library called SetThreadName, which lets you set thread names and get meaningful names for synchronization objects. The library does nothing in Release builds, so you don't even need it there.
Remark: the detector works even without this DLL. You don't have to recompile anything unless you want your threads and objects named.
The thread naming trick makes thread names visible in the Visual Studio debugger (under the Debug\Thread menu, you get real names instead of 0x0000C340), so it is useful even if you are not using my software. For automatic object naming, the name is built from the following fields:
- The first number is the process clock value, so that a function that creates objects repeatedly still produces a different name each time.
- The second number is the process ID, so that names stay unique even if you run your application several times.
- The third string is the object type. (Mutex, Event, Semaphore, etc.)
- The fourth string is the file name where the object was created.
- The fifth number is the line where the object was created.
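As a rough illustration, the five fields above could be combined as in the sketch below. The helper name MakeObjectName, the underscore separator and the exact formatting are assumptions made for this example; they are not taken from the SetThreadName library.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: joins the five fields listed above into one name.
// The '_' separator and the field formatting are assumptions.
std::string MakeObjectName(unsigned long clockValue,      // process clock value
                           unsigned long processId,       // process ID
                           const std::string& objectType, // "Mutex", "Event", ...
                           const std::string& fileName,   // file creating the object
                           unsigned line)                 // line creating the object
{
    assert(!objectType.empty());
    std::ostringstream name;
    name << clockValue << '_' << processId << '_' << objectType << '_'
         << fileName << '_' << line;
    return name.str();
}
```

A caller would typically pass a clock reading, the current process ID and the __FILE__ and __LINE__ macros, so that two objects created at the same source line still get different names.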
Part III : The thread deadlock detector
The background
If you've followed my previous articles, then you should know what API hooking is, and how to use it to spy on what is going on in any application (Part I). You should also know what we are interested in for a thread deadlock detector, and what the tricks are to get all the required information (Part II).
The idea
Now that we can intercept every call to any synchronization or thread function, we face a problem: how can we know whether the target application is deadlocked? The algorithm I'm using here is quite simple. Here is an example of a deadlock. Let's take two threads (A and B) and two objects (o0 and o1).
Thread B locks o0, and then locks o1.
In parallel, Thread A locks o1, and then locks o0.
If o0 and o1 are free (ready to be locked), then there are four possible cases:
1) Thread B is executed without being interrupted, then thread A is executed.
2) Thread A is executed without being interrupted, then thread B is executed.
3) Thread B is executed, locks o0, but gets interrupted before locking o1. Thread A is then executed, locks o1 and waits for o0. Thread B is then executed, and waits for o1.
4) Thread A is executed, locks o1, but gets interrupted before locking o0. Thread B is then executed, locks o0 and waits for o1. Thread A is then executed, and waits for o0.
It is clear that cases 1 and 2 are okay. However, in case 3 or 4, the application deadlocks. To catch this, the server (the deadlock detector) monitors each object and thread using its CSyncObject and CThread classes. Each object keeps track of who owns it (a thread list). Similarly, each thread has a waiting list and a locked list of objects. Now let's see how the algorithm finds the deadlock in case 3:
Time
0 Thread B locks o0 (o0 now has Thread B in its list
and Thread B has o0 in its Locked list)
1 Thread B is interrupted
2 Thread A locks o1 (o1 now has Thread A in its list
and Thread A has o1 in its Locked list)
3 Thread A tries to lock o0
(as o0 is already locked, we look inside it to find who got it.
We find Thread B, so we check if thread B is waiting for
any object the current thread (thread A) may have. In this case,
the waiting list of thread B is empty, so we add o0 to our
waiting list)
4 Thread A is interrupted
5 Thread B tries to lock o1
(as o1 is already locked, we look inside it to find who got it.
We find Thread A, so we check if thread A is waiting for
any object the current thread (thread B) may have. Thread A is
waiting for o0 but o0 is in our locked list => deadlock)
The algorithm is quite simple but works out of the box.
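The walkthrough above can be condensed into the following sketch. It is a simplified model, not the actual CThread::CheckLock code: the SyncObject and Thread structs, the LockAndCheck name and the use of std::set are assumptions for illustration, and only the two-thread cycle described above is checked.

```cpp
#include <cassert>
#include <set>

struct Thread;

// Simplified stand-in for CSyncObject: remembers which threads own it.
struct SyncObject {
    std::set<Thread*> owners;
};

// Simplified stand-in for CThread: a locked list and a waiting list.
struct Thread {
    std::set<SyncObject*> locked;
    std::set<SyncObject*> waiting;

    // Records an attempt to lock obj. Returns true if the attempt closes a
    // two-thread cycle, i.e. an owner of obj waits for an object we hold.
    bool LockAndCheck(SyncObject& obj) {
        if (obj.owners.empty() || obj.owners.count(this)) {
            obj.owners.insert(this);   // free (or already ours): take it
            locked.insert(&obj);
            return false;
        }
        for (Thread* owner : obj.owners)
            for (SyncObject* wanted : owner->waiting)
                if (locked.count(wanted))
                    return true;       // owner waits for us, we wait for owner
        waiting.insert(&obj);          // no cycle found: just wait
        return false;
    }
};

// Replays case 3 from the timeline above.
bool ReplayCase3() {
    SyncObject o0, o1;
    Thread a, b;
    bool t0 = b.LockAndCheck(o0);      // time 0: B locks o0
    bool t2 = a.LockAndCheck(o1);      // time 2: A locks o1
    bool t3 = a.LockAndCheck(o0);      // time 3: A now waits for o0
    assert(a.waiting.count(&o0) == 1);
    bool t5 = b.LockAndCheck(o1);      // time 5: B wants o1 => deadlock
    return !t0 && !t2 && !t3 && t5;
}
```

Unlocking is left out to keep the sketch short; a fuller model would also remove entries from the lists when an object is released or a wait ends.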
The implementation
We need a server called ThreadDLD (what a wonderful name, isn't it?). Its purpose is to:
- Launch the debuggee (can be any application with or without source code).
- Inject the spying DLL in it, and make it infect the debuggee.
- Receive the address of the thread monitoring function.
- Receive every sniffed API call from the client.
- Parse the sniffed calls, and display them as a log.
- Analyze the sniffed calls and spot errors (deadlocks).
The debuggee is launched in CMainFrame::OnFileOpen in suspended state. The spying DLL ThreadSpy.DLL is then injected using the usual CreateRemoteThread trick. Then the debuggee is resumed, and the server waits for any message from it. The debuggee then sends the StartMeUp command with the thread monitoring function address, and sends any hooked command to the server (with stack trace and timestamp). The server waits for any CommunicationObject from the debuggee in its CThreadDLDView::ReceivedMessage. The server then parses the message and logs it accordingly. There are four logging modes, from the simple mode to the Analysis mode. Each of them reports the same information but from a different point of view.
- Log mode: In this mode the received messages are shown as they arrive, without any grouping or analysis. However, this is the fastest reporting mode, and should be used while reproducing the deadlock in the debuggee.
- Thread life: In this mode the received messages are shown from the thread point of view. No deadlock detection is done in this mode. However, this mode is perfect to check each thread's behavior.
- Object life: In this mode the received messages are shown from the object point of view. No deadlock detection is done in this mode. However, this mode is perfect to check what happens to any object you are monitoring.
- Analysis: In this mode the received messages are analyzed and the deadlock detection is performed on the fly. This is the slowest mode. However, this is the only mode that will point out thread deadlocks and errors. The algorithm described above is implemented in the CThread::CheckLock method. The objects are declared in SyncObject.h.
Part IV : The bonus track
Each CommunicationObject sent between the debuggee and the server contains the stack trace in the debuggee. This stack trace is useful to spot where the deadlock occurred. The problem with raw stack traces is their lack of meaning (when an error occurs at 0x00401345, I'm almost sure it doesn't tell you much). The idea is to use a map file (if available) to map the addresses from the stack trace to real functions. I've included a MapFileParser to resolve the addresses into undecorated function names. It will not give you the line number, but it is better than nothing. (Map files can be generated in Release builds too without any risk, as they are separate files.) The map file parser finds the function that starts just before the given address. This will not work for DLLs, as it is not possible to know in advance where a DLL will be mapped (unless you specify its base address yourself; search Google for "Matt Pietrek").
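The "function just before the given address" lookup can be sketched with an ordered map. This is only an illustration of the idea, not the MapFileParser code: the SymbolTable type, the FindFunction name and the sample addresses and function names are all made up.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Symbols keyed by start address, as they could be read from a .map file.
using SymbolTable = std::map<std::uint64_t, std::string>;

// Returns the symbol with the greatest start address that does not exceed
// `address`, or an empty string if the address lies before every symbol.
std::string FindFunction(const SymbolTable& symbols, std::uint64_t address)
{
    auto it = symbols.upper_bound(address);  // first symbol starting after address
    if (it == symbols.begin())
        return std::string();
    --it;                                    // step back to the enclosing symbol
    return it->second;
}

// Small self-check with made-up addresses and names.
void DemoLookup()
{
    SymbolTable symbols = {{0x00401000, "main"}, {0x00401300, "Worker::Run"}};
    assert(FindFunction(symbols, 0x00401345) == "Worker::Run");
    assert(FindFunction(symbols, 0x00400000).empty());
}
```

Note that this lookup cannot tell whether the address is really inside the function it returns; an address past the last symbol is simply attributed to that last symbol.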
To conclude
I looked around for such a tool for about a month, and because none was available, here is mine. It is obviously not "professional" software. For example, it cannot detect potential deadlocks the way tools that analyze code statically do; it only detects real deadlocks. I'm sure I can add this functionality, because I have all the needed data. This project is made with ATL and WTL, so I encourage you to learn those libraries. I've implemented an owner-drawn CListViewCtrl as I didn't find any good one around. The implementation is in CThreadDLDView. It is not possible to save the log or to read it back. I've kept the print icon, but there is no print code. If you want to upgrade or add functionality, please post a line or two below.
What I would like to see is:
- function::line number in the map field (it is possible to generate a map file with line numbers too, but then the MapFileParser will be more complex).
- Reply to the debuggee to prevent it from deadlocking (as we know the situation before it really happens in the debuggee process). This could be done without an extra thread or delay by using a shared memory area.
Update
- April 1st, 2005
- Added mapping of imported DLLs too (so now, you should be able to locate a deadlock even in a DLL).
- You can use the software to see what modules are loaded in a target process.
- Analyze mode can now detect possible deadlocks (yes, even those that deadlock only at the client's site).
- Can pass parameters to the program being analyzed.
- Can save the analysis to a file (yes, and can be imported in Excel too).
- Feb - 2005
Known bugs and/or issues
- The screen flickers a lot while collecting data. This is not a real issue: minimize the server if you don't want to see the live update of the collected data.
- You still cannot see the line where the deadlock occurs. I haven't finished the MAP file line reading code, but you can still browse the map file by hand.
- Why isn't such a "wonderful" feature as ... already there? Okay, feel free to implement it. Please contribute by posting your modifications here.
Hi
Most of the Flicker can be eliminated by replacing the following lines in ThreadDLDView.cpp ...
Replace
if (bRedraw) RedrawWindow();
with
if (bRedraw)
{
    EnsureVisible(xLimit.High - 1, FALSE);
    //RedrawWindow();
}
or
if (bRedraw)
{
    EnsureVisible(xLimit.High - 2, FALSE);
    //RedrawWindow();
}
whichever is more appropriate. Don't replace every occurrence. Used in the proper places, this creates a pleasant scrolling effect.
C ya Bruce
Thank you for your comment.
I've updated the detector since the posted version, but because of the 15-day limit for editing the article, I was too lazy to send it to the CodeProject administrators for an update.
Anyway, I'll integrate the change you've made. Thanks.
Hi
First off; thanks for a most excellent piece of work.
Regrettably, the solution cannot be used as it is with .NET applications. The problem is that .NET apps die (somewhere in the CLR) when a DllMain handling DLL_PROCESS_ATTACH calls a blocking synchronization function. Internally, InitializeCriticalSection calls EnterCriticalSection, not to mention all the other work being done.
I solved this by moving everything from case DLL_PROCESS_ATTACH: other than
hCurrentHandle = hModule;
to
extern "C" __declspec(dllexport) BOOL PostLoadInit() { …
and then calling this directly from my .NET app:
[DllImport("ThreadSpy.dll", EntryPoint="PostLoadInit")]
public static extern int PostLoadInit();
…
static void Main(string[] args) { try { Program.PostLoadInit(); …
This works.
It's a bummer that I have to change and recompile my app to get this to work. (This is also probably against your initial intentions.)
I tried using ::CreateRemoteThread to call PostLoadInit after loading, but that fails because I can't find the proper address of PostLoadInit in the debuggee. (The address is not fixed like those of the Kernel32 functions.)
Do you have a suggestion on how to refactor this code so that the explicit call to PostLoadInit from the debuggee is no longer required?
Hi, where can I get the ThreadSpy.dll file?
I am trying to write .NET code that detects when any external program calls the WriteProcessMemory API with the PID of my game process as a parameter.
I get the PID of my game process when I launch the game from my code.
Do you think this is possible using ThreadSpy.dll?
Thank you.
This one is trickier than the previous bugs...
But I think I have a solution for this problem (untested, as I'm fleeing .NET like the plague).
Solution 1 : Simple
Follow your idea (create the PostLoadInit function). Then build ThreadSpy.DLL with a load address other than 0x10000000, set in the Project Settings/Link/Output/Base address tab (something like 0x98000000). You'll have to enable map file creation.
Then look inside the map file to find where PostLoadInit is (warning: it's not the first column address but the RVA+Base column that matters).
Write down this address in ThreadDLD and use the ::CreateRemoteThread trick with this "hand-written" address.
So much for the good news.
The bad news is that if another DLL is already mapped at the base address you've specified, ThreadSpy.DLL will be relocated and ::CreateRemoteThread will make your debuggee crash.
Solution 2 : Harder
Follow your idea (create the PostLoadInit function). Then create a static byte array like this:
static unsigned char MagicCodeToFindPostloadInitFunction[16] = { 0x13, 0x15, 0x01, 0x9A, 0x5F, 0xDE, 0x6C, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
Then, in your DllMain, store the function's address (not the function's code) right after the magic bytes:
void* pInit = (void*)PostLoadInit;
memcpy(&MagicCodeToFindPostloadInitFunction[8], &pInit, sizeof(pInit));
Then, in ThreadDLD, use ReadProcessMemory to find the magic number (which is 0x9A011513, 0x036CDE5F). Once it is found, the next 4 or 8 bytes are the address of the PostLoadInit function in your debuggee. Call CreateRemoteThread with that address and it should run okay.
The good news is that, given that the magic number is improbable, you'll adapt to any mapping done by the dynamic linker. The bad news is that you'll have to read the whole process memory in order to find the magic number (with all the problems that creates, i.e. exceptions and so on).
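The scanning step of Solution 2 could look like the sketch below. It works on a memory snapshot that has already been copied into a local buffer (for instance, region by region with ReadProcessMemory) and assumes the scanner and the debuggee use the same pointer size. The names kMagic, FindAddressAfterMagic and DemoScan, and the sample address, are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// First 8 bytes of the marker array described above.
static const unsigned char kMagic[8] = {0x13, 0x15, 0x01, 0x9A,
                                        0x5F, 0xDE, 0x6C, 0x03};

// Scans a snapshot for kMagic. On success, copies the pointer-sized value
// stored right after the marker into `address` and returns true.
bool FindAddressAfterMagic(const std::vector<unsigned char>& memory,
                           std::uintptr_t& address)
{
    const std::size_t needed = sizeof(kMagic) + sizeof(address);
    for (std::size_t i = 0; i + needed <= memory.size(); ++i) {
        if (std::memcmp(&memory[i], kMagic, sizeof(kMagic)) == 0) {
            std::memcpy(&address, &memory[i + sizeof(kMagic)], sizeof(address));
            return true;
        }
    }
    return false;
}

// Builds a fake snapshot (padding, marker, pointer) and scans it.
bool DemoScan()
{
    const std::uintptr_t expected = 0x10001234;  // made-up address
    std::vector<unsigned char> memory(32, 0xCC);
    memory.insert(memory.end(), kMagic, kMagic + sizeof(kMagic));
    unsigned char raw[sizeof(expected)];
    std::memcpy(raw, &expected, sizeof(expected));
    memory.insert(memory.end(), raw, raw + sizeof(raw));

    std::uintptr_t found = 0;
    const bool ok = FindAddressAfterMagic(memory, found);
    assert(!ok || found == expected);
    return ok && found == expected;
}
```

A marker that straddles two separately read regions would be missed by this sketch, so consecutive reads would need to overlap by at least the marker plus pointer size.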
Solution 3 : Inject in the .NET virtual machine
If you inject into the .NET virtual machine with your .NET executable as its argument, you should be able to avoid any modification. However, you'll get a lot of garbage from the .NET core, which will make debugging very, very hard.
Solution 4 : Combination
Combine the first two solutions and you'll only have to scan the process memory when the DLL is relocated (which will be very, very rare).
Anyway, send me an email at __my__login__here__ at @gmail, and I'll send you the latest version (with a lot of improvements, like .MAP parsing for DLLs too and function:source file:line display for mapped components).
Hi
Thanks for your response. The new ideas you presented are interesting, but I took the opposite approach. First off, I got the hooks working as in your solution 3 (inject in the .NET virtual machine). The following two points led me in a totally different direction.
First, finding deadlocks in .NET code by injecting the VM was not just difficult, it is not possible. This is mainly due to the fact that there is no direct 1 to 1 correlation between some .NET locking objects and the underlying NT locking objects. Apparently the VM uses NT locking for static factories and the rest of the locking code is custom.
Secondly, to detect a deadlock with 100% Certainty, you need to know the blocker and the blockee at every locking call. You are obviously very familiar with this problem because you solved it by using a ‘Probably Locked’ indicator. An easy and common solution to this problem is to use a locker for every lock and a signaller/signallee for every signal. The .NET classes however, do not have such objects. (A good example is the MFC CSingleLock CMultipleLock vs the CCriticalSection or CMutex.)
Instead of avoiding .NET I dove in with both feet. I implemented Locker/Lockee and signaller/signallee classes to wrap the .NET locking objects. These classes form the vertexes in a Graph which I then traverse for cycles when locking. When I detect a Deadlock then I throw a Deadlock exception.
The advantage is 100% certainty. The disadvantage is that source code changes are required.
My point is .NET regrettably requires a different approach (Perhaps by modifying the bytecode). Don’t bother with .NET. Your solution is perfect for Native NT projects. I’m looking forward to the new version.
If it doesn't bother you too much (and if it is possible), can you post your solution in an article here (or even in a comment), so I can set up a link in the next article update?
I didn't think about the .NET case very much (what is inside the VM code is very obscure to me, in fact).
I think it might be possible to prevent deadlocks in .NET code using only the reflection API and static covering. Dynamic .NET deadlock detection will (if possible) be very, very "delicate" to implement without modifying the source.
If only the Mono VM could be modified easily to perform such detection on the fly...
Hi
I would have to do a full rewrite at home because I developed the code for a customer. Legally, the code does not ‘belong’ to me. The solution also does not cover every possible deadlock and it’s not trivial to use. We use it mainly for Machine generated code. The main problem is that Microsoft declared all the required classes as sealed. Sealed is a C# class modifier that prevents you from deriving from the sealed class. The perfect, complete solution would require being able to derive from the existing classes. Since I can not derive from the classes I would have to rewrite them… That is too much work.
I think, however, that a generic - dynamic deadlock detection using reflection is possible without many source code changes. A separate deadlock detection thread could analyse the stack of any thread which is blocked longer than X milliseconds. Such a project would be very interesting, when I get the time.
I received an error trying to mail your Private address. Failed to deliver to 'XXXX' failed to route the address
My address is [FirstName]@[LastName]Sol.de (Sol for Solutions)
Hi
First off; thanks for a most excellent piece of work.
I believe the following is a copy-paste error. In MyWaitForMultipleObjectsEx (ThreadSpy, Hooked.cpp, line 883), the call should be:
DWORD dwRet = CallFunction(WaitForMultipleObjectsEx)(nCount, lpHandles, bWaitAll, dwMilliseconds, bAlertable);
And not
DWORD dwRet = CallFunction(WaitForMultipleObjects)(nCount, lpHandles, bWaitAll, dwMilliseconds);
Is this correct?
C ya Bruce Ricker
Yes, exactly. It could have created issues with software using the alertable state (such as servers).
Thank you for spotting the bug, I've made the modification (but I haven't posted it).
Here is the output:
Compiling...
Hooked.cpp
G:\x\xx\Hook\ThreadSpy\Hooked.cpp(227) : error C2065: 'WC_NO_BEST_FIT_CHARS' : undeclared identifier
ThreadSpy.cpp
G:\x\xx\Hook\ThreadSpy\ThreadSpy.cpp(711) : error C2146: syntax error : missing ';' before identifier 'frame'
G:\x\xx\Hook\ThreadSpy\ThreadSpy.cpp(711) : error C2065: 'frame' : undeclared identifier
G:\x\xx\Hook\ThreadSpy\ThreadSpy.cpp(716) : error C2228: left of '.AddrPC' must have class/struct/union type
First error: WC_NO_BEST_FIT_CHARS is declared close to WideCharToMultiByte, so maybe you just need to raise the WINVER macro to 0x500 or 0x510 in stdafx.h to make the Unicode declarations available. Otherwise, you can define it yourself:
#define WC_NO_BEST_FIT_CHARS 0x00000400
Second error: I think you should include the latest dbghelp.h and Winnls.h (available from Microsoft in the Platform SDK files). This is where the STACKFRAME64 structure is declared. Alternatively, rename the new GetStackTrace function (ThreadSpy.cpp:705) to something else, rename the previous GetStackTraceOld (ThreadSpy.cpp:797) to GetStackTrace, and of course remove the #if 0 block. This is needed because the stack frame API changed to handle 64-bit processors, and I'm not sure the new API is available on Win2K.
Nice app!
I'm wondering is it possible for you to add support for argument passing to the process that is forked?
Regards Niklas
Sure,
change mainframe.cpp:147 from
if (!::CreateProcess(sName, NULL, NULL, NULL, TRUE, CREATE_SUSPENDED, 0, 0, &startup, &m_view.GetPI())) return false;
to
if (!::CreateProcess(sName, YOUR_COMMAND_LINE_ARGUMENT_HERE, NULL, NULL, TRUE, CREATE_SUSPENDED, 0, 0, &startup, &m_view.GetPI())) return false;
:->
I've now included an edit field for the arguments, so it is easier to set the parameters in the file open dialog.