In this part I will show you how to load module debugging symbols (PDB files) into the debugger and how to bind them to source files. This can’t be achieved without diving into process, thread and module internals, so we will examine these structures as well.
Our small debugger mindbg after the last part (part 2) is attached to the appdomains and receives events from the debuggee. Before we start dealing with symbols and sources I will quickly explain what changes were made to the already implemented logic.
I created a new class that will be a parent for all debuggee events:
All events are now dispatched to the process that they belong to. As an example take a look at the Breakpoint event handler in CorDebugger:
The DispatchEvent method is implemented in CorProcess. For each type of event that we are interested in, we have an overloaded version of this method. Example:
We also want to stop the debugger on the Main method of the executable module, so we will create a function breakpoint in the ModuleLoad event handler (more about breakpoints will be in the next part of the series):
That’s all about events. I also made some minor changes in other parts of the application, but I don’t think they are important enough to be mentioned in this post. So let’s focus on the main topic.
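As a rough illustration of this overload-based dispatch, here is a minimal, self-contained sketch. It is not the mindbg code: the names DebugEventArgs, BreakpointEventArgs, ModuleLoadEventArgs and DebuggedProcess are hypothetical stand-ins, and the real classes wrap COM debugging interfaces that are left out here.

```csharp
#nullable enable
using System;

// Hypothetical parent class for all debuggee events.
public abstract class DebugEventArgs : EventArgs
{
    // Whether the debuggee should be resumed once the event has been handled.
    public bool Continue { get; set; } = true;
}

// Hypothetical event raised when a breakpoint is hit.
public sealed class BreakpointEventArgs : DebugEventArgs
{
    public int ThreadId { get; }
    public BreakpointEventArgs(int threadId) { ThreadId = threadId; }
}

// Hypothetical event raised when a module is loaded.
public sealed class ModuleLoadEventArgs : DebugEventArgs
{
    public string ModuleName { get; }
    public ModuleLoadEventArgs(string moduleName) { ModuleName = moduleName; }
}

// Hypothetical stand-in for the process wrapper that receives dispatched events.
public sealed class DebuggedProcess
{
    public event EventHandler<BreakpointEventArgs>? OnBreakpoint;
    public event EventHandler<ModuleLoadEventArgs>? OnModuleLoad;

    // One overload per event type; the compiler selects the overload
    // from the static type of the argument.
    public void DispatchEvent(BreakpointEventArgs e) => OnBreakpoint?.Invoke(this, e);

    public void DispatchEvent(ModuleLoadEventArgs e) => OnModuleLoad?.Invoke(this, e);
}
```

With this shape the debugger callback only has to find the process an event belongs to and call DispatchEvent on it; subscribers attach to the per-process events.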
I want to display source code for the location where the breakpoint was hit. So first let’s subscribe to the breakpoint event on the newly created process:
var debugger = DebuggingFacility.CreateDebuggerForExecutable(args[0]);
var process = debugger.CreateProcess(args[0]);
process.OnBreakpoint += new MinDbg.CorDebug.CorProcess.CorBreakpointEventHandler(process_OnBreakpoint);
The handler code is as follows:
Two methods here are mysterious: CorThread.GetCurrentSourcePosition and DisplayCurrentSourceCode. Let’s start with the GetCurrentSourcePosition method. When a thread executes application code it uses a stack to store each function’s local variables, arguments and return address, so each stack frame is associated with the function that is currently using it. The most recent frame is the active frame and we may retrieve it using the ICorDebugThread.GetActiveFrame method:
and use it to get the current source position:
Inside the active CorFrame we have access to the function associated with it:
The ip variable represents the instruction pointer which, according to MSDN, is the stack frame’s offset into the function’s Microsoft intermediate language (MSIL) code. In other words, ip points to the IL instruction that is currently being executed. The question now is how to bind this instruction pointer to the real source code line stored in a physical file. Here symbol files come into play. Symbol files (PDB files) may be considered translators of the binary code into human-readable source code. Unfortunately, the whole logic behind symbol files is quite complex and explaining it thoroughly would take a lot of space (which might actually be a good subject for a few further posts). For now let’s assume that symbol files will provide us with the source file path and line coordinates corresponding to our instruction pointer value. I tried to implement the symbol readers and binders on my own, but the subject overwhelmed me and I finally imported all symbol classes and interfaces from the MDBG source code. So I will just show you how to use these classes; anyone who wants more detail may analyze the content of the mindbg\Symbols folder.
Each module (CorModule instance) has its own instance of the SymReader class (created with the help of the SymbolBinder):
Moving back to the CorFrame.GetSourcePosition method code snippet, you might have noticed that at the end it called the GetSourcePositionFromIP method of the CorFunction instance associated with this frame. Let’s now load source information from symbol files for this function:
You may see that our function is represented in the Symbol API as a SymMethod, which contains a collection of sequence points. Each sequence point is defined by the IL offset, source file path, start line number, end line number, start column index and end column index. The IL offset is the value that interests us most because it is directly connected to the ip variable (which holds the instruction pointer value). So finally we are ready to implement the CorFunction.GetSourcePositionFromIP method:
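The lookup itself can be sketched independently of the Symbol API. The following is a minimal, self-contained illustration, not the mindbg implementation: the SequencePoint record and the SequencePointLookup.Find helper are hypothetical names, and the sketch assumes the sequence points are sorted by IL offset in ascending order. It picks the last sequence point whose offset does not exceed the instruction pointer and skips hidden sequence points, which compilers mark with the line number 0xFEEFEE.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the data a symbol reader returns per sequence point.
public sealed record SequencePoint(
    int IlOffset, string FilePath,
    int StartLine, int StartColumn, int EndLine, int EndColumn);

public static class SequencePointLookup
{
    // Hidden sequence points (no source to show) carry this special line number.
    public const int HiddenLine = 0xFEEFEE;

    // Returns the last visible sequence point at or before the given IL offset,
    // or null if there is none.
    // Assumption: 'points' is sorted by IlOffset in ascending order.
    public static SequencePoint? Find(IReadOnlyList<SequencePoint> points, int ip)
    {
        SequencePoint? match = null;
        foreach (var point in points)
        {
            if (point.IlOffset > ip)
                break;              // every later point starts after ip
            if (point.StartLine != HiddenLine)
                match = point;      // remember the latest visible candidate
        }
        return match;
    }
}
```

For example, given sequence points at IL offsets 0, 1 and 14, an ip of 5 resolves to the point at offset 1, because that is the last point starting at or before the instruction pointer.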
And the second mysterious function – DisplayCurrentSourceCode – from the beginning of the post is as follows:
The SourceFileReader class is just a simple text file reader which reads the whole file at once and stores all lines in a collection of strings. What’s the final result? Have a look:
There is a lot more to say about symbols and source files. I hope that in further posts I will show you how to download symbols from a symbol store and source files from repositories. As usual, the source code for this post may be found at mindbg.codeplex.com (revision 55200).
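A reader of this kind might look roughly like the sketch below. This is an assumption about its shape rather than the actual mindbg class: SimpleSourceFileReader and GetLines are hypothetical names, and line numbers are treated as 1-based and inclusive to match how sequence points report them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch of a reader that loads a whole source file into memory.
public sealed class SimpleSourceFileReader
{
    private readonly string[] lines;

    public SimpleSourceFileReader(string path)
    {
        // File.ReadAllLines reads the entire file and splits it into lines.
        lines = File.ReadAllLines(path);
    }

    public int LineCount => lines.Length;

    // Returns lines startLine..endLine (1-based, inclusive),
    // clamped to the bounds of the file.
    public IEnumerable<string> GetLines(int startLine, int endLine)
    {
        int first = Math.Max(startLine, 1);
        int last = Math.Min(endLine, lines.Length);
        for (int i = first; i <= last; i++)
            yield return lines[i - 1];
    }
}
```

Reading the file once keeps repeated lookups cheap, at the cost of holding the whole file in memory, which is acceptable for typical source files.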