Writing a .net debugger (part 3) – symbol and source files

Sebastian Solnica, 9 Nov 2010, CPOL

In this part I will show you how to load module debugging symbols (PDB files) into the debugger and how to bind them with source files. This can’t be achieved without diving into process, thread and module internals, so we will examine these structures as well.

After the last part (part 2), our small debugger mindbg attaches to the appdomains and receives events from the debuggee. Before we start dealing with symbols and sources, I will quickly explain what changes were made to the already implemented logic.

I created a new class that will be a parent for all debuggee events:

/// <summary>
/// A base class for all debugging events.
/// </summary>
public class CorEventArgs
{
    private readonly CorController controller;

    /// <summary>
    /// Initializes the event instance.
    /// </summary>
    /// <param name="controller">Controller of the debugging process.</param>
    public CorEventArgs(CorController controller)
    {
        this.controller = controller;
    }

    /// <summary>
    /// Gets the controller.
    /// </summary>
    /// <value>The controller.</value>
    public CorController Controller { get { return this.controller; } }

    /// <summary>
    /// Gets or sets a value indicating whether debugging process should continue.
    /// </summary>
    /// <value><c>true</c> if continue; otherwise, <c>false</c>.</value>
    public bool Continue { get; set; }
}

All events are now dispatched to the process that they belong to. As an example take a look at the Breakpoint event handler in CorDebugger:

void ICorDebugManagedCallback.Breakpoint(ICorDebugAppDomain pAppDomain, ICorDebugThread pThread, ICorDebugBreakpoint pBreakpoint)
{
    var ev = new CorBreakpointEventArgs(new CorAppDomain(pAppDomain, p_options),
                                        new CorThread(pThread),
                                        new CorFunctionBreakpoint(
                                               (ICorDebugFunctionBreakpoint)pBreakpoint));

    GetOwner(ev.Controller).DispatchEvent(ev);

    FinishEvent(ev);
}

The DispatchEvent method is implemented in CorProcess. For each type of event that we are interested in, there is an overloaded version of this method. For example:

/// <summary>
/// Handler for CorBreakpoint event.
/// </summary>
public delegate void CorBreakpointEventHandler(CorBreakpointEventArgs ev);

/// <summary>
/// Occurs when breakpoint is hit.
/// </summary>
public event CorBreakpointEventHandler OnBreakpoint;

internal void DispatchEvent(CorBreakpointEventArgs ev)
{
    // stops executing by default (further handlers may change this)
    ev.Continue = false;

    // calls external handlers (the event is null when nobody has subscribed)
    CorBreakpointEventHandler handler = OnBreakpoint;
    if (handler != null)
        handler(ev);
}
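The dispatch pattern above can also be looked at in isolation. The following is a minimal, self-contained sketch and not the mindbg code: `DemoEventArgs`, `DemoProcess` and `OnStop` are hypothetical names invented for this example. It shows a dispatcher that sets a default for the Continue flag, lets subscribers override it, and copes with the case where nobody has subscribed.

```csharp
using System;

// Hypothetical stand-in for an event-args class carrying a Continue flag.
public class DemoEventArgs
{
    public bool Continue { get; set; }
}

// Hypothetical stand-in for the class that dispatches events to subscribers.
public class DemoProcess
{
    public delegate void DemoEventHandler(DemoEventArgs ev);

    public event DemoEventHandler OnStop;

    public void DispatchEvent(DemoEventArgs ev)
    {
        // Stop by default; a handler may set Continue back to true.
        ev.Continue = false;

        // Copy the delegate first: invoking an event that has no
        // subscribers would throw a NullReferenceException.
        DemoEventHandler handler = OnStop;
        if (handler != null)
        {
            handler(ev);
        }
    }
}
```

A subscriber that wants the debuggee to keep running would simply set `ev.Continue = true` in its handler; without any subscriber the event leaves Continue at the default of false.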

We also want to stop the debugger on the Main method of the executable module, so we create a function breakpoint in the ModuleLoad event handler (more about breakpoints in the next part of the series):

internal void DispatchEvent(CorModuleLoadEventArgs ev)
{
    if (!p_options.IsAttaching)
    {
        var symreader = ev.Module.GetSymbolReader();
        if (symreader != null)
        {
            // we will set breakpoint on the user entry code
            // when debugger creates the debuggee process
            Int32 token = symreader.UserEntryPoint.GetToken();
            if (token != 0)
            {
                // FIXME should be better written (control over this breakpoint)
                CorFunction func = ev.Module.GetFunctionFromToken(token);
                CorBreakpoint breakpoint = func.CreateBreakpoint();
                breakpoint.Activate(true);
            }
        }
    }
    ev.Continue = true;
}

That’s all about events. I also made some minor changes in other parts of the application, but I don’t think they are important enough to be mentioned in this post :) . So let’s focus on the main topic.

I want to display source code for the location where the breakpoint was hit. So first let’s subscribe to the breakpoint event on the newly created process:

var debugger = DebuggingFacility.CreateDebuggerForExecutable(args[0]);
var process = debugger.CreateProcess(args[0]);

process.OnBreakpoint += new MinDbg.CorDebug.CorProcess.CorBreakpointEventHandler(process_OnBreakpoint);

The handler code is as follows:

static void process_OnBreakpoint(MinDbg.CorDebug.CorBreakpointEventArgs ev)
{
    Console.WriteLine("Breakpoint hit.");

    var source = ev.Thread.GetCurrentSourcePosition();

    // the source position is null when no symbol information is available
    if (source != null)
        DisplayCurrentSourceCode(source);
}

Two methods here are still mysterious: CorThread.GetCurrentSourcePosition and DisplayCurrentSourceCode. Let’s start with GetCurrentSourcePosition. When a thread executes application code it uses a stack to store each function’s local variables, arguments and return address, so each stack frame is associated with the function that is currently using it. The most recent frame is the active frame, and we can retrieve it using the ICorDebugThread.GetActiveFrame method:

public CorFrame GetActiveFrame()
{
    ICorDebugFrame coframe;
    p_cothread.GetActiveFrame(out coframe);
    return new CorFrame(coframe, s_options);
}

and use it to get the current source position:

public CorSourcePosition GetCurrentSourcePosition()
{
    return GetActiveFrame().GetSourcePosition();
}

Inside the active CorFrame we have access to the function associated with it:

/// <summary>
/// Gets the currently executing function.
/// </summary>
/// <returns></returns>
public CorFunction GetFunction()
{
    ICorDebugFunction cofunc;
    p_coframe.GetFunction(out cofunc);
    return cofunc == null ? null : new CorFunction(cofunc, s_options);
}

/// <summary>
/// Gets the source position.
/// </summary>
/// <returns>The source position.</returns>
public CorSourcePosition GetSourcePosition()
{
    UInt32 ip;
    CorDebugMappingResult mappingResult;

    frame.GetIP(out ip, out mappingResult);

    if (mappingResult == CorDebugMappingResult.MAPPING_NO_INFO ||
        mappingResult == CorDebugMappingResult.MAPPING_UNMAPPED_ADDRESS)
        return null;

    return GetFunction().GetSourcePositionFromIP((Int32)ip);
}

The ip variable is the instruction pointer which, according to MSDN, is the stack frame’s offset into the function’s Microsoft intermediate language (MSIL) code. In other words, ip points to the code that is currently being executed. The question now is how to bind this instruction pointer to a real source code line stored in a physical file. Here symbol files come into play. Symbol files (PDB files) may be thought of as translators from binary code to human-readable source code.

Unfortunately, the whole logic behind symbol files is quite complex and explaining it thoroughly would take a lot of space (which might actually be a good subject for a few further posts :) ). For now let’s assume that symbol files provide us with the source file path and line coordinates corresponding to our instruction pointer value. I tried to implement the symbol readers and binders on my own, but the subject overwhelmed me and I finally imported all symbol classes and interfaces from the MDBG source code. So I will just show you how to use these classes; anyone who wants more detail may analyze the content of the mindbg\Symbols folder.

Each module (CorModule instance) has its own instance of the SymReader class (created with the help of the SymbolBinder):

public ISymbolReader GetSymbolReader()
{
    if (!p_isSymbolReaderInitialized)
    {
        p_isSymbolReaderInitialized = true;
        p_symbolReader = (GetSymbolBinder() as ISymbolBinder2).GetReaderForFile(
                                GetMetadataInterface<IMetadataImport>(),
                                GetName(),
                                s_options.SymbolPath);
    }
    return p_symbolReader;
}

Going back to the CorFrame.GetSourcePosition snippet, you might have noticed that at the end it calls the GetSourcePositionFromIP method of the CorFunction instance associated with this frame. Let’s now load the source information from symbol files for this function:

// Initializes all private symbol variables
private void SetupSymbolInformation()
{
    if (p_symbolsInitialized)
        return;

    p_symbolsInitialized = true;
    CorModule module = GetModule();
    ISymbolReader symreader = module.GetSymbolReader();
    p_hasSymbols = symreader != null;
    if (p_hasSymbols)
    {
        ISymbolMethod sm = null;
        sm = symreader.GetMethod(new SymbolToken((Int32)GetToken())); // FIXME add version
        if (sm == null)
        {
            p_hasSymbols = false;
            return;
        }
        p_symMethod = sm;
        p_SPcount = p_symMethod.SequencePointCount;
        p_SPoffsets = new Int32[p_SPcount];
        p_SPdocuments = new ISymbolDocument[p_SPcount];
        p_SPstartLines = new Int32[p_SPcount];
        p_SPendLines = new Int32[p_SPcount];
        p_SPstartColumns = new Int32[p_SPcount];
        p_SPendColumns = new Int32[p_SPcount];

        p_symMethod.GetSequencePoints(p_SPoffsets, p_SPdocuments, p_SPstartLines,
                                        p_SPstartColumns, p_SPendLines, p_SPendColumns);
    }
}

You can see that our function is represented in the Symbol API as a SymMethod, which contains a collection of sequence points. Each sequence point is defined by an IL offset, a source file path, start and end line numbers, and start and end column indexes. The IL offset is the value that interests us most because it corresponds directly to the ip variable (which holds the instruction pointer value). So finally we are ready to implement the CorFunction.GetSourcePositionFromIP method:

public CorSourcePosition GetSourcePositionFromIP(Int32 ip)
{
    SetupSymbolInformation();
    if (!p_hasSymbols)
        return null;

    if (p_SPcount >
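The listing above breaks off at this point. As a rough illustration of the idea described earlier (matching the ip value against the IL offsets of the sequence points), here is a minimal sketch. It is not the mindbg implementation: `SequencePointLookup` and `FindIndex` are hypothetical names, and the sketch assumes that the offsets array is sorted in ascending order.

```csharp
using System;

// Hypothetical helper, written only to illustrate the lookup idea.
public static class SequencePointLookup
{
    // Returns the index of the sequence point that covers the given IL
    // offset, or -1 when there is none.
    // Assumption: offsets are sorted in ascending order.
    public static int FindIndex(int[] offsets, int ip)
    {
        if (offsets == null || offsets.Length == 0 || ip < offsets[0])
            return -1;

        // Pick the last sequence point that starts at or before ip.
        int index = 0;
        for (int i = 1; i < offsets.Length; i++)
        {
            if (offsets[i] > ip)
                break;
            index = i;
        }
        return index;
    }
}
```

With offsets { 0, 5, 12 }, an IL offset of 7 falls into the second sequence point (index 1). The same index can then be used to read the document, start/end line and start/end column arrays filled in by SetupSymbolInformation.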

And the second mysterious function – DisplayCurrentSourceCode – from the beginning of the post is as follows:

static void DisplayCurrentSourceCode(CorSourcePosition source)
{
    SourceFileReader sourceReader = new SourceFileReader(source.Path);

    // Print the lines from StartLine to EndLine
    Debug.Assert(source.StartLine < sourceReader.LineCount && source.EndLine < sourceReader.LineCount);
    if (source.StartLine >= sourceReader.LineCount ||
        source.EndLine >= sourceReader.LineCount)
        return;

    for (Int32 i = source.StartLine; i <= source.EndLine; i++)
    {
        String line = sourceReader[i];
        bool highlighting = false;

        // for each line highlight the code
        for (Int32 col = 0; col < line.Length; col++)
        {
            if (source.EndColumn == 0 ||
                (col >= source.StartColumn - 1 && col <= source.EndColumn))
            {
                // highlight
                if (!highlighting)
                {
                    Console.ForegroundColor = ConsoleColor.Yellow;
                    highlighting = true;
                }
                Console.Write(line[col]);
            }
            else
            {
                // normal display
                if (highlighting)
                {
                    Console.ForegroundColor = ConsoleColor.Gray;
                    highlighting = false;
                }
                Console.Write(line[col]);
            }
        }
    }
}

The SourceFileReader class is just a simple text file reader which reads the whole file at once and stores all lines in a collection of strings. What’s the final result? Have a look:
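For readers who want to see what such a reader might look like, here is a minimal sketch. It is not the mindbg SourceFileReader: the class name `SimpleSourceFileReader` is hypothetical, and the 1-based indexer is an assumption made here because line numbers stored in symbol files start at 1.

```csharp
using System;
using System.IO;

// Hypothetical sketch of a whole-file line reader; not the mindbg class.
public sealed class SimpleSourceFileReader
{
    private readonly string[] lines;

    public SimpleSourceFileReader(string path)
    {
        // Reads the entire file into memory, one array entry per line.
        lines = File.ReadAllLines(path);
    }

    public int LineCount
    {
        get { return lines.Length; }
    }

    // Assumption: line numbers are 1-based, so line 1 is lines[0].
    public string this[int lineNumber]
    {
        get
        {
            if (lineNumber < 1 || lineNumber > lines.Length)
                throw new ArgumentOutOfRangeException("lineNumber");
            return lines[lineNumber - 1];
        }
    }
}
```

Reading everything up front keeps the lookup trivial, which is reasonable for source files; a reader for very large files would probably need a different approach.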

There is a lot more to say about symbols and source files. I hope that in further posts I will show you how to download symbols from a symbol store and source files from repositories. As usual, the source code for this post may be found at mindbg.codeplex.com (revision 55200).



License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Sebastian Solnica
Software Developer (Senior)
Poland
Interested in tracing, debugging and performance tuning of the .NET applications (especially ASP.NET).
 
If you find this article interesting, maybe you would like to pay me a visit: http://lowleveldesign.wordpress.com :)

Last Updated 10 Nov 2010
Article Copyright 2010 by Sebastian Solnica