Parsing the IL of a Method Body

Sorin Serban, 28 Jun 2007, CPOL
This article shows how to get a readable and programmable result from the IL array provided by the MethodBody.GetILAsByteArray() method.

Screenshot of the demo

Introduction

.NET offers, through its System.Reflection namespace, the possibility to inspect an assembly. You can get all of the types defined inside, the fields, the properties and basically all you need. Still, something is missing: the body of a method. When doing a thorough inspection, you would expect to find the variables used, as well as the loops and the branches inside a method body. Microsoft neglected this need, but they still provided us with something: the IL code. This is not enough, however, as it is just an array of bytes that means nothing to the untrained eye of a normal programmer.

What is needed is a series of objects that represent the actual instructions from that IL code. That is what I want to provide.

Background

Any programmer who has worked with reflection has heard of the awesome Reflector written by Lutz Roeder. Reflector can decompile any .NET assembly and show the user equivalent code for each programming element within the given assembly.

Notice that I said "equivalent." This is mainly because the reflection mechanism cannot provide you with the original code. The compilation process discards comments and may drop unused variables; only the valid and necessary code ends up in the compiled assembly. Thus, we cannot obtain the exact original source.

Reflector is a wonderful tool, but we might want to obtain similar results with our own code. How can we do that? Let us look first at the classic "hello world" example to see what we want to achieve and what is actually provided to us by the framework. This is the classic C# code:

public void SayHello()
{
    Console.Out.WriteLine("Hello world");
}

When we get the body of the SayHello method using reflection and ask for the IL code, we get an array of bytes such as:

0,40,52,0,0,10,114,85,1,0,112,111,53,0,0,10,0,42
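As a rough illustration of how such an array is obtained, the sketch below asks reflection for the IL of a SayHello method. The class name IlBytesSketch and its helpers are made up for this example and are not part of SDILReader; SayHello is static here only to keep the sketch short. The exact bytes you get depend on the compiler, the build configuration and the metadata tokens of your own assembly, so they will most likely differ from the listing above.

```csharp
using System;
using System.Reflection;

public static class IlBytesSketch
{
    // The method whose body we want to inspect (same body as the article's example).
    public static void SayHello()
    {
        Console.Out.WriteLine("Hello world");
    }

    // Returns the raw IL of the named public method of this class.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo mi = typeof(IlBytesSketch).GetMethod(methodName);
        MethodBody body = mi.GetMethodBody();
        return body.GetILAsByteArray();
    }

    // Formats the bytes the way the article lists them: decimal, comma-separated.
    public static string Format(byte[] il)
    {
        return string.Join(",", Array.ConvertAll(il, b => b.ToString()));
    }
}
```

Calling IlBytesSketch.Format(IlBytesSketch.GetIl("SayHello")) yields a comma-separated list in the same style as the one shown above.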

Well, that's not very readable. What we know is that this is IL code and we want to transform it so that we can process it. The easiest way is to transform it into the textual form of MSIL (Microsoft Intermediate Language). This is what the MSIL code of the SayHello method looks like and what my library is supposed to return:

0000 : nop
0001 : call System.IO.TextWriter System.Console::get_Out()
0006 : ldstr "Hello world"
0011 : callvirt instance System.Void System.IO.TextWriter::WriteLine()
0016 : nop
0017 : ret
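To connect the listing with the byte array, it may help to decode one operand by hand. The four bytes after the call opcode (52,0,0,10) form a little-endian 32-bit metadata token, 0x0A000034: the high byte 0x0A names the MemberRef table and the low three bytes give the row. The ldstr operand (85,1,0,112) is 0x70000155, where 0x70 marks a user string. The sketch below does this arithmetic; TokenSketch and its members are hypothetical names used only for illustration.

```csharp
using System;

public static class TokenSketch
{
    // The byte array from the article's SayHello example.
    public static readonly byte[] Sample =
    {
        0, 40, 52, 0, 0, 10, 114, 85, 1, 0, 112, 111, 53, 0, 0, 10, 0, 42
    };

    // Reads a 4-byte little-endian metadata token starting at the given index.
    public static int ReadToken(byte[] il, int index)
    {
        return il[index]
             | (il[index + 1] << 8)
             | (il[index + 2] << 16)
             | (il[index + 3] << 24);
    }

    // The high byte of a token identifies the metadata table (0x70 marks a user string).
    public static int Table(int token)
    {
        return (token >> 24) & 0xFF;
    }

    // The low three bytes are the row in that table (or the string heap offset for 0x70).
    public static int Row(int token)
    {
        return token & 0x00FFFFFF;
    }
}
```

For example, TokenSketch.ReadToken(TokenSketch.Sample, 2) gives 0x0A000034, which is the token that the library later resolves to System.Console::get_Out().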

Using the Code

SDILReader is a library containing only three classes. To obtain the MSIL of a method body, simply create a MethodBodyReader object and pass to its constructor the MethodInfo object of the method you want to disassemble.

MethodInfo mi = null;
// obtain somehow the method info of the method we want to disassemble;
// usually you open the assembly, get the module, get the type and then the
// method from that type
//
...
// instantiate a method body reader
SDILReader.MethodBodyReader mr = new MethodBodyReader(mi);
// get the text representation of the MSIL
string msil = mr.GetBodyCode();
// or walk the list of instructions of the MSIL
for (int i = 0; i < mr.instructions.Count; i++)
{
    // do something with mr.instructions[i]
}
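The step of getting hold of a MethodInfo can be sketched with plain reflection. This is only one possible route, not necessarily how the demo application does it: MethodLookupSketch and Find are hypothetical names, and the sketch only handles parameterless methods.

```csharp
using System;
using System.Reflection;

public static class MethodLookupSketch
{
    // Walks assembly -> type -> method and returns the parameterless overload,
    // or null when the type has no such method.
    public static MethodInfo Find(Assembly assembly, string typeName, string methodName)
    {
        // The second argument makes GetType throw if the type does not exist.
        Type type = assembly.GetType(typeName, true);
        return type.GetMethod(methodName, Type.EmptyTypes);
    }
}
```

For instance, MethodLookupSketch.Find(typeof(string).Assembly, "System.String", "Trim") returns the MethodInfo of String.Trim(). For an assembly on disk, Assembly.LoadFrom could supply the first argument.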

How Does It Work?

Well, this is the right question. In order to get started, we first need to know the structure of the IL array that is given by the .NET reflection mechanism.

IL Code Structure

The IL is in fact a sequence of operations to be executed. An operation is a pair: <operation code, operand>. The operation code is the one- or two-byte value of a System.Reflection.Emit.OpCode. The operand, when there is one, is most often the address of the metadata information for the entity the operator works with, i.e., a method, type, field or string; the .NET framework calls this address the metadata token. Other operators carry an immediate value instead, such as a constant, a local variable index or a branch offset. So, in order to interpret the array, we must do something like this:

  • Get the next byte and see what operator we are dealing with. If the byte is 0xFE, the operator is two bytes long and the following byte completes it.
  • Depending on the operator, the operand occupies the next 0, 1, 2, 4 or 8 bytes (switch adds a variable-length jump table). A metadata token always takes 4 bytes. Read the operand.
  • If the operand is a metadata token, use the MethodInfo.Module object to retrieve the object that the token refers to.
  • Store the pair <operator, operand>.
  • Repeat until the end of the IL array is reached.
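The steps above can be sketched as a small loop that only finds instruction boundaries and skips the operands instead of resolving them. This is a simplified illustration and not the SDILReader code: IlWalkSketch, Walk and OperandSize are hypothetical names, and the sketch assumes well-formed IL (an unknown opcode byte is not handled).

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class IlWalkSketch
{
    static readonly OpCode[] OneByte = new OpCode[0x100];
    static readonly OpCode[] TwoByte = new OpCode[0x100];

    // Fill both lookup tables from the public static fields of OpCodes.
    static IlWalkSketch()
    {
        foreach (FieldInfo f in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            if (f.FieldType != typeof(OpCode)) continue;
            OpCode op = (OpCode)f.GetValue(null);
            ushort v = unchecked((ushort)op.Value);
            if (v < 0x100) OneByte[v] = op; else TwoByte[v & 0xFF] = op;
        }
    }

    // Fixed number of operand bytes after the opcode (switch adds its jump table on top).
    static int OperandSize(OperandType t)
    {
        switch (t)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            default: return 4; // tokens, 32-bit constants and branches, float32, switch count
        }
    }

    // Returns "offset : name" for each instruction, without resolving operands.
    public static List<string> Walk(byte[] il)
    {
        var result = new List<string>();
        int position = 0;
        while (position < il.Length)
        {
            int offset = position;
            byte first = il[position++];
            OpCode op = first != 0xFE ? OneByte[first] : TwoByte[il[position++]];

            int size = OperandSize(op.OperandType);
            if (op.OperandType == OperandType.InlineSwitch)
            {
                // switch: a 4-byte count followed by that many 4-byte targets
                int count = il[position] | (il[position + 1] << 8)
                          | (il[position + 2] << 16) | (il[position + 3] << 24);
                size += 4 * count;
            }
            position += size;
            result.Add(offset.ToString("D4") + " : " + op.Name);
        }
        return result;
    }
}
```

Fed with the SayHello byte array from the Background section, Walk reports instructions at offsets 0000, 0001, 0006, 0011, 0016 and 0017, matching the MSIL listing there.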

ILInstruction

The ILInstruction class is used for storing the <operator, operand> pair. It also has a simple method that transforms the inner information into a readable string.
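A class of this kind could look roughly like the sketch below. It is a hypothetical stand-in, not the actual ILInstruction source: the Code, Operand and Offset members mirror the names used in the parsing loop of this article, while the class name and the formatting rules in ToString are assumptions made for illustration.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical stand-in for the article's ILInstruction class.
public class InstructionSketch
{
    public OpCode Code { get; set; }
    public object Operand { get; set; }   // resolved member, string, number, or null
    public int Offset { get; set; }

    // Renders "offset : opcode operand", quoting string operands.
    public override string ToString()
    {
        string text = Offset.ToString("D4") + " : " + Code.Name;
        if (Operand == null) return text;
        if (Operand is string) return text + " \"" + Operand + "\"";
        return text + " " + Operand;
    }
}
```

An instance with Code set to OpCodes.Ldstr, Operand set to "Hello world" and Offset set to 6 renders in the same style as the third line of the MSIL listing in the Background section.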

MethodBodyReader

The MethodBodyReader class does all the hard work. Inside the constructor, a private method, ConstructInstructions, is called to parse the IL array:

int position = 0;
instructions = new List<ILInstruction>();
while (position < il.Length)
{
    ILInstruction instruction = new ILInstruction();

    // remember where this instruction starts, before any opcode byte is consumed
    int offset = position;

    // get the operation code of the current instruction
    OpCode code = OpCodes.Nop;
    ushort value = il[position++];
    if (value != 0xfe)
    {
        code = Globals.singleByteOpCodes[(int)value];
    }
    else
    {
        // 0xfe prefixes a two-byte operation code; the next byte selects it
        value = il[position++];
        code = Globals.multiByteOpCodes[(int)value];
        value = (ushort)(value | 0xfe00);
    }
    instruction.Code = code;
    // use the saved start; position - 1 would be off by one for two-byte operation codes
    instruction.Offset = offset;
    int metadataToken = 0;
    // get the operand of the current operation
    switch (code.OperandType)
    {
        case OperandType.InlineBrTarget:
            // the branch offset is relative to the end of the operand
            metadataToken = ReadInt32(il, ref position);
            metadataToken += position;
            instruction.Operand = metadataToken;
            break;
        case OperandType.InlineField:
            // a 4-byte metadata token, resolved through the method's module
            metadataToken = ReadInt32(il, ref position);
            instruction.Operand = module.ResolveField(metadataToken);
            break;
        ....
    }
    instructions.Add(instruction);
}
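The loop relies on a ReadInt32 helper that is not shown in this article. A minimal sketch of what such a helper might look like follows, assuming the little-endian byte order that IL uses for its operands; OperandReaderSketch and ReadBranchTarget are hypothetical names, and the real helper in the library may differ.

```csharp
using System;

public static class OperandReaderSketch
{
    // Reads a little-endian 32-bit value and advances position past it.
    public static int ReadInt32(byte[] il, ref int position)
    {
        int value = il[position]
                  | (il[position + 1] << 8)
                  | (il[position + 2] << 16)
                  | (il[position + 3] << 24);
        position += 4;
        return value;
    }

    // Branch operands are relative to the first byte after the operand,
    // so the absolute target is the new position plus the value just read.
    public static int ReadBranchTarget(byte[] il, ref int position)
    {
        int relative = ReadInt32(il, ref position);
        return position + relative;
    }
}
```

Reading by hand rather than through BitConverter keeps the result independent of the byte order of the machine running the parser.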

We see here the simple loop for parsing the IL. Well, it's not quite simple. It actually has 18 cases, and I did not take into account all of the operators, only the most common ones. There are 240+ operators. The operators are loaded into two static arrays at the start of the application:

public static OpCode[] multiByteOpCodes;
public static OpCode[] singleByteOpCodes;

public static void LoadOpCodes()
{
    singleByteOpCodes = new OpCode[0x100];
    multiByteOpCodes = new OpCode[0x100];

    // every operation code is exposed as a public static field of OpCodes
    foreach (FieldInfo field in typeof(OpCodes).GetFields())
    {
        if (field.FieldType != typeof(OpCode))
        {
            continue;
        }

        OpCode code = (OpCode)field.GetValue(null);
        ushort value = (ushort)code.Value;
        if (value < 0x100)
        {
            // one-byte operation code: index by its value
            singleByteOpCodes[value] = code;
        }
        else
        {
            // two-byte operation codes always start with 0xfe
            if ((value & 0xff00) != 0xfe00)
            {
                throw new Exception("Invalid OpCode.");
            }
            // index by the second byte
            multiByteOpCodes[value & 0xff] = code;
        }
    }
}
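To see why two tables are enough, it helps to look at the Value of a couple of operation codes. OpCodes.Call has the value 0x28, so it lands in the single-byte table at index 0x28; OpCodes.Ceq has the value 0xFE01, so it lands in the multi-byte table at index 0x01. The sketch below shows the two tests involved; OpCodeValueSketch and its methods are hypothetical helpers written for this illustration.

```csharp
using System;
using System.Reflection.Emit;

public static class OpCodeValueSketch
{
    // True when the opcode is encoded as 0xFE followed by a second byte.
    public static bool IsTwoByte(OpCode op)
    {
        return (unchecked((ushort)op.Value) & 0xFF00) == 0xFE00;
    }

    // Index the opcode occupies in its 256-entry lookup table.
    public static int TableIndex(OpCode op)
    {
        return unchecked((ushort)op.Value) & 0xFF;
    }
}
```

The unchecked cast matters because OpCode.Value is a signed short, and two-byte values such as 0xFE01 are negative in that type.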

Once the object is constructed, we can use it either to walk the list of instructions or to get their string representation. That's it; have fun decompiling.

Future Work

Well, what's left is to transform the MSIL into C# code.

History

9 May, 2006

  • Original version posted

28 June, 2007

After a very long time, I managed to have a look at the issues signaled by the readers of my article. Here are the results:

  • I added support for generics.
  • Now OperandType.InlineTok is also correctly processed.
  • Various other small issues have been fixed.

Be sure to download the sources again from the links at the start of the article.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Sorin Serban
Web Developer
Romania
.Net Developer specialized in ASP.Net 2.0 Applications with AJAX technology

Article Copyright 2006 by Sorin Serban