Steganography VI - Hiding messages in .NET Assemblies

Corinna John


Nov 23, 2003

CDDL


An article about hiding instructions at the end of methods in a .NET assembly

Download source files - 11.8 Kb

Introduction

An application contains lots of lines which leave the stack empty. After these lines any code can be inserted, as long as it leaves the stack empty again. You can load some values onto the stack and pop them off again without disturbing the application's flow.

Finding silent hiding places

Let's take a look at an assembly's IL Assembler Language code. Each method contains lines which put something onto the stack, or store something off the stack. We cannot always say what exactly is on the stack when a specific line executes, so we should not insert anything between two arbitrary lines. But there are some lines at which we know what is on the stack.

Every method has to contain at least one ret instruction. When the runtime environment reaches a ret, the stack must contain the return value and nothing else. That means, at a ret instruction in a method returning an Int32, the stack contains exactly one Int32 value. We could store it in a local variable, insert some code leaving the stack empty, and then put the return value back onto the stack. Nobody would notice it at runtime. There are many more lines like that, for example the closing brackets of .try { and catch { blocks (definitely empty stack!) or method calls (only the returned value of known type on the stack!). To keep the example simple, we are going to concentrate on void methods and ignore all the others. When a void method is left, the stack has to be empty, so we don't have to care about return values.

This is the IL Assembler Language code of a typical void Dispose() method:

.method family hidebysig virtual instance void 
            Dispose(bool disposing) cil managed
    {
      // Code size       30 (0x1e)
      .maxstack  2
      IL_0000:  ldarg.1
      IL_0001:  brfalse.s  IL_0016

      IL_0003:  ldarg.0
      IL_0004:  ldfld      class [System]System.ComponentModel.Container
          PictureKey.frmMain::components
      IL_0009:  brfalse.s  IL_0016

      IL_000b:  ldarg.0
      IL_000c:  ldfld      class [System]System.ComponentModel.Container
          PictureKey.frmMain::components
      IL_0011:  callvirt   instance void [System]System.ComponentModel
          .Container::Dispose()
      IL_0016:  ldarg.0
      IL_0017:  ldarg.1
      IL_0018:  call       instance void [System.Windows.Forms]System
          .Windows.Forms.Form::Dispose(bool)
      IL_001d:  ret
    }

So what happens if we insert a new local variable and store a constant in it, just before the method returns? Nothing, apart from a slight performance cost.

.method family hidebysig virtual instance void 
            Dispose(bool disposing) cil managed
    {
      // Code size       39 (0x27)
      .maxstack  2
      .locals init (int32 V_0) //declare a new local variable 
      
      ...

      IL_001d:  ldc.i4     0x74007a //load an int32 constant
      IL_0022:  stloc      V_0 //store the constant in the local variable
      IL_0026:  ret
    }

In C# the methods would look like this:

//Original
protected override void Dispose( bool disposing ) {
    if( disposing ) {
        if (components != null) {
            components.Dispose();
        }
    }
    base.Dispose( disposing );
}

//Version with hidden variable
protected override void Dispose( bool disposing ) {
    int myvalue = 0;
    if( disposing ) {
        if (components != null) {
            components.Dispose();
        }
    }
    base.Dispose( disposing );
    myvalue = 0x74007a;
}

We have just hidden four bytes in an application! The IL file re-assembles without errors, and anybody who disassembles the new assembly can find the value 0x74007a.

How to disguise a secret value

To make life harder for people who disassemble our application and look for useless variables, we can disguise the hidden values as forgotten debug output:

ldstr bytearray(41 00) //load an "A" (bytearray takes hex bytes, 0x41 = 65)
stloc mystringvalue    //store it
.maxstack  2           //set the stack size to exclude runtime exceptions
ldstr "DEBUG - current value is: {0}"
ldloc mystringvalue    //simulate forgotten debug code
call void [mscorlib]System.Console::WriteLine(string, object)

In order to stay invisible even in console applications, we should rather disguise it as an operation. We can insert more local/instance/static variables, to make it look like the values were needed somewhere else:

.maxstack  2  //adjust stack size
ldc.i4 65     //load the "A"
ldloc myintvalue //load another local variable - declaration inserted above
add           //65 + myintvalue
stsfld int32 NameSpace.ClassName::mystaticvalue 
    //remove the result from the stack

Since this example only demonstrates how to hide values at all, it uses just the plainest version:

ldc.i4 65
stloc myvalue

There is no need to insert two lines for each byte of the message. We can combine up to four bytes into one Int32 value, so that two inserted lines carry four bytes, which is only half a line per hidden byte. But first we have to know where to insert it at all.
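The packing step can be sketched as follows. This is a minimal illustration, not the code from the download: the class name BytePacking and the byte order (first byte in the lowest 8 bits) are assumptions. With this order, the UTF-16 bytes of "zt" (7a 00 74 00) happen to give 0x74007a, the constant used in the listings above.

```csharp
using System;

// Hypothetical helper, not part of the article's source files.
public static class BytePacking
{
    // Packs up to four bytes, starting at offset, into one Int32.
    // Assumption: the first byte goes into the lowest 8 bits.
    // Bytes past the end of the array are treated as 0.
    public static int Pack(byte[] data, int offset)
    {
        int value = 0;
        for (int i = 0; i < 4 && offset + i < data.Length; i++)
        {
            value |= data[offset + i] << (8 * i);
        }
        return value;
    }

    // Splits an Int32 back into its four bytes, lowest byte first.
    public static byte[] Unpack(int value)
    {
        byte[] result = new byte[4];
        for (int i = 0; i < 4; i++)
        {
            result[i] = (byte)((value >> (8 * i)) & 0xFF);
        }
        return result;
    }
}
```

Each packed value then needs only one ldc.i4/stloc pair in the IL file, whatever byte order the hiding and extracting sides agree on.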

Analysing the Disassembly

Before editing the IL file, we have to call ILDAsm.exe to create it from the compiled assembly. Afterwards we call ILAsm.exe to re-assemble it. The interesting part is between these two steps: We must walk through the lines of IL Assembler Language code, finding the void methods, their last .locals init line, and one ret line. A message can contain more 4-byte blocks than there are void methods in the file, so we have to count the methods and calculate the number of bytes to hide in each of them. The method Analyse collects namespaces, classes and void methods:

/// <summary>Lists namespaces, classes and methods 
///  with return type "void"</summary>
/// <param name="fileName">Name of the IL file to analyse</param>
/// <param name="namespaces">Returns the names of all namespaces 
/// found in the file</param>
/// <param name="classes">Returns the names of all classes</param>
/// <param name="voidMethods">Returns the first lines of all method
/// signatures</param>
public void Analyse(String fileName,
    out ArrayList namespaces, out ArrayList classes, 
    out ArrayList voidMethods){
    
    //initialize return lists
    namespaces = new ArrayList(); classes = new ArrayList(); 
    voidMethods = new ArrayList();
    //current method's header, or null if the method doesn't return "void"
    String currentMethod = String.Empty;

    //get the IL file line-by-line
    String[] lines = ReadFile(fileName);
    
    //loop over the lines of the IL file, fill lists
    for(int indexLines=0; indexLines<lines.Length; indexLines++){
        //IndexOf returns 0 for a match at the very start of a line,
        //so a hit is any result >= 0
        if(lines[indexLines].IndexOf(".namespace ") >= 0){
            //found a namespace!
            namespaces.Add( ProcessNamespace(lines[indexLines]) );
        }
        else if(lines[indexLines].IndexOf(".class ") >= 0){
            //found a class!
            classes.Add( ProcessClass(lines, ref indexLines) );
        }
        else if(lines[indexLines].IndexOf(".method ") >= 0){
            //found a method!
            currentMethod = ProcessMethod(lines, ref indexLines);
            if(currentMethod != null){
                //method returns void - add to the list of usable methods
                voidMethods.Add(currentMethod);
            }
        }
    }
}

Given the number of usable methods, we can calculate the number of bytes per method:

//length of Unicode string + 1 position for this length 
//(it is hidden with the message)
float messageLength = txtMessage.Text.Length*2 +1;
//bytes to hide in each method, using only its first "ret" instruction
int bytesPerMethod = (int)Math.Ceiling( (messageLength / 
    (float)voidMethods.Count));
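As a worked example of this calculation, here is a small sketch that uses integer ceiling division instead of floats. The class name Capacity and the guard against an IL file without void methods are assumptions added for illustration; they are not taken from the article's source.

```csharp
using System;

// Hypothetical helper, not part of the article's source files.
public static class Capacity
{
    // Two bytes per UTF-16 character plus one position for the length,
    // mirroring the calculation in the listing above.
    public static int MessageLength(string message)
    {
        return message.Length * 2 + 1;
    }

    // Ceiling division: how many bytes each void method has to carry.
    public static int BytesPerMethod(int messageLength, int voidMethodCount)
    {
        if (voidMethodCount <= 0)
        {
            // Assumption: refuse to continue when there is nowhere to hide.
            throw new ArgumentException("No usable void methods found.");
        }
        return (messageLength + voidMethodCount - 1) / voidMethodCount;
    }
}
```

For the message "Hello" this gives 5 * 2 + 1 = 11 positions. Spread over four void methods, that is 3 bytes per method, so one ldc.i4/stloc pair per method is enough.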

Now we are ready to begin. The method HideOrExtract uses the value of bytesPerMethod to insert the lines for one or more 4-byte blocks above each ret keyword.

/// <summary>Hides or extracts a message in/from an IL file</summary>
/// <param name="fileNameIn">Name of the IL file</param>
/// <param name="fileNameOut">Name for the output file - 
/// ignored if [hide] is false</param>
/// <param name="message">Message to hide, or empty stream to
/// store extracted message</param>
/// <param name="hide">true: hide [message]; false: extract
/// a message</param>
private void HideOrExtract(String fileNameIn, String fileNameOut, 
    Stream message, bool hide){
    if(hide){
        //open the destination file
        FileStream streamOut = new FileStream(fileNameOut, FileMode.Create);
        writer = new StreamWriter(streamOut);
    }else{
        //count of bytes hidden in each method is unknown,
        //it will be the first value to extract from the file
        bytesPerMethod = 0;
    }
    
    //read the source file
    String[] lines = ReadFile(fileNameIn);
    //no, we are not finished yet
    bool isMessageComplete = false;
    
    //loop over the lines
    for(int indexLines=0; indexLines<lines.Length; indexLines++){
        
        if(lines[indexLines].IndexOf(".method ") >= 0){
            //found a method!
            if(hide){
                //hide as many bytes as needed
                isMessageComplete = ProcessMethodHide(lines, 
                   ref indexLines, message);
            }else{
                //extract all bytes hidden in this method
                isMessageComplete = ProcessMethodExtract(lines,
                    ref indexLines, message);
            }
        }else if(hide){
            //the line does not belong to a useable method - just copy it
            writer.WriteLine(lines[indexLines]);
        }
        
        if(isMessageComplete){
            break; //nothing else to do
        }
    }
    
    //close writer
    if(writer != null){ writer.Close(); }
}

Hiding the message

The method ProcessMethodHide copies the method's header and checks whether the return type is void. Then it looks for the last .locals init line. If no .locals init is found, the additional variable is inserted at the beginning of the method. The hidden variable must be the last local variable declared in the method, because compilers emitting IL Assembler Language often use slot numbers instead of names for local variables. Just imagine a disaster like this:

//a C# compiler emitted this code - it adds 5+2
//original C# code:
//int x = 5; int y = 2;
//mystaticval = x+y;
            
.locals init ([0] int32 x, [1] int32 y)
IL_0000:  ldc.i4.5
IL_0001:  stloc.0
IL_0002:  ldc.i4.2
IL_0003:  stloc.1
IL_0004:  ldloc.0
IL_0005:  ldloc.1
IL_0006:  add
IL_0007:  stsfld     int32 Demo.Form1::mystaticval
IL_000c:  ret

If we inserted an initialization at the beginning of the method, we could not re-assemble the code, because slot 0 is already in use by myvalue:

.locals init (int32 myvalue)
.locals init ([0] int32 x, [1] int32 y) //Error!
IL_0000:  ldc.i4.5
IL_0001:  stloc.0
...

So the additional local variable has to be declared after the last existing .locals init. ProcessMethodHide inserts a new local variable, jumps to the first ret line and inserts ldc.i4/stloc pairs. The first value hidden is the size of the message stream, which the extracting method needs in order to know when to stop. The last value hidden in the first method is the number of message bytes per method. It has to sit right above the ret line, because the extracting method must find it without knowing how many lines to go back (that depends on this very value).

/// <summary>Hides one or more bytes from the message stream 
/// in the IL file</summary>
/// <param name="lines">Lines of the IL file</param>
/// <param name="indexLines">Current index in [lines]</param>
/// <param name="message">Stream containing the message</param>
/// <returns>true: last byte has been hidden; false:
/// more message-bytes waiting</returns>
private bool ProcessMethodHide(String[] lines, ref int indexLines,
    Stream message){
    bool isMessageComplete = false;
    int currentMessageValue,    //next message-byte to hide
        positionInitLocals,        //index of the last ".locals init" line
        positionRet,            //index of the "ret" line
        positionStartOfMethodLine; //index of the method's first line
    
    writer.WriteLine(lines[indexLines]); //copy first line
    
    //ignore if not a "void"-method
    if(lines[indexLines].IndexOf(" void ") > 0){
        //found a method with return type "void"
        //the stack will be empty at its end,
        //so we can insert whatever we like

        indexLines++; //next line
        //search start of method block, copy all skipped lines
        int oldIndex = indexLines;
        SeekStartOfBlock(lines, ref indexLines);
        CopyBlock(lines, oldIndex, indexLines);
        
        //now we are at the method's opening bracket
        positionStartOfMethodLine = indexLines;
        //go to first line of the method
        indexLines++;
        //get position of last ".locals init" and first "ret"
        positionInitLocals = positionRet = 0;
        SeekLastLocalsInit(lines, ref indexLines, ref positionInitLocals, 
           ref positionRet);
        
        if(positionInitLocals == 0){
            //no .locals - insert line at beginning of method
            positionInitLocals = positionStartOfMethodLine;
        }

        //copy from start of method until last .locals, 
        //or nothing (if no .locals found)
        CopyBlock(lines, positionStartOfMethodLine, positionInitLocals+1);
        indexLines = positionInitLocals+1;
        //insert local variable
        writer.Write(writer.NewLine);
        writer.WriteLine(".locals init (int32 myvalue)");
        //copy rest of the method until the line before "ret"
        CopyBlock(lines, indexLines, positionRet);
        
        //next line is "ret" - nothing left to damage on the stack
        indexLines = positionRet;
        
        //insert ldc/stloc pairs for [bytesPerMethod] bytes
        //from the message stream
        //combine 4 bytes in one Int32
        for(int n=0; n<bytesPerMethod; n+=4){
            isMessageComplete = GetNextMessageValue(message, 
                out currentMessageValue);
            writer.WriteLine("ldc.i4 "+currentMessageValue.ToString());
            writer.WriteLine("stloc myvalue");
        }

        //bytesPerMethod must be last value in the first method
        if(! isBytesPerMethodWritten){
            writer.WriteLine("ldc.i4 "+bytesPerMethod.ToString());
            writer.WriteLine("stloc myvalue");
            isBytesPerMethodWritten = true;
        }

        //copy current line
        writer.WriteLine(lines[indexLines]);

        if(isMessageComplete){
            //nothing read from the message stream, the message is complete
            //copy rest of the source file
            indexLines++;
            CopyBlock(lines, indexLines, lines.Length-1);
        }
    }
    return isMessageComplete;
}
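To make the layout of the inserted lines concrete, the following sketch builds the text that ends up above a ret line. It is an illustration under assumptions, not the article's code: the class name HiddenLines and the nullable bytesPerMethod parameter are hypothetical, and the values are formatted with the invariant culture so that the IL text does not depend on regional settings.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the article's source files.
public static class HiddenLines
{
    // Name of the inserted local variable, as used in ProcessMethodHide.
    private const string VariableName = "myvalue";

    // Builds the ldc.i4/stloc pairs for one method.
    // Pass bytesPerMethod only for the first method (null otherwise);
    // it is written last so that it sits directly above "ret".
    public static List<string> Build(int[] values, int? bytesPerMethod)
    {
        List<string> lines = new List<string>();
        foreach (int value in values)
        {
            AddPair(lines, value);
        }
        if (bytesPerMethod.HasValue)
        {
            AddPair(lines, bytesPerMethod.Value);
        }
        return lines;
    }

    private static void AddPair(List<string> lines, int value)
    {
        lines.Add("ldc.i4 " + value.ToString(CultureInfo.InvariantCulture));
        lines.Add("stloc " + VariableName);
    }
}
```

Calling HiddenLines.Build(new int[] { 0x74007a }, 4) yields four lines: the pair for the packed value 7602298 followed by the pair for the bytes-per-method value 4, which is the order the extracting side relies on.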

Extracting the hidden values

The method ProcessMethodExtract looks for the first ret line. If the number of bytes hidden in each method is still unknown, it jumps two lines back and extracts the number from the ldc.i4 line, which had been inserted as the last value in the first method. Otherwise it jumps back two lines per expected ldc.i4/stloc pair, extracts the 4-byte blocks and writes them to the message stream. If an ldc.i4 is not found where it should be, the method throws an exception. The second extracted value (after the number of bytes per method) is the length of the following message. When the message stream has reached this expected length, the isMessageComplete flag is set, HideOrExtract returns, and the extracted message is displayed. Extracting works just like hiding, in the reverse direction.
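The walk back from the ret line can be sketched like this. It is a simplified illustration, not the ProcessMethodExtract from the download: the class name HiddenValueReader is hypothetical, and the sketch assumes that every hidden value appears as a long-form ldc.i4 line, either as written by the hiding step (decimal, no label) or as printed by a disassembler (with an IL_xxxx: label and a hexadecimal operand).

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the article's source files.
public static class HiddenValueReader
{
    // Reads pairCount values from the ldc.i4/stloc pairs that end
    // directly above lines[retIndex], in the order they were written.
    public static int[] ReadAboveRet(string[] lines, int retIndex, int pairCount)
    {
        int first = retIndex - 2 * pairCount;
        if (first < 0)
        {
            throw new FormatException("Not enough lines above ret.");
        }
        int[] values = new int[pairCount];
        for (int i = 0; i < pairCount; i++)
        {
            values[i] = ParseLdc(lines[first + 2 * i]);
        }
        return values;
    }

    private static int ParseLdc(string line)
    {
        string text = line.Trim();
        // Drop an optional "IL_xxxx:" label.
        if (text.StartsWith("IL_", StringComparison.Ordinal))
        {
            int colon = text.IndexOf(':');
            if (colon >= 0) { text = text.Substring(colon + 1).Trim(); }
        }
        const string opcode = "ldc.i4 ";
        if (!text.StartsWith(opcode, StringComparison.Ordinal))
        {
            throw new FormatException("Expected ldc.i4 but found: " + line);
        }
        string operand = text.Substring(opcode.Length).Trim();
        // Drop a trailing comment, if any.
        int comment = operand.IndexOf("//", StringComparison.Ordinal);
        if (comment >= 0) { operand = operand.Substring(0, comment).Trim(); }
        if (operand.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
        {
            return Convert.ToInt32(operand, 16);
        }
        return int.Parse(operand, NumberStyles.AllowLeadingSign,
            CultureInfo.InvariantCulture);
    }
}
```

In the first method, ReadAboveRet(lines, retIndex, 1) would return the bytes-per-method value. The message values of that method then sit above that pair, so a second call with retIndex - 2 and the number of expected pairs would read them.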

No key?!

Surely you have noticed that this application doesn't use a key file to distribute the message. An average assembly contains fewer void methods than an average sentence contains characters, so a distribution key as used in all preceding articles would only mean pushing loads of additional nonsense lines into a few methods, and that would be much too obvious.
A key file for this application could specify how to disguise the values: debug output, operations, instance fields, additional methods, and so on. I'll add such a feature in future versions, if somebody is interested in it.

Warning

This application works with the assemblies I've tested, but it might just as well fail with other assemblies. If you find an assembly which causes it to crash, please tell me about it and I'll see what I've done wrong.