
Tamper Aware and Self Healing Code

15 Nov 2007, CPOL
Dynamically Detect Code Alterations and Repair In-Memory Executable Files Using Hashing and Crypto++

Introduction

This article is the complement to Dynamic TEXT Section Image Verification. It will demonstrate detecting hardware faults or unauthorized patches; back patching the executable to embed the expected hash value of the .text section; and repairing the effects of hostile code (for example, an unauthorized binary patcher).

The ideas presented in this article work equally well whether the executable was patched on disk or in-memory. However, the self repair occurs in memory. In the context of Reverse Engineering and Patching, Kris Kaspersky (no relation to Kaspersky Labs namesake Eugene Kaspersky, a former KGB Cryptologist) refers to this as Online Patching in his book Hacker Disassembling Uncovered.

The samples will use a flat GZip'd file for storing a copy of the unaltered .text section. As Garth Lancaster points out, the reader should explore using executable resources to embed the hash or archived .text section. An example can be found in Adrian Cooper's Adding and Extracting Binary Resources.

In addition, this article will extend Shaun Wilde's article Compression and Decompression Using the Crypto++ Library. Finally, one should review Jessie Ezell's Two Simple Lines for Self-Repairing Apps for a [loosely] similar MSI solution.

The following topics will be visited:

  • Downloads
  • Tools
  • Compiling and Integrating Crypto++
  • SHA Hash Function
  • Self Integrity Checking
  • Self Healing Software
  • Compiler Back Patching
  • Error Free Hash Transcription
  • Hash Variable Placement and Initialization
  • Polling versus Notification
  • Seven Code Samples
  • Windows Vista Compatibility

The code presented in this article was successfully tested on Windows 2000, Windows XP, Windows Server 2003, and Windows Vista. Many thanks to Tim Deveaux and Joergen Sigvardsson for their assistance in testing the code against Windows Vista. Note that a standard user account successfully executed the demonstrated code under Windows Vista. See the 'Windows Vista Compatibility' Section at the end of the article for a discussion.

Downloads

There are 8 downloads available with this article. Loosely speaking the following concepts are introduced:

  • Self Healing 1 - Base Line (walking the EXE header)
  • Self Healing 2 - Hashing the .text section
  • Self Healing 3 - Self Healing 2 with Back Patching
  • Self Healing 4 - Extracts and Compresses the .text section
  • Self Healing 5 - Self Healing 4 with Back Patching
  • Self Healing 6 - Archived .text section Restoration, Back Patching
  • Self Healing 7 - Full Demonstration (Tampering and Healing)
  • RelExe - A Release Build Executable of Self Healing 7

Tools

Tool requirements for this article are the same as those in Dynamic TEXT Section Image Verification, though this article focuses less on previously demonstrated correctness. However, the compression routines using GZip and decompression routines using Gunzip from the Crypto++ library warrant a challenge.

The author's copy of WinZip 11.0 (Build 7313) claims the created archive is not valid (since the archive does not have a .gz extension). It appears WinZip relies solely on the file name extension. As an alternative, those who have WinRAR installed should find it a suitable replacement which works properly as it examines the file's header.
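A header check of the kind WinRAR appears to perform can be sketched in a few lines. The example below is a minimal illustration, assuming only the signature bytes defined for the gzip format in RFC 1952 (0x1F, 0x8B, then 0x08 for deflate). LooksLikeGzip is a hypothetical helper name and is not part of Crypto++ or of the article's downloads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when the buffer begins with the gzip signature from RFC 1952:
// ID1 = 0x1F, ID2 = 0x8B, followed by CM = 0x08 (deflate).
// LooksLikeGzip is a hypothetical helper used only for illustration.
bool LooksLikeGzip( const std::uint8_t* data, std::size_t size )
{
    assert( data != nullptr || size == 0 );

    if( data == nullptr || size < 3 )
    {
        return false;
    }

    return data[0] == 0x1F && data[1] == 0x8B && data[2] == 0x08;
}
```

Reading the first three bytes of the archive and passing them to this function identifies the format regardless of the file name extension.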

Compiling and Integrating Crypto++

Screenshot - CryptoLogo.png

Crypto++ can be downloaded from Wei Dai's Crypto++ page. For compilation and integration issues, see Compiling and Integrating Crypto++ into the Microsoft Visual C++ Environment. This article is based upon basic assumptions presented in the previously mentioned article.

For those who are interested in other C++ Cryptographic libraries, please see Peter Gutmann's Cryptlib or Victor Shoup's NTL.

SHA Hash Function

This article will use SHA-224, a member of the SHA-2 family of hashes currently recommended by NIST. Current best practice calls for a digest of at least 160 bits, and every member of the SHA-2 family exceeds that. In the case of SHA-224, the digest is 224 bits (28 bytes) in length.

The SHA-1 family of hashes is no longer approved for US Government use. See NIST's Cryptographic Toolkit, Secure Hashing.

For those who would like to use a flat C File and a non-NIST recommendation, the ISO recognizes RIPEMD and WHIRLPOOL (in addition to SHA). Both RIPEMD and WHIRLPOOL are implemented in Crypto++.

Self Integrity Checking

Computer viruses have employed integrity checking in the past, including the use of Hamming Codes to attack breakpoints and to correct errors; one example is the Yankee Doodle family of viruses, introduced in 2000. Self Integrity Checking has also been a topic studied in academic circles. For an example, see J. Giffin, M. Christodorescu, and L. Kruger, Strengthening Software Self-Checksumming via Self-Modifying Code.

Microsoft offers a digital signature scheme for .NET assemblies called Strong-Named Assemblies. However, as Cracking .NET Assemblies demonstrates, the system is easily subverted. Self Healing is interleaved with the program, rather than existing as a shell around the program (Strong-Named Assemblies). By integrating the integrity check in the executable, it is hoped the system will be more difficult to remove.

For a survey of other techniques computer viruses use to prolong life, see Protection Schemes Based on Virus Survival Techniques.

Self Healing Software

It was suggested the article be named 'Tamper Aware and Self Repairing Code'. However, the author felt 'Self Repairing Code' was a bit sterile and detached — the MSI installer repairs. This code is much more tightly coupled to the programmer's work, so the metaphorical 'Self Healing Code' was used to embody the process.

Press Hype

At times there is much in the press on the topic of Self Healing Software, which would lead one to believe this area is thoroughly studied (and patented). Once investigated, the 'Self Healing Software' assigned by the press seems to be a bit of a misnomer.

For example, take the following press release uncovered by a Google search of 'Self Healing Software': Self-Healing Software Gets Push from IBM. One would expect to see an article describing software which could be flown on the Space Shuttle, have radiation flip bits in its program code, and then repair itself.

This is not the case. The IBM article discusses the capabilities of Tivoli Monitoring software (in the author's opinion it is a very nice product). In contrast to the title, IBM's statement in the article is:

...Tivoli Monitoring 6.1 oversees and fixes IT service-related problems in servers or databases for online applications such as e-mail

The thrust of this CodeProject article is software integrity and self repair, while the "IT service-related problems" mentioned in the Computerworld article concern automatically diagnosing and correcting issues such as those between an Email server and a Firewall.

Patent Issues

Dr. Brooke Stephens did uncover Patent 6530036, Self-Healing Computer System Storage. However, US Patent law being what it is, Patent 6530036 does not strictly apply to this CodeProject article. The holders of the patent restart their storage system should an anomaly be detected (the anomaly detection occurs through a proxy). The system described in this article recovers on the fly and does not use a proxy.

Self Healing Systems Workshop

In 2002, the first workshop on Self Healing systems was held in Charleston, South Carolina. The two papers of interest follow. However, neither embeds the healing logic in the program code itself (each uses an 'external agent'). Both articles are available for download with this article.

The reader should keep in mind the author is neither a lawyer nor a programmer; he is a Network Engineer and Network Architect who has a passion for Programming and Cryptography. The compiler workings and caveats, combined with the Penicillin Code presented in this article, were an interesting application of Cryptography and made interesting reading.

Back Patching

In the strictest sense, back patching is an operation performed by the compiler during the compilation stage. This article will borrow the term since the article's endeavors are so closely related to the compiler's use of the term.

Consider the following code fragment which calculates parity:

C++
if( 0 == a % 2 )
{
    p = 0;
}
else
{
    p = 1;
}

On first pass, the compiler will encounter if( 0 == a % 2 ) and generate code to perform the comparison. Next, the assignments p = 0 and p = 1 are encountered. What will be generated is a compare instruction followed by either a fall through into the first assignment, or a jump instruction (stepping over the first assignment) to the second assignment. The point to observe at this step is that "how far to jump" is not known because the full if/else statement has not been evaluated. The disassembly of the above code is shown below (note that the code which is not relevant to this discussion, the modular reduction, has been masked).

Screenshot - SelfHealing16.png

On second pass, the code for the statements p = 0 and p = 1 has been generated (that is, the size of the emitted opcodes is now known), so the jump opcode at 0x411A16 can be patched with a displacement (more correctly, the immediate value of the operand can now be written). This is known as back patching.
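The two passes can be illustrated with a toy code emitter. This is a sketch under stated assumptions rather than a description of how any particular compiler works: the emitter, the function names EmitJumpPlaceholder and BackPatchJump, and the restriction to short forward jumps are all hypothetical. The opcode values mirror the x86 short JMP (0xEB, followed by a one byte displacement relative to the next instruction) and NOP (0x90).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcodes for the toy emitter (values mirror x86 short JMP and NOP).
const std::uint8_t OP_JUMP = 0xEB;
const std::uint8_t OP_NOP  = 0x90;

// First pass: emit a jump whose displacement is not yet known.
// Returns the offset of the placeholder operand so it can be patched later.
std::size_t EmitJumpPlaceholder( std::vector<std::uint8_t>& code )
{
    code.push_back( OP_JUMP );
    code.push_back( 0x00 );            // placeholder displacement
    return code.size() - 1;
}

// Second pass: the target is now known, so write the real displacement.
// The displacement is measured from the byte following the operand.
void BackPatchJump( std::vector<std::uint8_t>& code,
                    std::size_t operandOffset, std::size_t target )
{
    assert( operandOffset < code.size() );
    assert( target >= operandOffset + 1 );

    const std::size_t displacement = target - ( operandOffset + 1 );
    assert( displacement <= 0x7F );    // must fit a short forward jump

    code[ operandOffset ] = static_cast<std::uint8_t>( displacement );
}
```

A caller would emit the placeholder, emit the instructions to be stepped over, and then call BackPatchJump with the current end of the code buffer as the target.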

Error Free Hash Transcription

These examples will require the reader to often copy the calculated hash into the expected hash. To this end, the following tip is presented. First, open the properties of the Windows NT interpreter.

Screenshot - SelfHealing29.png

Next, enable Quick Edit mode in the Windows NT interpreter.

Screenshot - SelfHealing30.png

With Quick Edit mode enabled, one can now:

  1. Insert the caret at first character of the hash
  2. Left mouse click and hold
  3. Highlight the hash text
  4. Release left mouse button
  5. Press ENTER to copy to the clipboard

Screenshot - SelfHealing31.png
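Once the hash is on the clipboard, it still has to be split into 28 byte values by hand. A small converter removes that step as well. The sketch below is an assumption of how such a helper might look; HexToInitializer is a hypothetical name and is not part of the article's downloads.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Converts a hex digest such as "09165E" into the body of a C++ initializer
// list ("0x09,0x16,0x5E"). Returns an empty string when the input is empty,
// has an odd length, or contains a character that is not a hex digit.
// HexToInitializer is a hypothetical helper used only for illustration.
std::string HexToInitializer( const std::string& hex )
{
    if( hex.empty() || hex.size() % 2 != 0 )
    {
        return std::string();
    }

    std::string out;
    for( std::size_t i = 0; i < hex.size(); i += 2 )
    {
        const unsigned char hi = static_cast<unsigned char>( hex[i] );
        const unsigned char lo = static_cast<unsigned char>( hex[i + 1] );

        if( !std::isxdigit( hi ) || !std::isxdigit( lo ) )
        {
            return std::string();
        }

        if( !out.empty() )
        {
            out += ',';
        }

        out += "0x";
        out += static_cast<char>( std::toupper( hi ) );
        out += static_cast<char>( std::toupper( lo ) );
    }

    // Each byte contributes "0xNN" plus a separating comma, except the last.
    assert( out.size() == ( hex.size() / 2 ) * 5 - 1 );
    return out;
}
```

The returned text can be pasted between the braces of the cbExpectedImageHash initializer.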

Hash Variable Placement and Initialization

Though the issue of hash variable placement and initialization will not rise until examples two and three, it will be addressed now. There are two important caveats associated with variable placement and initialization.

Initialized Global Hash Variable

For the first caution, consider the following program fragment:

C++
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ];

Notice that the BYTE array cbExpectedImageHash has been declared, but not initialized. This allocation will exist in the .bss section (uninitialized data section). The first run will produce the following result.

Screenshot - SelfHealing04.png

This result is expected in the Debug build: the compiler has initialized the BYTE array on behalf of the programmer to an expected value. Next, one would take the calculated image hash (09165E0392F4028240D0AEEA30B6CAF494CC929089757082347119ED), and use it to initialize cbExpectedImageHash as follows:

C++
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
    { 0x09, 0x16, ..., 0x19, 0xED };

The effects of the above are subtle: The variable cbExpectedImageHash was moved from uninitialized data (.bss section) to initialized data (.data section).

In the interim the compiler has emitted different code. Though cbExpectedImageHash still exists in a DATA Segment (now the initialized data section), the two builds have different initialization code, and that code resides in the .text section by default. For example, a simple

C++
memset( cbExpectedImageHash, 0x00, sizeof( cbExpectedImageHash ) );

which was emitted while the data existed in the .bss section (uninitialized data section) may have been removed. A second run of the above code would produce the following incorrect results:

Screenshot - SelfHealing05.png

A third run is required to properly calculate the precomputed hash.

Local Hash Variable

The final caveat in DATA Section initialization has to do with the placement of the hash variable on the stack. Simply put, one can back patch the executable as often as one desires between runs outside of the Visual Studio environment, and one will NOT obtain the correct results when the variable is not Global in scope. This is because cbExpectedImageHash resides on the program's stack, and its initialization code resides in the .text section. In the case of a command line project, the variable cbExpectedImageHash must be placed outside of main(). So the following will not produce expected results:

C++
int main(int argc, char* argv[])
{
    HMODULE hModule = NULL;
    PVOID   pVirtualAddress = NULL;
    PVOID   pCodeStart = NULL;
    PVOID   pCodeEnd = NULL;
    SIZE_T  dwCodeSize = 0;

    BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
        { 0x09,0x16,0x5E,0x03,0x92,0xF4,0x02,
          0x82,0x40,0xD0,0xAE,0xEA,0x30,0xB6,
          0xCA,0xF4,0x94,0xCC,0x92,0x90,0x89,
          0x75,0x70,0x82,0x34,0x71,0x19,0xED };

    BYTE cbCalculatedImageHash[ CryptoPP::SHA224::DIGESTSIZE ];

    ImageInformation( hModule, pVirtualAddress, pCodeStart,
                      dwCodeSize, pCodeEnd );

    DumpImageInformation( hModule, pVirtualAddress, pCodeStart,
                          dwCodeSize, pCodeEnd );
    ....

    return 0;
}

Analysis of Code Generations

Viewing the disassembly of the following trivial code reveals the reason for continual code generation changes when the BYTE array[] is placed inside main() — the encoding of the immediate value within the opcode.

C++
int main( )
{
    BYTE array [ 28 ] =
        { 0x00,0x01,0x02,0x03,0x04,0x05,0x06,
          0x00,0x01,0x02,0x03,0x04,0x05,0x06,
          0x00,0x01,0x02,0x03,0x04,0x05,0x06,
          0x00,0x01,0x02,0x03,0x04,0x05,0x06 };

    return 0;
}

Screenshot - SelfHealing25.png

To understand why a global variable does not cause the code generation issue above, one can use PE Browse to examine the .data section (initialized data section) of the executable for the following example.

C++
BYTE array [ 28 ] =
     { 0x00,0x01,0x02,0x03,0x04,0x05,0x06,
       0x00,0x01,0x02,0x03,0x04,0x05,0x06,
       0x00,0x01,0x02,0x03,0x04,0x05,0x06,
       0x00,0x01,0x02,0x03,0x04,0x05,0x06 };

int main()
{
    return 0;
}

Notice below that the array is now stored in the .data section, rather than as a collection of immediate value opcodes or in the .bss section (uninitialized data section). The .data section holds both the "allocation and initialization" of array. Recall this article does not hash data sections, only the distinguished .text section, which is why the .text section code does not change.

Below is a view of the .data section when examining the executable using PE Browse.

Image 9

If one were to hover the mouse over the variable array in the Visual Studio Debugger, Intellisense would report the address of array as 0x408030. If one were to accidentally overflow array memory, the first byte to be overwritten would be at 0x40805C — the byte value of 0xA0.

Polling Versus Notification

This article uses Crypto++ and hashing to determine, through Polling, when the .text section has been modified in memory. It appears Polling is the only option available to a programmer. An obvious point to observe: even if triggering were possible, an executable which has had an unauthorized patch applied on disk would not trigger an event.
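A single polling pass reduces to "hash the region, compare with the expected digest". The sketch below shows that shape under two loud assumptions: FNV-1a is swapped in for SHA-224 purely so the example has no library dependency (it is not cryptographically secure), and a std::vector stands in for the in-memory .text section. Fnv1a and PollRegion are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// FNV-1a (32 bit), used here only as a dependency-free stand-in for
// SHA-224. It is NOT suitable for detecting deliberate tampering.
std::uint32_t Fnv1a( const std::uint8_t* data, std::size_t size )
{
    std::uint32_t hash = 2166136261u;      // FNV offset basis
    for( std::size_t i = 0; i < size; ++i )
    {
        hash ^= data[i];
        hash *= 16777619u;                 // FNV prime
    }
    return hash;
}

// One polling pass over the monitored region. Returns true when the
// region still matches the expected digest, false when it has changed.
bool PollRegion( const std::vector<std::uint8_t>& region,
                 std::uint32_t expected )
{
    assert( !region.empty() );
    return Fnv1a( region.data(), region.size() ) == expected;
}
```

In a real program the pass would be repeated from a timer or worker thread, and a false return would trigger the repair step. The polling interval bounds how long a modification can go unnoticed.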

Windows API

If Microsoft Windows provided the programmer with a memory write notification (into the .text section) API, one could simply wait for the trigger and inject the Penicillin Code as required. According to Dr. Newcomer, Microsoft MVP, such a notification is not available.

Debug Registers

As Matthew Faithfull points out (reiterated by Oleg Starodumov, quoted later in this article), under the Visual Studio Debugger, one can set a hardware breakpoint to accomplish the task for data. In Debugging Applications, John Robbins presents the source code for a debugger. However, the program uses software breakpoints and not hardware breakpoints.

The following is concluded from Intel Architecture Software Developer's Manual Volume 3: System Programming. Chapter 15 Debugging and Performance Monitoring indicates that hardware support for notification is not possible for two reasons:

  • the hardware breakpoint is not effective for .text segment writes
  • using a hardware breakpoint, one can only specify a length of up to 4 bytes to participate in monitoring

Guard Pages

According to Slava Usov (in a post to microsoft.public.win32.programmer.kernel), notification may be possible using VirtualQueryEx() and VirtualProtectEx(). See Creating Guard Pages in MSDN for the full discussion. As a convenience to the reader the steps are reproduced below.

  1. Run the application under (your own) debugger.
  2. Use VirtualQueryEx() to get the current page protection of the memory location in question.
  3. Use VirtualProtectEx() to change protection of the memory location to current_page_protection | PAGE_GUARD
  4. Look for exceptions [WaitForDebugEvent()] with code STATUS_GUARD_PAGE and the address belonging to the memory location. (STATUS_GUARD_PAGE is not defined in the include files (I wish I knew why); its numerical value is 0x80000001.)
  5. Once you (your debugger) receive such an exception, do what you want to do, then use SetThreadContext() to set the thread to single step execution, then dispose of the debug event [ContinueDebugEvent() with DBG_CONTINUE]. If the target process is multi-threaded, you should suspend all the other threads (otherwise the other threads may access the memory location without you seeing that).
  6. Wait for EXCEPTION_SINGLE_STEP exception, after which call ContinueDebugEvent() with DBG_CONTINUE and go to step 2. If you suspended threads at the previous step, resume them now.

For the purpose of this article, the exception of interest would be STATUS_GUARD_PAGE_VIOLATION.

Self Healing 1

Self Healing 1 is taken from Dynamic TEXT Section Image Verification. It is a basic rewrite (which should have been performed in the previous article) — primarily a copy and paste to rearrange the executable for functionality and aesthetics. It will serve as the starting point of this article.

Screenshot - SelfHealing01.png

The function of interest from Dynamic TEXT Section Image Verification for example one is

C++
VOID ImageInformation( HMODULE& hModule, PVOID& pVirtualAddress,
                       PVOID& pCodeStart, SIZE_T& dwCodeSize,
                       PVOID& pCodeEnd )

ImageInformation() populates the parameters for use later in the program by locating the start of the TEXT Section in memory, by combining the address returned from GetModuleHandle(), and parsing the various headers.

The sample then dumps the byte codes encountered by reading the in-memory .text section using standard memory read functions — note there is no requirement for MapViewOfFile() or ReadProcessMemory() since the operations are within the confines of its own process.
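The header walk performed by ImageInformation() can be approximated without Windows headers by reading the documented PE field offsets from a byte buffer. The following is a minimal sketch, not the article's implementation: SectionInfo, ReadLE, and FindSection are hypothetical names, and the buffer stands in for the memory at the address returned by GetModuleHandle().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct SectionInfo
{
    bool          found;
    std::uint32_t virtualAddress;   // RVA of the section
    std::uint32_t virtualSize;      // size of the section in memory
};

// Reads a little-endian integer of 'bytes' bytes at 'offset'.
static std::uint32_t ReadLE( const std::vector<std::uint8_t>& image,
                             std::size_t offset, std::size_t bytes )
{
    std::uint32_t value = 0;
    for( std::size_t i = 0; i < bytes; ++i )
    {
        value |= static_cast<std::uint32_t>( image[offset + i] ) << ( 8 * i );
    }
    return value;
}

// Walks DOS header -> PE signature -> file header -> section table and
// returns the RVA and size of the named section (for example ".text").
SectionInfo FindSection( const std::vector<std::uint8_t>& image,
                         const char* name )
{
    SectionInfo info = { false, 0, 0 };
    assert( name != nullptr );

    if( std::strlen( name ) > 8 ) return info;
    if( image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z' ) return info;

    const std::size_t peOffset = ReadLE( image, 0x3C, 4 );       // e_lfanew
    if( peOffset + 24 > image.size() ) return info;
    if( std::memcmp( &image[peOffset], "PE\0\0", 4 ) != 0 ) return info;

    const std::size_t sectionCount = ReadLE( image, peOffset + 6, 2 );
    const std::size_t optionalSize = ReadLE( image, peOffset + 20, 2 );
    std::size_t section = peOffset + 24 + optionalSize;          // first header

    char padded[8] = { 0 };                   // names are NUL padded to 8 bytes
    std::memcpy( padded, name, std::strlen( name ) );

    for( std::size_t i = 0; i < sectionCount; ++i, section += 40 )
    {
        if( section + 40 > image.size() ) return info;
        if( std::memcmp( &image[section], padded, 8 ) == 0 )
        {
            info.found          = true;
            info.virtualSize    = ReadLE( image, section + 8, 4 );
            info.virtualAddress = ReadLE( image, section + 12, 4 );
            return info;
        }
    }
    return info;
}
```

In the running process, the start of the code is the module base address plus virtualAddress, and virtualSize gives the number of bytes to hash.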

Self Healing 2

The sample provided in Self Healing 2 builds upon the previous example by adding a Cryptographic Hash function. The Hash Function creates a digest of the executable's .text section.

Screenshot - SelfHealing02.png

The code's data was modified by adding two BYTE arrays for SHA-224 hash of the .text section: the expected (precalculated) hash, and the calculated (runtime) hash.

In addition to the BYTE arrays for the hash, a hash object and code to perform the hashing was added. This code can be seen below.

C++
VOID CalculateImageHash( PVOID pCodeStart, SIZE_T dwCodeSize,
                         PBYTE pcbDigest )
{
    CryptoPP::SHA224 hash;

    hash.Update( (PBYTE)pCodeStart, dwCodeSize );
    hash.Final( pcbDigest );
}

To build an executable which functions properly requires two compilations: the first compilation and subsequent run generates the expected (now known as the precalculated) hash. Then the precalculated hash is added to the executable. Finally, a second run will result in the precalculated hash equalling the runtime hash.

As Dynamic TEXT Section Image Verification demonstrated, one can use either the on-disk .text section image, or the in-memory .text section image. The .text images are identical.

"Note that the operation of running the executable under the Debugger will cause the hash to change." This is because the Visual Studio Debugger will insert software breakpoints (0xCC opcode or Interrupt 3) into the program. To compound this issue, the software breakpoints are not displayed when viewing a disassembly. According to Oleg Starodumov, Microsoft VC++ MVP:

[The] Visual Studio debugger can only use hardware breakpoints on data access
(only for write). If you need to break when the code executes, consider WinDbg.

Finally, taking from Ken Johnson, Microsoft SDK MVP:

...in WinDbg, if you use the 'ba' command then the code bytes in question will not be modified (i.e. substituted with an 0xcc/int 3). You are limited to 4 simultaneously active 'ba' breakpoints as they use the hardware supplied debug registers, which only support four target addresses.

Because of the Visual Studio software breakpoint issue, the program was built and then run from the Command Line rather than the Visual Studio environment. This is readily apparent if the reader observes the change in the Title Bar text.
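Since the disassembly view hides the inserted breakpoints, one way to see them is to compare the in-memory bytes against a pristine copy (for example, the on-disk .text section) and report the offsets where a byte has changed to 0xCC. The sketch below assumes both copies are already available as byte vectors; FindInt3Patches is a hypothetical helper, and a hit is only a hint, since any patcher could write 0xCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the offsets at which 'current' differs from 'pristine' and the
// current byte is 0xCC (INT 3). Bytes that were already 0xCC in the
// pristine copy (such as padding) are not reported.
std::vector<std::size_t> FindInt3Patches(
    const std::vector<std::uint8_t>& pristine,
    const std::vector<std::uint8_t>& current )
{
    assert( pristine.size() == current.size() );

    std::vector<std::size_t> offsets;
    for( std::size_t i = 0; i < current.size(); ++i )
    {
        if( current[i] != pristine[i] && current[i] == 0xCC )
        {
            offsets.push_back( i );
        }
    }
    return offsets;
}
```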

In the two images below, Self Healing 2 was: run once from the Visual Studio Environment (yellow text); and run once from the Command Line (green text) to demonstrate the breakpoint issue. In either case, the code is exactly the same.

Screenshot - SelfHealing20.png

Screenshot - SelfHealing21.png

The above code was run to create the precalculated hash. In the intermediate step between running the program the first time and the second time, one would back patch the executable to populate the correct expected digest. The code of Self Healing 2 is displayed below before the first run.

C++
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
    { 0x00,0x01,0x02,0x03,0x04,0x05,0x06,
      0x00,0x01,0x02,0x03,0x04,0x05,0x06,
      0x00,0x01,0x02,0x03,0x04,0x05,0x06,
      0x00,0x01,0x02,0x03,0x04,0x05,0x06, };

BYTE cbCalculatedImageHash[ CryptoPP::SHA224::DIGESTSIZE ];

int main(int argc, char* argv[])
{
    HMODULE hModule = NULL;
    PVOID   pVirtualAddress = NULL;
    PVOID   pCodeStart = NULL;
    PVOID   pCodeEnd = NULL;
    SIZE_T  dwCodeSize = 0;

    ImageInformation( hModule, pVirtualAddress, pCodeStart,
                      dwCodeSize, pCodeEnd );

    DumpImageInformation( hModule, pVirtualAddress, pCodeStart,
                          dwCodeSize, pCodeEnd );

    HexDump( pCodeStart, pCodeStart, DUMP_SIZE );

    CalculateImageHash( pCodeStart, dwCodeSize, cbCalculatedImageHash );

    DumpHash( cbExpectedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Expected Image Hash" );
    DumpHash( cbCalculatedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Calculated Image Hash" );

    if( 0 == memcmp( cbExpectedImageHash, cbCalculatedImageHash,
                     CryptoPP::SHA224::DIGESTSIZE ) )
    {
        std::cout << "Image is verified." << std::endl;
    }
    else
    {
        std::cout << "Image has been modified." << std::endl;
    }

    return 0;
}

Armed with the correct expected hash value (E259A10464E487076CDB8F83E6D06ACB53564A1684BA84B3ABA72F4B), one can now insert it into the code for proper initialization of cbExpectedImageHash as shown below.

C++
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
     { 0xE2,0x59,0xA1,0x04,0x64,0xE4,0x87,
       0x07,0x6C,0xDB,0x8F,0x83,0xE6,0xD0,
       0x6A,0xCB,0x53,0x56,0x4A,0x16,0x84,
       0xBA,0x84,0xB3,0xAB,0xA7,0x2F,0x4B, };

A command line run of Self Healing 3 (after a small code change — hence the different hash) demonstrates the expected results.

Screenshot - SelfHealing03.png

The correct code is demonstrated below. The BYTE array is located Globally, and either:

  • Populated with a place holder (0x00, 0x01, ..., 0x05, 0x06 - repeated four times)
  • Populated with the correct hash (0x09, 0x16, ... 0xED)

Again, this pre-population or back patching ensures compiler generated code is consistent from compiler invocation to compiler invocation.

C++
// Self Healing 2.cpp

#include "stdafx.h"

#include "sha.h"        // SHA
#include "hex.h"        // HexEncoder
#include "filters.h"    // StringSink

VOID ImageInformation( HMODULE& hModule, PVOID& pVirtualAddress,
                       PVOID& pCodeStart, SIZE_T& dwCodeSize,
                       PVOID& pCodeEnd );

VOID DumpImageInformation( HMODULE hModule, PVOID pVirtualAddress,
                           PVOID pCodeStart, SIZE_T dwCodeSize,
                           PVOID pCodeEnd );

VOID CalculateImageHash( PVOID pCodeStart, SIZE_T dwCodeSize,
                         PBYTE pcbDigest );

VOID DumpHash( PBYTE pcbDigest, SIZE_T dwSize, std::string message );

VOID HexDump( LPCVOID pcbStartAddress,
              LPCVOID pDisplayBaseAddress = (PVOID)-1,
              DWORD dwSize = DEFAULT_DUMP_SIZE );

// These values must be Global. Place them inside
//   main(), and you get different code generation.
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
    { 0x09,0x16,0x5E,0x03,0x92,0xF4,0x02,
      0x82,0x40,0xD0,0xAE,0xEA,0x30,0xB6,
      0xCA,0xF4,0x94,0xCC,0x92,0x90,0x89,
      0x75,0x70,0x82,0x34,0x71,0x19,0xED };

BYTE cbCalculatedImageHash[ CryptoPP::SHA224::DIGESTSIZE ];

int _tmain(int argc, _TCHAR* argv[])
{
    HMODULE hModule = NULL;
    PVOID   pVirtualAddress = NULL;
    PVOID   pCodeStart = NULL;
    PVOID   pCodeEnd = NULL;
    SIZE_T  dwCodeSize = 0;

    ImageInformation( hModule, pVirtualAddress, pCodeStart,
                      dwCodeSize, pCodeEnd );

    DumpImageInformation( hModule, pVirtualAddress, pCodeStart,
                          dwCodeSize, pCodeEnd );

    HexDump( pCodeStart, pCodeStart, DUMP_SIZE );

    CalculateImageHash( pCodeStart, dwCodeSize, cbCalculatedImageHash );

    DumpHash( cbExpectedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Expected Image Hash" );
    DumpHash( cbCalculatedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Calculated Image Hash" );

    if( 0 == memcmp( cbExpectedImageHash, cbCalculatedImageHash,
        CryptoPP::SHA224::DIGESTSIZE ) )
    {
        std::tcout << _T("Image is verified.") << std::endl;
    }
    else
    {
        std::tcout << _T("Image has been modified.") << std::endl;
    }

    return 0;
}

VOID DumpHash( PBYTE pcbDigest, SIZE_T dwSize, std::string message )
{
    CryptoPP::HexEncoder encoder;
    std::string sink;

    encoder.Attach( new CryptoPP::StringSink (sink) );
    encoder.Put( pcbDigest, dwSize );
    encoder.MessageEnd();

    std::cout << std::endl;

    if( 0 != message.length() )
    {
        std::cout << message << std::endl;
    }

    std::cout << sink << std::endl << std::endl;
}

VOID CalculateImageHash( PVOID pCodeStart, SIZE_T dwCodeSize,
                         PBYTE pcbDigest )
{
    CryptoPP::SHA224 hash;

    hash.Update( (PBYTE)pCodeStart, dwCodeSize );
    hash.Final( pcbDigest );
}

VOID DumpImageInformation( HMODULE hModule, PVOID pVirtualAddress,
                           PVOID pCodeStart, SIZE_T dwCodeSize,
                           PVOID pCodeEnd )
{
    std::tcout << _T("****************************************************");
    std::tcout << _T("************* Memory Image Information *************");
    std::tcout << _T("****************************************************");
    std::tcout << std::endl;

    std::tcout << _T("         hModule: ");
    std::tcout << HEXADECIMAL_OUTPUT(8);
    std::tcout << hModule << std::endl;

    ...

    std::tcout << std::endl;
}

VOID ImageInformation( HMODULE& hModule, PVOID& pVirtualAddress,
                       PVOID& pCodeStart, SIZE_T& dwCodeSize,
                       PVOID& pCodeEnd )
{
    const UINT PATH_SIZE = 2 * MAX_PATH;
    TCHAR szFilename[ PATH_SIZE ] = { 0 };

    __try {

        /////////////////////////////////////////////////
        /////////////////////////////////////////////////
        if( 0 == GetModuleFileName( NULL, szFilename, PATH_SIZE ) )
        {
            std::tcerr << _T("Error Retrieving Process Filename");
            std::tcerr << std::endl;
            __leave;
        }

        hModule = GetModuleHandle( szFilename );

        ...
    }
}

Other notes of interest are as follows:

  • HexEncoder will transform the BYTE array to the std::string
  • StringSink is the library's built in mechanism to send data (in this case the human readable string) to an object (std::string)
  • Attach() is the library's method for attaching the StringSink on the fly
  • Put() is the library's method for pushing data into the object (HexEncoder)
  • MessageEnd() informs the encoder to complete its operations and flush its buffers

Self Healing 3

Though introduced previously, Self Healing 3 is a proper run of sample 2 from the command line outside the debugger with the expected image hash variable back patched (and in Global scope).

Self Healing 4

The fourth sample code is the code to extract and compress the unmodified .text section from the executable. For this portion of the article, the compressed .text section will be saved to a file named TextImage.gz.

The extracted and compressed .text section is the data which is subsequently restored, should one detect a load error or unauthorized memory patch. The reader should explore other means for storing the extracted and compressed data. Candidates include:

  • As an Executable's Resource
  • As a Resource DLL
  • As a File
  • In the Windows Registry

Of these candidates, the Windows Registry is probably the least desirable for the archived section (this is not the case if one chooses to embed only the expected hash value, as a hash is fewer than 32 bytes). Microsoft recommends a limit of approximately 2048 bytes of data. Please read Microsoft's Registry Element Size Limit in MSDN.

The steps for creating the various resources can be found at Creating a Resource DLL and Creating a Resource-Only DLL. It is left to the reader as an exercise if chosen.

A flat file was chosen for simplicity, functionality, and to demonstrate the Crypto++ Gzip and Gunzip classes.

This sample simply takes the in memory .text section, compresses it, and writes it to a file. Self Healing 4 is examined in detail under the next section, after back patching has occurred. For completeness, the Command Line run is shown below. Note the place holder for the expected hash: 0x00, 0x01, ..., 0x05, 0x06 to assure consistent code generation across runs for the back patch.

Screenshot - SelfHealing07.png

Self Healing 5

Self Healing 5 performs the TEXT Section export after compressing the image. Note that back patching has occurred.

Screenshot - SelfHealing08.png

Since the program is being run from the Command Line, the interpreter may have a pwd — or present working directory — different than that of the program directory. In this case, pwd is C:\. As such, the archive is placed in C:\ rather than in the program's build directory.
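To keep the archive beside the executable regardless of the working directory, the file name can be combined with the directory portion of the module path (the value GetModuleFileName() returns in ImageInformation()). The following is a minimal sketch using only string handling; SiblingPath is a hypothetical helper and the paths shown in any usage are illustrative.

```cpp
#include <cassert>
#include <string>

// Builds a path for 'filename' in the same directory as 'modulePath'.
// When 'modulePath' has no directory component, 'filename' is returned
// unchanged (that is, relative to the working directory).
std::string SiblingPath( const std::string& modulePath,
                         const std::string& filename )
{
    assert( !filename.empty() );

    const std::string::size_type slash = modulePath.find_last_of( "\\/" );
    if( slash == std::string::npos )
    {
        return filename;
    }

    return modulePath.substr( 0, slash + 1 ) + filename;
}
```

Passing the result to ExportTextImage() in place of the bare "TextImage.gz" would place the archive in the program's directory.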

Navigating WinRAR to the root of C:\ and opening the archive reveals a consistent TEXT Section image. Taking from the information dumped in the fifth sample, the .text section size is 0x17FCE5, which is 1,572,069 decimal bytes.

Screenshot - SelfHealing09.png

The final step to be performed is extracting TextImage.gz, and then opening the extracted file using a hex editor to verify the correctness of the compression and extraction operations. This is verified below using UltraEdit32.

Screenshot - SelfHealing10.png

The code in this example adds one function call as follows. pCodeStart and dwCodeSize are being used from ImageInformation().

C++
std::string filename = "TextImage.gz";
ExportTextImage( filename, pCodeStart, dwCodeSize );

Finally, the Crypto++ code to create the archive:

C++
CryptoPP::Gzip zipper(
    new CryptoPP::FileSink( filename.c_str(), true ),
    CryptoPP::Gzip::MAX_DEFLATE_LEVEL ); // Gzip

zipper.Put( (byte*)pCodeStart, dwCodeSize );
zipper.MessageEnd( );

The Gzip constructor takes a BufferedTransformation* (the FileSink object) and a Deflate level as parameters. The documented Gzip and FileSink constructors being used are as follows. Reference the Gzip and FileSink classes in the Crypto++ manual.

C++
Gzip (BufferedTransformation *attachment=NULL,
    unsigned int deflateLevel=DEFAULT_DEFLATE_LEVEL,
    unsigned int log2WindowSize=DEFAULT_LOG2_WINDOW_SIZE,
    bool detectUncompressible=true);

C++
FileSink (const char *filename, bool binary=true);

Then one encounters the Put() and MessageEnd() functions previously encountered in the HexEncoder. Different objects (Gzip vs. HexEncoder), same results: the data is pushed into the object, processed, and then the object is informed to complete its operations and flush its buffers.

It is worth noting that FileSink, HexEncoder, and Gzip (among others) each have a common ancestor: the BufferedTransformation. This is the foundation of Pipelining or Chaining in Crypto++. For a more detailed discussion of the topic, see Product Keys Based on the Advanced Encryption Standard.

Self Healing 6

Sample 6 is rather boring: it simply reads the TEXT Section archive, places it in a buffer (a rather large buffer in Debug builds for a Command Line project), and dumps the first 96 bytes to compare with the original TEXT Section. This sample is presented after the back patching operation (back patching was performed in Samples 2 through 5, essentially making one example into two).

Screenshot - SelfHealing11.png

The Gunzip code is shamelessly ripped from Wei's Crypto++, test.cpp (with the addition of wrapping in a try/catch block):

C++
VOID ImportTextImage( const std::string& filename,
                      PBYTE pBuffer, SIZE_T dwBufferSize )
{
    try {

        std::string RecoveredTextSection;
        CryptoPP::FileSource( filename.c_str(), true,
            new CryptoPP::Gunzip(
                new CryptoPP::StringSink( RecoveredTextSection )
            ) // Gunzip
        ); // FileSource

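        // Copy only what was recovered, and never more than the buffer holds
        SIZE_T cbCopy = RecoveredTextSection.length();
        if( cbCopy > dwBufferSize )
        {
            std::tcerr << _T("ImportTextImage: Archive exceeds buffer, truncating");
            std::tcerr << std::endl;
            cbCopy = dwBufferSize;
        }

        memcpy( pBuffer, RecoveredTextSection.data(), cbCopy );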
    }

    catch( CryptoPP::Exception& e )
    {
        std::cerr << e.what() << std::endl;
    }

    catch( ... )
    {
        std::tcerr << _T("Caught Unknown Exception");
        std::tcerr << std::endl;
    }
}

The following code snippet and figure of a Release build run (using Green text) demonstrate using conditional compilation based on _DEBUG. The programmer now enjoys four builds for a Debug and Release pair. Also noteworthy is the dramatic reduction in .text section size for the Release build: 0x40130 or 262,448 decimal bytes. After compression, this is 135,687 decimal bytes.

C++
#ifdef _DEBUG
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
    { 0x12,0xE3,0x39,0xB5,0xBF,0xF8,0xEF,
      0xAD,0xEF,0x2A,0x6F,0xA8,0x6E,0x04,
      0x7D,0x27,0xF9,0xA6,0x18,0x1F,0x9A,
      0x45,0x38,0x57,0xCE,0x14,0xFD,0xF7 };
#else
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
    { 0x10,0xBD,0x17,0xD7,0x86,0x83,0xB9,
      0x55,0xA5,0x20,0xDC,0x0B,0x30,0x6F,
      0x14,0x19,0x06,0xEE,0x25,0x02,0xEE,
      0xE7,0x95,0x1F,0x6A,0x6B,0x5A,0x25 };
#endif

Screenshot - SelfHealing12.png

Self Healing 7

Self Healing 7 is the final proof of concept in this article. The program performs the following steps (as in example six, back patching has been performed previously):

  1. Dump Original .text
  2. Dump Archived .text
  3. Compare Hashes
  4. Alter one byte
  5. Dump Altered .text
  6. Compare Hashes
  7. Restore Archived .text (no a priori knowledge)
  8. Dump Healed (Repaired) .text
  9. Compare Hashes

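In this sample the following code will not work: the pages of the .text section are normally mapped as executable and read-only, so a direct write raises an access violation. WriteProcessMemory() will be used instead.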

C++
((PBYTE)pCodeStart)[ 0 ] = 0x90;  // No Operation (and flush the CPU's cache)

Screenshot - SelfHealing13.png

Having switched to WriteProcessMemory() for the Tampering (1 byte), the sample again uses the function for Healing. However, the entire .text section is restored. The reason for the extreme restoration is that the author spent considerable time attempting to perform both functional level detection and restoration, without success.

It is felt functional level detection and restoration can be performed, but not without a dynamic disassembler. This is clearly feasible, since SoftICE (among other debuggers) has the feature. With that said, Russell Osterlund respectfully declined to share his source code for PEBrowse.

What does not perform as expected in Debug builds is the address (&) operator and book-ending the function. Consider the following code fragment:

C++
void main()
{
    Function1();

    Function2();
}

void Function1() { ... }

void Function2() { ... }

The start of main() can be determined with &main; likewise, the address of the first function is &Function1. One would then incorrectly conclude that the size of main() is the difference of the two addresses.

In Debug builds, &main will return the address of a jump stub (for a discussion, see GetAddressOfMain() in Dynamic TEXT Section Image Verification). Next, there is no guarantee that the binary layout of the Debug or Release build mimics that of the source file. Finally, function inlining in Release builds could optimize away the function call altogether.

The reader is encouraged to further this work by creating a deterministic method for both functional level detection and restoration.

The results of the Debug (blue text) and Release (green text) executions are shown below.

Screenshot - SelfHealing14.png

Screenshot - SelfHealing15.png

The functions of interest are now AlterTextImage() and HealTextImage(). AlterTextImage() simply writes one No Operation instruction to the first byte of the .text section:

C++
VOID AlterTextImage( LPVOID pcbStartAddress, BYTE OpCode )
{
    HANDLE hProcess = NULL;
    BOOL bResult = FALSE;
    SIZE_T dwBytesWritten = 0;

    __try
    {
        hProcess = OpenProcess( PROCESS_VM_OPERATION | PROCESS_VM_WRITE,
                                FALSE, GetCurrentProcessId() );

        if( NULL == hProcess )
        {
            std::tcerr << std::endl;
            std::tcerr << _T("Unable to Open Process");
            std::tcerr << std::endl;
            __leave;
        }

        bResult = WriteProcessMemory( hProcess, pcbStartAddress,
                        &OpCode, sizeof( OpCode ), &dwBytesWritten );

        if( FALSE == bResult || 1 != dwBytesWritten )
        {
            std::tcerr << std::endl;
            std::tcerr << _T("Unable to Alter .text Section");
            std::tcerr << std::endl;
        }
    }

    __except( EXCEPTION_EXECUTE_HANDLER ) {
        std::tcerr << std::endl;
        std::tcerr << _T("Caught Exception in AlterTextImage");
        std::tcerr << std::endl;
    }

    if( NULL != hProcess ) { CloseHandle( hProcess ); }
}

The corresponding Penicillin Code is HealTextImage(). Note that this code rewrites the entire .text section with a known good copy:

C++
VOID HealTextImage( LPVOID pStartAddress, LPCVOID pArchivedText,
                            SIZE_T dwSize )
{
    HANDLE hProcess = NULL;
    BOOL bResult = FALSE;
    SIZE_T dwBytesWritten = 0;

    __try
    {
        hProcess = OpenProcess( PROCESS_VM_OPERATION | PROCESS_VM_WRITE,
                                FALSE, GetCurrentProcessId() );

        if( NULL == hProcess )
        {
            std::tcerr << std::endl;
            std::tcerr << _T("Unable to Open Process");
            std::tcerr << std::endl;
            __leave;
        }

        bResult = WriteProcessMemory( hProcess, pStartAddress,
                        pArchivedText, dwSize, &dwBytesWritten );

        if( FALSE == bResult || dwSize != dwBytesWritten )
        {
            std::tcerr << std::endl;
            std::tcerr << _T("Unable to Heal .text Section");
            std::tcerr << std::endl;
        }
    }

    __except( EXCEPTION_EXECUTE_HANDLER ) {
        std::tcerr << std::endl;
        std::tcerr << _T("Caught Exception in HealTextImage");
        std::tcerr << std::endl;
    }

    if( NULL != hProcess ) { CloseHandle( hProcess ); }
}

For completeness, the main() function of Example 7 is reproduced. Program flow in the code below corresponds to the following (previously outlined):

  1. Dump Original .text
  2. Dump Archived .text
  3. Compare Hashes
  4. Alter one byte
  5. Dump Altered .text
  6. Compare Hashes
  7. Restore Archived .text (no a priori knowledge)
  8. Dump Healed (Repaired) .text
  9. Compare Hashes

C++
//  These values must be Global. Place them inside
//  main(), and you get different code generation
//  after each back patch operation.
#ifdef _DEBUG
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
    { 0xF2,0x1A,0xCF,0x46,0x53,0xAB,0x47,
      0x02,0xD5,0x00,0x24,0xBC,0xF8,0xA1,
      0x8E,0xD6,0xFF,0xFF,0x60,0x06,0x18,
      0x01,0x85,0x70,0x83,0x46,0x7C,0x4F };
#else
BYTE cbExpectedImageHash[ CryptoPP::SHA224::DIGESTSIZE ] =
    { 0x44,0x23,0x76,0xCF,0x3C,0x5E,0x7C,
      0x7B,0x81,0x86,0xAA,0x23,0xD7,0x59,
      0xFE,0x21,0xF6,0xB9,0xCB,0x52,0x11,
      0x0A,0x9F,0x63,0xB8,0x7F,0xF8,0x70 };
#endif

BYTE cbCalculatedImageHash[ CryptoPP::SHA224::DIGESTSIZE ];

int _tmain(int argc, _TCHAR* argv[])
{
    HMODULE hModule = NULL;
    PVOID   pVirtualAddress = NULL;
    PVOID   pCodeStart = NULL;
    PVOID   pCodeEnd = NULL;
    SIZE_T  dwCodeSize = 0;


    // Set Up - Develop EXE Information
    ImageInformation( hModule, pVirtualAddress, pCodeStart,
                      dwCodeSize, pCodeEnd );

    // Set Up - Dump Information
    DumpImageInformation( hModule, pVirtualAddress, pCodeStart,
                          dwCodeSize, pCodeEnd );

    // Set Up - Export .text Section
    std::string filename = "TextImage.gz";
    ExportTextImage( filename, pCodeStart, dwCodeSize );

    // Set Up - Import .text Section
    SIZE_T dwBufferSize = dwCodeSize;
    // std::nothrow (from <new>) makes a failed allocation return NULL
    PBYTE pArchiveBuffer = new (std::nothrow) BYTE[ dwBufferSize ];
    if( NULL == pArchiveBuffer ) { return -1; }
    ImportTextImage( filename, pArchiveBuffer, dwBufferSize );

    // Step 1: Dump Original .text
    std::tcout << _T("Original TEXT Section") << std::endl;
    HexDump( pCodeStart, pCodeStart, DUMP_SIZE );
    std::tcout << std::endl;

    // Step 2: Dump Archived .text
    std::tcout << _T("Archived TEXT Section") << std::endl;
    HexDump( pArchiveBuffer, (LPCVOID)NULL, DUMP_SIZE );
    std::tcout << std::endl;

    // Set Up: Calculate Hashes
    CalculateImageHash( pCodeStart, dwCodeSize, cbCalculatedImageHash );

    // Step 3: Compare Hashes
    std::tcout << std::endl;
    DumpHash( cbExpectedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Expected Image Hash" );
    DumpHash( cbCalculatedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Calculated Image Hash" );

    // Step 3: Compare Hashes
    if( 0 == memcmp( cbExpectedImageHash, cbCalculatedImageHash,
        CryptoPP::SHA224::DIGESTSIZE ) )
    {
        std::tcout << _T("Image is verified.") << std::endl;
    }
    else
    {
        std::tcout << _T("Image has been modified.") << std::endl;
    }
    std::tcout << std::endl;
    std::tcout << "================================" << std::endl;

    // Step 4: Alter Bytes
    AlterTextImage( pCodeStart, 0x90 );

    // Step 5: Dump Altered .text
    std::tcout << _T("Altered TEXT Section") << std::endl;
    HexDump( pCodeStart, pCodeStart, DUMP_SIZE );
    std::tcout << std::endl;

    // Set Up: Calculate Hashes
    CalculateImageHash( pCodeStart, dwCodeSize, cbCalculatedImageHash );

    // Step 6: Compare Hashes
    std::tcout << std::endl;
    DumpHash( cbExpectedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Expected Image Hash" );
    DumpHash( cbCalculatedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Calculated Image Hash" );

    // Step 6: Compare Hashes
    if( 0 == memcmp( cbExpectedImageHash, cbCalculatedImageHash,
        CryptoPP::SHA224::DIGESTSIZE ) )
    {
        std::tcout << _T("Image is verified.") << std::endl;
    }
    else
    {
        std::tcout << _T("Image has been modified.") << std::endl;
    }
    std::tcout << std::endl;
    std::tcout << "================================" << std::endl;

    // Step 7: Heal the Code
    //   We should not fall through this
    if( 0 != memcmp( cbExpectedImageHash, cbCalculatedImageHash,
        CryptoPP::SHA224::DIGESTSIZE ) )
    {
        HealTextImage( pCodeStart, pArchiveBuffer, dwBufferSize );
    }

    // Step 8: Dump Healed .text
    std::tcout << _T("Healed TEXT Section") << std::endl;
    HexDump( pCodeStart, pCodeStart, DUMP_SIZE );
    std::tcout << std::endl;

    // Set Up: Calculate Hashes
    CalculateImageHash( pCodeStart, dwCodeSize, cbCalculatedImageHash );

    // Step 9: Compare Hashes
    std::tcout << std::endl;
    DumpHash( cbExpectedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Expected Image Hash" );
    DumpHash( cbCalculatedImageHash, CryptoPP::SHA224::DIGESTSIZE,
              "SHA-224 Calculated Image Hash" );

    // Step 9: Compare Hashes
    if( 0 == memcmp( cbExpectedImageHash, cbCalculatedImageHash,
        CryptoPP::SHA224::DIGESTSIZE ) )
    {
        std::tcout << _T("Image is verified.") << std::endl;
    }
    else
    {
        std::tcout << _T("Image has been modified.") << std::endl;
    }

    // Cleanup
    if( NULL != pArchiveBuffer ) { delete[] pArchiveBuffer; }

    return 0;
}

Best practices would dictate that one verify the integrity of the archived copy of the .text section before performing the above operation (perhaps using a hash). An even better solution would be to digitally sign the archived code, so that there would be no way to forge the hash or the enclosed Penicillin Code without detection. For an example of Message Signing with Recovery, see Product Activation Based on RSA Signatures. The exercises are left to the reader.

Windows Vista Compatibility

The author is very pleased to report these techniques are 100% Windows Vista Compatible. One minor issue was encountered: TextImage.gz could not be created in C:\ when running the program under a standard user account.

Screenshot - SelfHealing42.png

The Windows Message Box is being invoked because one has written garbage into the .text section — recall that the archive creation and subsequent restoration failed. Admittedly, the author should have placed more checks in the demonstration code.

If the archive file existed (from a previous run under an Administrator account), the program worked as desired.

Screenshot - SelfHealing41.png

This is because without virtualization, Local Users (of which an Authenticated User is a member) are allowed three permissions by default. Note that this computer is part of a private Domain (home.pvt):

Screenshot - SelfHealing40.png

Notice that Write permission is not enabled by default. For a discussion of Microsoft's latest attempt at user security, see Mark Russinovich's Inside Windows Vista User Account Control.

Acknowledgements

  • Wei Dai for Crypto++ and his invaluable help on the Crypto++ mailing list
  • Dr. A. Brooke Stephens who laid my Cryptographic foundations

Revisions

  • 11.05.2007 Added Self Integrity Checking Section
  • 06.14.2007 Revised 'Self Healing Software' Section
  • 06.01.2007 Expanded 'Vista Compatibility' Section
  • 06.01.2007 Added Discussion Topics in 'Introduction' Section
  • 05.31.2007 Added 'Vista Compatibility' Section
  • 05.31.2007 Added Reference to Intel System Programming Manual
  • 05.30.2007 Added 'Polling versus Notification' Section
  • 05.30.2007 Added Reference to Hacker Disassembling Uncovered
  • 05.29.2007 Changed Archive Extension from .zip to .gz
  • 05.29.2007 Added Interpreter QuickEdit Tip
  • 05.29.2007 Added Reference to Ezell's Self-Repairing Apps
  • 05.29.2007 Added Explanations of Global and Local Variable Initialization
  • 05.28.2007 Reworked 'Self Healing 1' Section
  • 05.28.2007 Added Explanation for ImageInformation()
  • 05.28.2007 Added 'Self Healing Software' Section
  • 05.28.2007 Added Note on WinDbg 'ba' Command from Johnson
  • 05.28.2007 Added Examination of Back Patching (with Disassembly)
  • 05.28.2007 Added Note on Hardware Debug Registers from Starodumov
  • 05.28.2007 Original Release

Checksums

  • SelfHealing1.zip
    • MD5: 7A351D99FE8C43DAE69E86BC91E7C156
    • SHA-1: 7E29AE5A2885A177A40A8FFCDEE61DF7C281A39B
  • SelfHealing2.zip
    • MD5: 0A4B07E14F6D9C793390B8535A6B0D0C
    • SHA-1: 2F29D87B2A13AD9AB73882F1366D101382B4F8B3
  • SelfHealing3.zip
    • MD5: B8D600316115EAA35C706868F7E782AA
    • SHA-1: 6C97B50102DBCACE3687F72BB41F84936667C460
  • SelfHealing4.zip
    • MD5: CA07633F5A015A31B8B1A54F8D4D95EE
    • SHA-1: 4EBB021699095A4A13FAAFABB41ACB5FD317E107
  • SelfHealing5.zip
    • MD5: 7D829223D51F3D10BDF41D3922B8DC6C
    • SHA-1: 888F21D9FCC086265AF4338D59D6C6E2810EFBF4
  • SelfHealing6.zip
    • MD5: 5B675B51DB28C78748B5E483F48EE213
    • SHA-1: 326B6A1F4DECB75474DDDECF9EA146090997EF08
  • SelfHealing7.zip
    • MD5: A9EDE7E9082380FDED436658D7B6E32D
    • SHA-1: 8CE27BBA9E7C6999CCEEF402154657FEE7A151AD
  • RelExe.zip
    • MD5: D54BC40AD94F316414D1239E637EB82E
    • SHA-1: 06F0C78AB952C1E0B15AC4F339CFE377CB1C592E

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)