Taking advantage of Windows Hot Patching mechanism

Kewin Rausch

1 Apr 2014 | CPOL | 14 min read

Here I will show how to emulate Windows Hot Patching and how to use this mechanism to redirect execution to a custom procedure, either temporarily or permanently.

Download WinPatching solution files - 9.5 MB

Introduction

Not every programmer knows that Microsoft developed a mechanism for hot patching its libraries and executables in the Windows OS family. Hot patching updates running components without restarting the whole system; critical components (usually system components), of course, still require an OS restart before the changes are applied.

The scope of this article is to explain how a binary is laid out to allow easy hot patching, and how to use that layout to redirect execution to a custom procedure, which can be seen as the patch. I won't describe how Windows itself performs hot patching, because the implementation of that mechanism seems to be an internal, undocumented part of the OS.

Background

I assume that the reader has basic to intermediate knowledge of Assembly language, C/C++ syntax, Windows API usage and calling conventions, Intel x86/64 opcodes, the structure of a Portable Executable (PE) file and process debugging. I also assume that you're comfortable with the Visual C++ IDE, IDA Pro (free version), debugging tools and your favourite hex viewer (I use HxD).

However, I will try to write the article so that someone who is not an expert in this field can also read it and understand how the mechanism works.

How it's made: Windows Hot Patching

The most important thing to say is that Windows programmers built the components of the system to allow hot patching. Without this initial effort it would be harder to hot patch a component, and doing so would usually require rewriting the initial part of the procedure.

As you all know (I swear), Windows API procedures use the __stdcall calling convention (search for __stdcall in MSDN), which describes how a procedure is called. With this convention the callee removes its own arguments from the stack when it returns (remember also that __stdcall returns the procedure's value in the eax processor register, just like the __cdecl convention). In brief, the compiler (I assume you're coding in C/C++) arranges a function that takes no arguments like this:

ASM

// __stdcall format.
// Prologue: save the caller's base pointer and set up a new stack frame.
//
push ebp
mov  ebp, esp
//
// Here goes the procedure code; at the end return the value 0.
//
mov  eax, 0
//
// Epilogue: restore the previously saved base pointer.
// A procedure with arguments would end with "ret n" to pop them.
//
mov  esp, ebp
pop  ebp
ret

In addition to this calling convention, Windows programmers wisely added an instruction to "make space" for a short jump (a jump to a location between -128 and +127 bytes from the end of the jump instruction, see the Intel manuals for x86/64), which allows execution to be redirected to another, patched, procedure. The instruction used to reserve this space is usually "mov edi, edi", a 2-byte no-op encoded as "8b ff".

If you use IDA Pro (version 5.0 is free to download), you can look at the exported procedures (Exports tab) and see that most of them have this apparently useless statement at the start of the procedure, before the usual prologue.

In addition to this statement, in front of each "patchable" procedure Windows programmers reserved 5 bytes of free space (the filler can differ from library to library: in ntdll.dll there are five 0xcc bytes, while in user32.dll there are five 0x90 bytes). These bytes are enough to store what I call a far jump in this article: a jump with a 32-bit relative offset (opcode e9), which can reach a location from -2147483648 to +2147483647 bytes away from the end of the instruction. Strictly speaking, Intel calls this a near relative jump, since it doesn't change the code segment.

ASM

// Patchable procedure as you can find in ntdll.dll.
// The opcode 0xcc(int 3) is usually a value filled by the compiler to align the code.
//
int  3
int  3
int  3
int  3
int  3
mov  edi, edi
push ebp
mov  ebp, esp
//
// ... and the procedure continues here.

To sum up, every patchable procedure carries a 7-byte pattern (5 filler bytes in front of it plus the 2-byte "mov edi, edi" at its entry point) that allows execution to be redirected to another existing or injected procedure.
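The layout described above can be checked with a few comparisons. The following is only a minimal sketch: the name has_hotpatch_prefix and the choice of accepting 0xcc or 0x90 as filler are my assumptions, based on the two libraries mentioned earlier, and are not taken from the attached project.

```c
#include <assert.h>
#include <stddef.h>

#define HOTPATCH_PAD 5  /* bytes reserved in front of the entry point */

/* Returns 1 when the procedure entry at buf[offset] starts with
   "mov edi, edi" (8b ff) and is preceded by 5 identical filler bytes,
   either 0xcc (int 3) or 0x90 (nop). Returns 0 otherwise. */
int has_hotpatch_prefix(const unsigned char *buf, size_t len, size_t offset)
{
    size_t i;
    unsigned char filler;

    assert(buf != NULL);
    if (offset < HOTPATCH_PAD || offset + 2 > len)
        return 0;
    if (buf[offset] != 0x8b || buf[offset + 1] != 0xff)
        return 0;

    filler = buf[offset - 1];
    if (filler != 0xcc && filler != 0x90)
        return 0;
    for (i = 1; i <= HOTPATCH_PAD; i++)
        if (buf[offset - i] != filler)
            return 0;
    return 1;
}
```

Here offset is the index of the entry point inside the buffer, so the 5 reserved bytes are the ones at offset - 5 to offset - 1.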

Core: overwrite and jump away.

As the section title says, the idea behind the hotpatch pattern is to rewrite those 7 bytes so that execution jumps to a custom place in memory. Unfortunately, every change to the memory causes the OS to create a private copy of the modified page, so any patching done within a module is valid only for that process. This mechanism is called copy-on-write and is part of the OS memory protection (search Copy-on-Write Protection on MSDN for more info); it prevents a process from rewriting code that is also used by other processes in the system.

The only way to make every process use the injected patch is to patch the physical file in the file system, so that every application that loads the module from then on gets the patch. My project lets you choose whether to patch in memory or in the file.

Performing a full hot patch of a component currently running on the system (changing the code in the physical memory pages), which updates all the processes in the system, is beyond the scope of this article (and usually requires kernel-mode components, which have access to physical pages).

My naive mechanism follows these steps:

1. Find the target procedure you want to patch by searching for the known pattern (this can be done with various heuristics; in my project I patch the first compatible procedure).
2. Find enough free space in the same module to contain the patch. This step, too, can be tied to your own heuristic; in my project I pick the first compatible portion of free space.
3. Compute the difference between the injected procedure and the target procedure: this will be the offset of the far jump.
4. Inject the jumps in front of the target procedure and inject the patch into the free space found.

Scanning

Steps 1 and 2 are performed by scanning the memory in search of compatible patterns.

To avoid causing any problem with the Windows subsystem, I included in the solution a DLL project designed to mimic the ntdll.dll patching pattern. This little project contains only one source file with some "naked" procedures that I arranged with Assembly statements.
The arrangement copies the __stdcall layout and adds the 7 bytes used by the hot patching mechanism, plus a short jump that allows normal execution. Without this jump the processor would execute interrupt 3 as soon as someone calls the procedure, which leads to a debug break (this is how Intel processors handle that opcode).

ASM

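// Procedure exported by dummy.dll
DUMMY_API __declspec(naked) int 
dummyA()
{
    // Entry to emulate windows patching mechanism.
    __asm
    {
        jmp     short start
        int     3
        int     3
        int     3
        int     3
        int     3
start:  mov     edi, edi
        push    ebp
        mov     ebp, esp
        // Body and epilogue, as they appear in the memory dump shown
        // in the Testing section: return 0x61 ('a').
        mov     eax, 61h
        mov     esp, ebp
        pop     ebp
        ret
    }
}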

The scanning procedures in the mem.c source file simply scan the given buffer of bytes in search of the pattern. The buffer can be filled by fread(), in the case of file patching, or by memcpy(), if the patch will only be injected in memory. Both scanning procedures scan the entire buffer and fill a list of descriptors that tell us where the patchable candidates are (mem.h contains the structure that describes each candidate).

The pattern for the patchable procedure, in this case, is the following byte sequence (10 bytes: the 7 hotpatch bytes followed by the 3-byte prologue):
cc cc cc cc cc 8b ff 55 8b ec
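A scan of this kind can be sketched in a few lines of C. This is not the code from mem.c, which isn't reproduced in the article; find_pattern and its parameters are names I made up for illustration, and the candidate list is reduced to a plain array of offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stores in `hits` the offset of every occurrence of `pat` in `buf`
   (at most `maxHits` of them) and returns how many were stored. */
size_t find_pattern(const unsigned char *buf, size_t len,
                    const unsigned char *pat, size_t patLen,
                    size_t *hits, size_t maxHits)
{
    size_t i, found = 0;

    assert(buf != NULL && pat != NULL && hits != NULL);
    if (patLen == 0 || patLen > len)
        return 0;
    /* Naive scan: compare the pattern at every possible start offset. */
    for (i = 0; i + patLen <= len && found < maxHits; i++)
        if (memcmp(buf + i, pat, patLen) == 0)
            hits[found++] = i;
    return found;
}
```

Each stored offset points at the first of the five 0xcc bytes, which is where the far jump will later be written.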

Scanning for this pattern alone can be dangerous, because you can find matching bytes that are, for example, in a data area. This simple scan is used here only for simplicity; it's better to limit the scan to the sections marked as executable, or to use the export table of the PE file as a guide (it lists the usable procedures exported by the module, so you can be sure that each match really is a procedure).
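As a rough illustration of limiting the scan to executable code, the sketch below walks the PE section table of a file already read into a buffer and reports the raw range of the first section flagged as executable. The function name find_exec_section is hypothetical and the attached project does not do this. The header offsets come from the PE/COFF layout (e_lfanew at 0x3c, a 20-byte COFF header after the 4-byte signature, 40-byte section headers), and the bounds checks are deliberately minimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCN_MEM_EXECUTE 0x20000000u  /* IMAGE_SCN_MEM_EXECUTE */

static uint32_t rd16(const unsigned char *p) { return p[0] | (p[1] << 8); }
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Finds the first executable section of a PE file held in `buf` and stores
   its raw file range in *start / *size. Returns 1 if found, 0 otherwise. */
int find_exec_section(const unsigned char *buf, size_t len,
                      uint32_t *start, uint32_t *size)
{
    size_t pe, sec, i, count;

    assert(buf != NULL && start != NULL && size != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;
    pe = rd32(buf + 0x3c);                       /* e_lfanew */
    if (pe > len || len - pe < 24 ||
        rd32(buf + pe) != 0x00004550u)           /* "PE\0\0" */
        return 0;
    count = rd16(buf + pe + 6);                  /* NumberOfSections */
    sec = pe + 24 + rd16(buf + pe + 20);         /* skip the optional header */
    for (i = 0; i < count; i++, sec += 40) {     /* 40-byte section headers */
        if (sec > len || len - sec < 40)
            return 0;
        if (rd32(buf + sec + 36) & SCN_MEM_EXECUTE) {
            *size  = rd32(buf + sec + 16);       /* SizeOfRawData */
            *start = rd32(buf + sec + 20);       /* PointerToRawData */
            return 1;
        }
    }
    return 0;
}
```

The returned range could then be used as the buffer limits for the pattern scan, so that bytes in data sections are never considered.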

The pattern for a free memory area suitable for injection is the following byte sequence (14 bytes of space reserved for alignment, usually near the patchable candidate):
cc cc cc cc cc cc cc cc cc cc cc cc cc cc

The free memory pattern was chosen this way because the patch I insert is a 14-byte procedure that returns the value 0x64 ('d' in the ASCII table). The patch is a copy of the dummyD() procedure in the dummy DLL project. dummyD() is an unused procedure inserted in the project only so that I could open the binary with IDA and copy the desired portion of the code.
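Looking for free space amounts to finding a long enough run of filler bytes. Below is a minimal sketch; find_free_run, NOT_FOUND and the `from` parameter (where the search starts) are assumptions of mine, not names from the project.

```c
#include <assert.h>
#include <stddef.h>

#define NOT_FOUND ((size_t)-1)

/* Returns the offset of the first run of at least `need` consecutive
   `filler` bytes at or after `from`, or NOT_FOUND when there is none. */
size_t find_free_run(const unsigned char *buf, size_t len, size_t from,
                     unsigned char filler, size_t need)
{
    size_t i, run = 0;

    assert(buf != NULL && need > 0);
    for (i = from; i < len; i++) {
        /* Extend the current run, or reset it on a non-filler byte. */
        run = (buf[i] == filler) ? run + 1 : 0;
        if (run == need)
            return i + 1 - need;
    }
    return NOT_FOUND;
}
```

For the patch used here the call would be find_free_run(buf, len, from, 0xcc, 14). Note that in a real Windows library a filler run that directly precedes a patchable procedure includes that procedure's own 5 reserved bytes, so a careful patcher would probably leave the last 5 bytes of such a run untouched.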

Compute jumps

Step 3 is quite easy and purely mathematical. Two jumps are needed to perform the patch: one is the short jump that sends execution back to the 5 reserved bytes in front of the "mov edi, edi" instruction, and the second is the far jump.

The first jump is always the same, because the distance of the jump back does not change. We assumed (and saw) that the 5 bytes needed for the second jump are always located right in front of the patchable procedure. It's also possible to change this jump to target another free area, but remember that you're limited to a range of -128 to +127 bytes.

The second jump can be calculated as follows:
Far Jump = <address of the injected patch> - <address of the target procedure> - 5

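The addresses are obtained from the previous scans (steps 1 and 2); note that the target address is the one where the scan pattern starts, that is, the first of the 5 reserved bytes, where the far jump is written. The constant 5 is the size of the jump instruction: the CPU applies the offset relative to the address of the next instruction.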
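As a worked example, take the two file offsets that appear in the patcher output in the Testing section: the candidate at 0x432 and the free space at 0x49e. The formula gives 0x49e - 0x432 - 5 = 0x67. The same rule with an instruction size of 2 gives the short jump back: it sits 5 bytes after the far jump and must land on it, so its offset is -5 - 2 = -7, or 0xf9 as a byte. The helper names below are mine, used only to make the arithmetic explicit.

```c
#include <assert.h>
#include <stdint.h>

#define JMP_REL32_SIZE 5  /* e9 xx xx xx xx */
#define JMP_REL8_SIZE  2  /* eb xx */

/* Offset for a 5-byte "jmp rel32" located at `from` that lands on `to`. */
int32_t rel32_disp(uint32_t from, uint32_t to)
{
    return (int32_t)(to - from - JMP_REL32_SIZE);
}

/* Offset for a 2-byte "jmp rel8" located at `from` that lands on `to`.
   The distance must fit in a signed byte. */
int8_t rel8_disp(uint32_t from, uint32_t to)
{
    int32_t d = (int32_t)(to - from - JMP_REL8_SIZE);

    assert(d >= -128 && d <= 127);
    return (int8_t)d;
}
```

With these helpers, rel32_disp(0x432, 0x49e) evaluates to 0x67 and rel8_disp(0x432 + 5, 0x432) to -7.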

Inject patch

The final step writes the modifications to memory or to the file. The job is quite simple now that all the necessary values have been computed: overwrite the bytes to change the execution flow. In my solution I implemented both variants (memory and file overwriting), which you can test on the dummy library.

Both procedures write the modifications in order, one after the other. They start with the far jump opcode 0xe9, immediately followed by the 4 bytes of the offset to the injected procedure. Then they overwrite the "mov edi, edi" instruction located before the prologue with the bytes 0xeb 0xf9. The first is the opcode of a short jump, while the second is the negative offset (-7) that redirects the flow back to the far jump.
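The 7 bytes described above can also be assembled in a buffer first and written in one go. The sketch below does that; build_hotpatch is a hypothetical helper, not a function of the project, and it emits the offset byte by byte so that it doesn't depend on the byte order of the machine running the patcher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOTPATCH_SIZE 7

/* Fills `out` (HOTPATCH_SIZE bytes) with the code written at `procBase`:
   e9 <rel32 to patchBase>, then eb f9 (short jump back onto the e9). */
void build_hotpatch(uint32_t procBase, uint32_t patchBase, unsigned char *out)
{
    uint32_t rel = patchBase - procBase - 5;  /* 5 = size of the e9 jump */

    assert(out != NULL);
    out[0] = 0xe9;
    out[1] = (unsigned char)(rel & 0xff);     /* little-endian rel32 */
    out[2] = (unsigned char)((rel >> 8) & 0xff);
    out[3] = (unsigned char)((rel >> 16) & 0xff);
    out[4] = (unsigned char)((rel >> 24) & 0xff);
    out[5] = 0xeb;                            /* short jump opcode */
    out[6] = 0xf9;                            /* -7: back to out[0] */
}
```

For procBase 0x432 and patchBase 0x49e the buffer becomes e9 67 00 00 00 eb f9, the same bytes visible in the patched memory dump in the Testing section.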

They then "seek" to the free space found in the previous steps and overwrite its bytes with the prepared patch. In this case, as I said, the patch is the dummyD() procedure, which the compiler translates into the following bytes:
8b ff 55 8b ec b8 64 00 00 00 8b e5 5d c3

Here is the procedure that injects the patch into a file:

C++

int
injectFilePatch(
    char * path,
    dword procBase,
    dword patchBase,
    byte * patch,
    dword pLength)
{
    size_t bWr = 0;
    FILE * fd  = fopen(path, "r+b");

    byte farJmp     = JMP_FAR;
    // 5 is the jmp instruction size(e9 xx xx xx xx).
    dword jmpOff    = patchBase - procBase - 5;

    if (!fd)
    {
        return ERR_IO;
    }

    // Move to the beginning of the area to rewrite.
    if (fseek(fd, procBase, SEEK_SET) != 0)
    {
        fclose(fd);
        return ERR_IO;
    }

    // Write the far jump instruction ...
    bWr += fwrite(&farJmp, 1, 1, fd);
    // ... to this offset ...
    bWr += fwrite(&jmpOff, 1, 4, fd);
    // ... and replace mov edi,edi with a short jump back.
    bWr += fwrite(g_short_jmp_back, 1, 2, fd);

    if (bWr != 7)
    {
        // Close the file on every error path too.
        fclose(fd);
        return ERR_IO;
    }

    // Move to the beginning of the free area that receives the patch.
    if (fseek(fd, patchBase, SEEK_SET) != 0)
    {
        fclose(fd);
        return ERR_IO;
    }

    // Inject the patch in the binary file.
    bWr = fwrite(patch, 1, pLength, fd);

    if (bWr != pLength)
    {
        fclose(fd);
        return ERR_IO;
    }

    fclose(fd);

    return SUCCESS;
}

Here is the procedure that injects the patch in memory:

C++

int
injectMemPatch(
    dword procBase,
    dword patchBase,
    byte * patch,
    dword pLength)
{
    byte farJmp = JMP_FAR;
    // 5 is the jmp instruction size(e9 xx xx xx xx).
    dword jmpOff = patchBase - procBase - 5;

    // The project targets 32-bit x86, so a dword can hold an address;
    // cast it to a pointer before handing it to memcpy().
    byte * proc = (byte *)procBase;

    // Write the far jump instruction ...
    memcpy(proc, &farJmp, 1);
    // ... to this offset ...
    memcpy(proc + 1, &jmpOff, 4);
    // ... and replace mov edi,edi with a short jump back.
    memcpy(proc + 5, g_short_jmp_back, 2);

    // Inject the patch into the free space in memory.
    memcpy((byte *)patchBase, patch, pLength);

    return SUCCESS;
}

Note: Before trying to write to the memory, I had to change the protection of the process's memory pages to obtain the rights needed to modify the area. This can be done with the Windows API procedure VirtualProtect(), which is called in the scanning procedure in the mem.c source file. Without this step, any attempt to write to the memory results in an access violation exception that breaks the execution of your process.
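One possible way to wrap the protection change around a write is sketched below. It is not the code from mem.c, which isn't shown in the article: write_protected is a hypothetical helper of mine, while VirtualProtect(), FlushInstructionCache() and GetCurrentProcess() are the documented Win32 calls. The non-Windows branch is there only so that the sketch compiles on other systems; it does no protection handling at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Copies `len` bytes from `src` to `dst`, temporarily making the destination
   pages writable. Returns 0 on success, -1 on failure. */
int write_protected(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);
#ifdef _WIN32
    {
        DWORD oldProt;

        if (!VirtualProtect(dst, len, PAGE_EXECUTE_READWRITE, &oldProt))
            return -1;
        memcpy(dst, src, len);
        /* Restore the original protection and flush stale instructions. */
        VirtualProtect(dst, len, oldProt, &oldProt);
        FlushInstructionCache(GetCurrentProcess(), dst, len);
        return 0;
    }
#else
    /* Assumption: plain copy, used only to keep the sketch portable. */
    memcpy(dst, src, len);
    return 0;
#endif
}
```

Restoring the old protection right after the write keeps the code pages read-only for the rest of the run, which seems a safer default than leaving them writable.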

Testing

Attached to this article is a compressed archive containing a Visual Studio 2013 for Windows Desktop solution with three projects:

The patcher: this program is in charge of applying the patch.
Dummy: a dummy DLL laid out to have a patching signature similar to ntdll.dll.
Test: this program loads the DLL and executes the exported procedures.

The only project you need to modify in order to run the tests is the Patcher project. By default this program just exits when its entry point is executed. It's up to you to uncomment the desired patching action and, of course, to change the path to the dummy.dll module.

C++

int main(int argc, char ** argv)
{
    // Uncomment this to patch the physical file; the patch is permanent.
    //patchFile("C:\\Dev\\WinPatching\\Debug\\dummy.dll");
    // Uncomment this to patch in memory; the patch is valid only for patcher.exe.
    //patchMemory("C:\\Dev\\WinPatching\\Debug\\dummy.dll");

    return 0;
}

From a command prompt you can test the permanent patching by running test.exe first, then patcher.exe (after uncommenting the call to patchFile()), and then test.exe again.

Here is some output of the test:

C++

Dummy library tester [version 0.1]
Initial state is: a=0x0, b=0x1, c=0x2
Calling dummyA...
a should be 0x61, a=0x61
Calling dummyB...
a should be 0x62, a=0x62
Calling dummyC...
a should be 0x63, a=0x63
Final state is: a=0x63, b=0x1, c=0x3

Scan pattern:
cc cc cc cc cc 8b ff 55 8b ec 
Candidates found:
Base: 0x432
Base: 0x452
Base: 0x472
Patching file procedure at 0x432 to 0x49e

Dummy library tester [version 0.1]
Initial state is: a=0x0, b=0x1, c=0x2
Calling dummyA...
a should be 0x61, a=0x64
Calling dummyB...
a should be 0x62, a=0x62
Calling dummyC...
a should be 0x63, a=0x63
Final state is: a=0x63, b=0x1, c=0x3

As you can see, the first text block is the result of running test.exe for the first time. The initial state is given on the second line, with the values of the variables used. The variable "a" stores the result every time a dummy.dll procedure is called: as you can see, the expected value is always obtained.

After the initial test.exe run I executed patcher.exe (after uncommenting the call to patchFile()). The scan pattern reminds us which signature we are searching for, while the candidates give the base offsets of the dummyA(), dummyB() and dummyC() procedures in the physical file. As you can see, there's no candidate for the dummyD() procedure; this is because dummyD() was written without the patching signature.
The last line tells us which candidate was chosen (always the first one, to keep testing easy) and where execution will be redirected. If you open the file with a hex editor before and after the patch you'll easily see the rewritten bytes.

The last block is the log of test.exe again; this time the dummyA() procedure returns 0x64, as we wanted. This is due to the patch injected into the DLL itself. From now on, every process that loads dummy.dll will get 0x64 when calling that exported procedure.

Testing the memory patching is a little more complicated. First uncomment the right procedure (patchMemory) in the patcher project, and delete the modified DLL from the Debug folder so that a rebuild produces an unpatched copy.
Using Visual Studio, set a breakpoint (F9 key) where the injection is called, in patcher.c:

C++

status = injectMemPatch(
       matchBase,
       patchBase,
       g_patch,
       PATCH_LENGTH);

This will break the execution immediately before the patch injection.

Now, if you debug the patcher application (F5 key), execution stops there and you'll see in the console the same output as in the file patching test. This time the base addresses are different (bigger) because they refer to memory locations. Using the Memory window of Visual Studio (Debug > Windows > Memory > Memory 1) you can open another tab that shows what's in RAM right now. Moving to the reported address (in my case 0x6ace1032) you will find where the loader mapped the DLL's code:

0x6ACE1000  cc cc cc cc cc e9 26 00 00 00 e9 41 00 00 00 e9  ÌÌÌÌÌé&...éA...é
0x6ACE1010  5c 00 00 00 e9 77 00 00 00 cc cc cc cc cc cc cc  \...éw...ÌÌÌÌÌÌÌ
0x6ACE1020  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE1030  eb 05 cc cc cc cc cc 8b ff 55 8b ec b8 61 00 00  ë.ÌÌÌÌÌ.ÿU.ì¸a..
0x6ACE1040  00 8b e5 5d c3 cc cc cc cc cc cc cc cc cc cc cc  ..å]ÃÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE1050  eb 05 cc cc cc cc cc 8b ff 55 8b ec b8 62 00 00  ë.ÌÌÌÌÌ.ÿU.ì¸b..
0x6ACE1060  00 8b e5 5d c3 cc cc cc cc cc cc cc cc cc cc cc  ..å]ÃÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE1070  eb 05 cc cc cc cc cc 8b ff 55 8b ec 8b 45 0c 8b  ë.ÌÌÌÌÌ.ÿU.ì.E..
0x6ACE1080  4d 08 80 c1 02 88 08 b8 63 00 00 00 8b e5 5d c3  M.€Á.ˆ.¸c....å]Ã
0x6ACE1090  8b ff 55 8b ec b8 64 00 00 00 8b e5 5d c3 cc cc  .ÿU.ì¸d....å]ÃÌÌ
0x6ACE10A0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10B0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10C0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10D0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10E0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10F0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ

If you look carefully at 0x6ace1032 you'll recognize the pattern we were looking for: cc cc cc cc cc 8b ff 55 8b ec. If you continue with the analysis you'll also spot the other dummy procedures; they all start with the same bytes (eb 05 cc cc cc cc cc 8b ff 55 8b ec) and they're padded with 0xcc bytes so that each one starts on a 16-byte boundary, at the beginning of a line. The only different procedure is dummyD, the last one in this block, which has no patchable prologue and starts at 0x6ace1090.
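The relation between the file offset printed by the patcher (0x432) and the memory address seen here (0x6ace1032) can be sketched with the usual PE mapping rule. The three header values in the example are assumptions on my part, chosen because they reproduce the addresses in the dump: a code section that starts at file offset 0x400, is mapped at RVA 0x1000, and a module loaded at 0x6ace0000. The real values would have to be read from the PE headers and from the loader.

```c
#include <assert.h>
#include <stdint.h>

/* Maps a raw file offset inside a section to the address it occupies once
   the module is loaded:
     rawPtr     = PointerToRawData of the section (file offset of its start)
     sectionRva = VirtualAddress of the section
     imageBase  = address where the loader placed the module */
uint32_t file_offset_to_va(uint32_t fileOff, uint32_t rawPtr,
                           uint32_t sectionRva, uint32_t imageBase)
{
    assert(fileOff >= rawPtr);
    return imageBase + sectionRva + (fileOff - rawPtr);
}
```

Under those assumptions, file_offset_to_va(0x432, 0x400, 0x1000, 0x6ace0000) gives 0x6ace1032, and the free space at file offset 0x49e maps to 0x6ace109e.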

If you keep the Memory 1 tab open and proceed with the execution by one step (F10), you'll see the memory change in real time. Visual Studio usually shows the overwritten bytes in red. Here they are the 7 bytes from 0x6ace1032 to 0x6ace1038 (e9 67 00 00 00 eb f9) and the 14 bytes of the patch from 0x6ace109e to 0x6ace10ab.

0x6ACE1000  cc cc cc cc cc e9 26 00 00 00 e9 41 00 00 00 e9  ÌÌÌÌÌé&...éA...é
0x6ACE1010  5c 00 00 00 e9 77 00 00 00 cc cc cc cc cc cc cc  \...éw...ÌÌÌÌÌÌÌ
0x6ACE1020  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE1030  eb 05 e9 67 00 00 00 eb f9 55 8b ec b8 61 00 00  ë.ég...ëùU.ì¸a..
0x6ACE1040  00 8b e5 5d c3 cc cc cc cc cc cc cc cc cc cc cc  ..å]ÃÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE1050  eb 05 cc cc cc cc cc 8b ff 55 8b ec b8 62 00 00  ë.ÌÌÌÌÌ.ÿU.ì¸b..
0x6ACE1060  00 8b e5 5d c3 cc cc cc cc cc cc cc cc cc cc cc  ..å]ÃÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE1070  eb 05 cc cc cc cc cc 8b ff 55 8b ec 8b 45 0c 8b  ë.ÌÌÌÌÌ.ÿU.ì.E..
0x6ACE1080  4d 08 80 c1 02 88 08 b8 63 00 00 00 8b e5 5d c3  M.€Á.ˆ.¸c....å]Ã
0x6ACE1090  8b ff 55 8b ec b8 64 00 00 00 8b e5 5d c3 8b ff  .ÿU.ì¸d....å]Ã.ÿ
0x6ACE10A0  55 8b ec b8 64 00 00 00 8b e5 5d c3 cc cc cc cc  U.ì¸d....å]ÃÌÌÌÌ
0x6ACE10B0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10C0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10D0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10E0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ
0x6ACE10F0  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ

Now the dummy.dll module loaded in memory is patched for this process, and any call to dummyA() will end with the return value 0x64.

Conclusions

Hot patching or permanent patching can be a useful, painless solution wherever the layout of the binary allows it, as in Windows libraries. This mechanism redirects the execution flow to your custom procedure and can be used in many ways to customize or improve services on the fly. A mail or web server, for example, can update itself without a system reboot. Antivirus products can redirect execution away from viral procedures that reach into sensitive areas of the system, or even interrupt the malware by jumping away from its code area.

Patching a procedure that has no such design to help the redirection is still possible, but usually requires overwriting the first bytes of the procedure.

The project's code is clean and fully commented, so it won't be difficult to follow the flow and understand every procedure and action taken. If you have any question, just ask me and I'll be happy to answer as soon as possible.

This is my first article here on CodeProject, and I hope it will be useful to someone, at least to understand a little how this mechanism works and how to take advantage of it. Please excuse me if you find any grammatical or technical error, and feel free to contact me: I'll be happy to correct the article.

History

03/07/2014 - Initial release of the article. :3
04/01/2014 - Corrected "asp" to "esp" in the first assembly code snippet in the article (typing mistake, my bad).

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Kewin Rausch

Software Developer

Italy

I'm a Software Engineer with deep and extended knowledge of computer networks and low-level environments such as kernels and drivers. My core competencies are the C/C++ and Assembly programming languages, in both user and kernel space, together with a problem-solving-oriented mindset and a huge imagination for developing alternative approaches to problems.
