Posted 16 Jun 2011

64 Bit Injection Cave

19 Jun 2011, CPOL
Code injection cave for 64 bit processes


The software I am developing uses the excellent Injection cave code by Darawk,
taken from:

This is great for 32 bit, but I needed it to run on 64 bit as well. Since the solution uses inline assembly, which Visual Studio does not support for 64 bit targets,
I had to find another way to do it. After searching far and wide for a 64 bit injection cave, I ended up writing one myself.

The Compiled Code for 64 bit

In Darawk's code mentioned above, the code to be injected was written in inline assembly. The name of the function was then used as the pointer from which to copy the compiled code at runtime.

Because Visual Studio has no inline assembler for 64 bit, Darawk's code cannot be used as is. The solution I chose was to produce the 64 bit machine code ahead of time and include it in my code as a hard-coded array.

To achieve this, I took Darawk's assembly code and tried to assemble it with ml64.

As expected, it did not assemble as is, so I ported the code to MASM64.

There are several differences that had to be incorporated here:

  1. 64 bit Windows code uses a fastcall-style calling convention, so the function's argument has to be passed in a register (RCX for the first argument) and not on the stack.
  2. The length of the addresses - 32 vs. 64 bit - must be taken into account.
  3. x64 has no instruction that pushes all registers on the stack (like pushad in 32 bit), so each register has to be pushed explicitly.
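
The second point matters for any placeholder that is narrower than a pointer. In 64-bit mode, `push imm32` sign-extends its 32-bit operand to 64 bits, so a 4-byte placeholder can only stand for an address that survives that extension. The following is a minimal sketch of such a range check in portable C; `fits_sign_extended_imm32` and `low_dword_for_push` are hypothetical helper names and not part of the article's code.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when `address` equals the sign extension of its own low 32 bits,
 * i.e. when its top 33 bits are all zeros or all ones. */
static int fits_sign_extended_imm32(uint64_t address)
{
    uint64_t top33 = address >> 31;
    return top33 == 0 || top33 == 0x1FFFFFFFFULL;
}

/* Returns the 4 bytes that would be written into a push imm32 placeholder.
 * The caller is expected to have checked the range first. */
static uint32_t low_dword_for_push(uint64_t address)
{
    assert(fits_sign_extended_imm32(address));
    return (uint32_t)(address & 0xFFFFFFFFu);
}
```

An address such as 0x00007FF6ABCD1234 fails this check, which is why a full 8-byte placeholder (as used for the `mov rcx` and `mov rax` operands) is the safer choice for pointers.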

Once the 64 bit assembly compiled successfully with ml64, I put the resulting machine code into an array, and injected the array itself into the target process.
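
To illustrate the idea of treating assembled output as plain data, here is a small portable sketch. It is not the injection routine itself: `call_stub` is a shortened two-instruction example that reuses the `mov rax, imm64` and `call rax` encodings from the listing in this article, and `patch_u64` is a hypothetical helper that overwrites an 8-byte placeholder at a known offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* mov rax, 0CCCCCCCCCCCCCCCCh ; call rax */
static unsigned char call_stub[] = {
    0x48, 0xB8, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC,
    0xFF, 0xD0
};

/* Overwrite 8 bytes at `offset` with `value` in little-endian order,
 * the byte order x64 uses for immediates. */
static void patch_u64(unsigned char *code, size_t code_len,
                      size_t offset, uint64_t value)
{
    assert(code != NULL);
    assert(offset + 8 <= code_len);
    for (size_t i = 0; i < 8; i++)
        code[offset + i] = (unsigned char)(value >> (8 * i));
}
```

Calling `patch_u64(call_stub, sizeof call_stub, 2, address)` replaces the 0xCC filler with a real target while leaving the opcode bytes around it untouched.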

Using the Code

Following is the injection function with the machine code array it injects.

Note that Darawk's 32 bit code used VirtualProtect to make the injected code writable before patching it, since it was in the code segment. In our case, the code to inject
is an ordinary data array, which is already writable, so no such call is needed. The injection function patches this shared array in place, so you should consider running it under a lock to prevent clashes in case it can be run from more than one thread at a time.
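
One way to sidestep the clash without a lock, sketched here under assumptions, is to leave the template untouched and patch a private copy on every call. The names `stub_template` and `make_patched_copy` are hypothetical, and the template is a shortened `mov rcx, imm64` / `ret` pair rather than the full stub.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* mov rcx, 0BBBBBBBBBBBBBBBBh ; ret  (read-only, shared by all callers) */
static const unsigned char stub_template[] = {
    0x48, 0xB9, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB,
    0xC3
};

/* Copies the template into the caller's buffer and patches the 8-byte
 * operand at offset 2. Returns the stub length, or 0 if `out` is too small. */
static size_t make_patched_copy(unsigned char *out, size_t out_len, uint64_t arg)
{
    assert(out != NULL);
    if (out_len < sizeof stub_template)
        return 0;
    memcpy(out, stub_template, sizeof stub_template);
    for (size_t i = 0; i < 8; i++)
        out[2 + i] = (unsigned char)(arg >> (8 * i));   /* little-endian */
    return sizeof stub_template;
}
```

Because each caller owns its buffer, two threads can build stubs at the same time without seeing each other's addresses. The cost is one extra copy of a few dozen bytes per injection.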

unsigned char codeToInject[] =
{
    // Placeholder for the return address
    // (push imm32 is sign-extended to 64 bits, so only 4 bytes are patched here)
    0x68, 0xAA, 0xAA, 0xAA, 0xAA,   // push 0AAAAAAAAh
    // Save the flags
    0x9c,                           // pushfq
    // Save the registers
    0x50,                           // push rax
    0x51,                           // push rcx
    0x52,                           // push rdx
    0x53,                           // push rbx
    0x55,                           // push rbp
    0x56,                           // push rsi
    0x57,                           // push rdi
    0x41, 0x50,                     // push r8
    0x41, 0x51,                     // push r9
    0x41, 0x52,                     // push r10
    0x41, 0x53,                     // push r11
    0x41, 0x54,                     // push r12
    0x41, 0x55,                     // push r13
    0x41, 0x56,                     // push r14
    0x41, 0x57,                     // push r15
    // Placeholder for the string address and LoadLibrary
    0x48, 0xB9, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, // mov rcx, 0BBBBBBBBBBBBBBBBh
    0x48, 0xB8, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, // mov rax, 0CCCCCCCCCCCCCCCCh
    // Call LoadLibrary with the string parameter
    0xFF, 0xD0,                     // call rax
    // Restore the registers
    0x41, 0x5F,                     // pop r15
    0x41, 0x5E,                     // pop r14
    0x41, 0x5D,                     // pop r13
    0x41, 0x5C,                     // pop r12
    0x41, 0x5B,                     // pop r11
    0x41, 0x5A,                     // pop r10
    0x41, 0x59,                     // pop r9
    0x41, 0x58,                     // pop r8
    0x5F,                           // pop rdi
    0x5E,                           // pop rsi
    0x5D,                           // pop rbp
    0x5B,                           // pop rbx
    0x5A,                           // pop rdx
    0x59,                           // pop rcx
    0x58,                           // pop rax
    // Restore the flags
    0x9D,                           // popfq
    0xC3                            // ret
};

int WINAPI inject_lib_cave( HANDLE hProcess, HANDLE hThread, const char* lib_name )
{
    void*    dllString      = NULL;
    void*    stub           = NULL;
    DWORD64  stubLen, loadLibAddr;
    DWORD    oldIP;
    CONTEXT  ctx;
    BOOL     result         = FALSE;
    DWORD    suspend_result = (DWORD)-1;

    stubLen     = sizeof( codeToInject );
    loadLibAddr = (DWORD64)GetProcAddress( GetModuleHandleA("Kernel32"),
                   "LoadLibraryA" );

    // Allocate remote memory for the DLL path and for the stub
    dllString = VirtualAllocEx(hProcess, NULL, (strlen(lib_name) + 1),
                                   MEM_COMMIT, PAGE_READWRITE);
    stub      = VirtualAllocEx(hProcess, NULL, stubLen, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    if( dllString == NULL || stub == NULL )
    {
        // Remote allocations are released with VirtualFreeEx, not free()
        if(dllString != NULL) VirtualFreeEx( hProcess, dllString, 0, MEM_RELEASE );
        if(stub != NULL) VirtualFreeEx( hProcess, stub, 0, MEM_RELEASE );
        MessageBoxA( NULL, "Virtual Alloc failed.", "My Msg", MB_OK );
        goto clean_exit;
    }

    // Copy the path including its terminating NUL
    result = WriteProcessMemory(hProcess, dllString, lib_name, strlen(lib_name) + 1, NULL);
    if ( !result )
    {
        MessageBoxA( NULL, "Could not write process memory for dllString.",
                     "My Msg", MB_OK );
        goto clean_exit;
    }

    suspend_result = SuspendThread( hThread );
    if ( suspend_result == (DWORD)-1 )
    {
        MessageBoxA( NULL, "Could not suspend thread.",
                     "My Msg", MB_OK );
        result = FALSE;
        goto clean_exit;
    }

    ctx.ContextFlags = CONTEXT_CONTROL;
    result = GetThreadContext(hThread, &ctx);
    if ( !result )
    {
        MessageBoxA( NULL, "Could not get thread context.",
                     "My Msg", MB_OK );
        goto clean_exit;
    }

    // The stub begins with push imm32, which is sign-extended to 64 bits, so it
    // can only return to an address below 2 GB. Refuse to redirect otherwise.
    if ( ctx.Rip > 0x7FFFFFFF )
    {
        MessageBoxA( NULL, "Return address does not fit in 32 bits.",
                     "My Msg", MB_OK );
        result = FALSE;
        goto clean_exit;
    }
    oldIP   = (DWORD)ctx.Rip;
    ctx.Rip = (DWORD64)stub;    // keep all 64 bits of the stub address

    /*
     * Insert the addresses into the local copy of the codeToInject before copying it to
     * the remote process
     */
    memcpy( codeToInject + 1, &oldIP, sizeof( oldIP ) );
    memcpy( codeToInject + 31, &dllString, sizeof( dllString ) );
    memcpy( codeToInject + 41, &loadLibAddr, sizeof( loadLibAddr ) );

    result = WriteProcessMemory(hProcess, stub, codeToInject, stubLen, NULL);
    if ( !result )
    {
        MessageBoxA( NULL, "Could not write process memory.",
                     "My Msg", MB_OK );
        goto clean_exit;
    }

    result = SetThreadContext(hThread, &ctx);
    if ( !result )
    {
        MessageBoxA( NULL, "Could not set thread context.",
                     "My Msg", MB_OK );
        goto clean_exit;
    }

clean_exit:
    // SuspendThread returns (DWORD)-1 on failure; any other value means the thread is suspended
    if ( suspend_result != (DWORD)-1 )
    {
        suspend_result = ResumeThread( hThread );
        if ( suspend_result == (DWORD)-1 )
            MessageBoxA( NULL, "Could not resume thread.",
                         "My Msg", MB_OK );
    }
    return result;
}
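
The function patches the stub at the fixed offsets 1, 31 and 41, which have to be recounted by hand whenever the machine code changes. As a hedged alternative, the offsets could be found by scanning for the filler bytes. The sketch below is an assumption-laden illustration: `find_run` and `stub_prefix` are hypothetical names, `stub_prefix` holds only the first 49 bytes of the stub, and the approach works only while the filler values do not occur elsewhere in the code.

```c
#include <assert.h>
#include <stddef.h>

#define NOT_FOUND ((size_t)-1)

/* First 49 bytes of the stub: push imm32, pushfq, 15 register pushes,
 * mov rcx, imm64 and mov rax, imm64. */
static const unsigned char stub_prefix[] = {
    0x68, 0xAA, 0xAA, 0xAA, 0xAA, 0x9C,
    0x50, 0x51, 0x52, 0x53, 0x55, 0x56, 0x57,
    0x41, 0x50, 0x41, 0x51, 0x41, 0x52, 0x41, 0x53,
    0x41, 0x54, 0x41, 0x55, 0x41, 0x56, 0x41, 0x57,
    0x48, 0xB9, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB, 0xBB,
    0x48, 0xB8, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC
};

/* Returns the offset of the first run of `run` consecutive bytes equal to
 * `fill`, or NOT_FOUND when there is no such run. */
static size_t find_run(const unsigned char *code, size_t len,
                       unsigned char fill, size_t run)
{
    assert(code != NULL);
    if (run == 0 || run > len)
        return NOT_FOUND;
    for (size_t start = 0; start + run <= len; start++) {
        size_t i = 0;
        while (i < run && code[start + i] == fill)
            i++;
        if (i == run)
            return start;
    }
    return NOT_FOUND;
}
```

For this prefix, searching for four 0xAA bytes, eight 0xBB bytes and eight 0xCC bytes yields the same offsets the function hard-codes.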


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Rimon Orni
Software Developer, Concealium IT Security Ltd.
Israel
I am an experienced Software Developer working mainly in C++ and C. Worked mostly in Security - previously at Check Point and today at Concealium on Data Security in the Cloud.
These days developing mostly for Windows, but always happy to go back to good old Unix.

Comments and Discussions

Fixed shellcode
JeanneKamikaze, 12-Sep-15 6:06

Here is the assembly code that worked for me:
push dword 0x17171717 ; ret (high)
push dword 0x17171717 ; ret (low)
; no pushad on x64
push rax
push rbx
push rcx
push rdx
push rsi
push rdi
push rbp
push r8
push r9
push r10
push r11
push r12
push r13
push r14
push r15
push dword 0x23232323 ; align stack
mov rcx, 0x1818181818181818 ; dll path, using fastcall
mov rax, 0x1919191919191919 ; LoadLibrary
call rax
pop rax
pop r15
pop r14
pop r13
pop r12
pop r11
pop r10
pop r9
pop r8
pop rbp
pop rdi
pop rsi
pop rdx
pop rcx
pop rbx
pop rax

And the corresponding shellcode:
char load_dll_64[] =

Several comments:

- The rip register / return address is 8 bytes, not 4. Given that x64 has no push instruction that takes a 64-bit immediate, I push two dwords instead, first the high part and then the low part (so that the low part appears in a lower memory address). The two 0x17171717 values in the assembly code are the placeholders for the value of rip. Because the two values are separated by a 0x68 in the shellcode (0x68 = push), you need to copy the value of rip with two calls to memcpy().

- Notice the 'push dword 0x23232323' after pushing all of the registers. This is because the stack must be 16-byte aligned on x64 as described in the Calling Conventions section here. Note that the value 0x23232323 need not be patched. The point is to push one more stack slot (a push occupies 8 bytes in 64-bit mode) so that the stack is properly aligned.

- Similarly, notice the 'pop rax' instruction after calling LoadLibrary. This is to pop the dummy 0x23232323 value we previously pushed. The use of rax here is irrelevant, the point is popping the dummy value from the stack.
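
The alignment arithmetic behind the dummy push can be sketched as follows. This is an illustration under stated assumptions, not a guarantee: it assumes every push occupies 8 bytes (as it does in 64-bit mode) and that the value of rsp before the stub runs is known, which is not generally true for a thread suspended at an arbitrary point. `padding_slots` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Given rsp before the stub runs and the number of 8-byte pushes the stub
 * performs before its call instruction, returns how many extra 8-byte slots
 * (0 or 1) are needed so that rsp is a multiple of 16 at the call. */
static unsigned padding_slots(uint64_t rsp_before, unsigned pushes)
{
    assert(rsp_before % 8 == 0);
    uint64_t rsp_at_call = rsp_before - 8ULL * pushes;
    return (unsigned)((rsp_at_call % 16) / 8);
}
```

The stub above performs 17 pushes before the dummy one (two for the return address and fifteen registers). Starting from a 16-byte aligned rsp, that leaves the stack 8 bytes off, so one extra slot is needed, which matches the single dummy push.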

I got my code cave working on x64 now. Hope that helps :)