Fun project. I was inspired to indulge my asm/minimalism fetish with a slightly different approach. Rather than carrying the shellcode as data, I copy an assembler routine to the child process. This yields some minor advantages...
Just 40 bytes of machine-code.
Typical function prolog code isn't needed as there's nothing to preserve.
Typical function epilog code isn't needed as ExitProcess never returns.
Pointer-to-the-data patching isn't required because the code itself determines where in memory the _SELFDEL data is.
I forgot to mention... this works fine in a 32-bit app on a 64-bit OS. I went with notepad as my proxy because under WOW64, without specifying the path, CreateProcess will create the 32-bit notepad. Besides, I'm guessing the overhead for creating a notepad process is lower than for explorer... or maybe not, as explorer should already be running.
Actually, launching "explorer.exe" from a 32-bit app seems to result in the 32-bit version of "explorer" being launched. As for overhead, yeah, notepad is smaller but explorer is around a lot - so.. YMMV.
Sweet! Works like a charm! Very neat. Nice call/pop trick for finding the address of the current instruction. One question though - how did you figure out the size of the injected code (40 bytes) and the offset to the SELFDEL instance (0x23 from whatever's in ebp)? Thanks!
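One plausible reading of those two numbers, offered as an assumption rather than the author's own answer: if the routine opens with a 5-byte near call (opcode E8 plus a 32-bit displacement) to the very next instruction and then pops the return address into ebp, then ebp holds the routine's base address plus 5. If the _SELFDEL data is copied directly after the 40 bytes of code, it sits at base + 40, which is 40 - 5 = 35 = 0x23 bytes past ebp. The helper name below is hypothetical; the sketch only reproduces that arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the routine starts with "call next; next: pop ebp".
   A near call with a 32-bit displacement is 5 bytes long, so the
   popped value is the routine's base address plus 5. */
enum { CALL_INSN_SIZE = 5 };

/* Hypothetical helper: distance from the popped address to data that is
   assumed to be placed immediately after code_size bytes of code. */
static size_t data_offset_from_popped(size_t code_size)
{
    assert(code_size >= CALL_INSN_SIZE);
    return code_size - CALL_INSN_SIZE;
}
```

Under these assumptions, data_offset_from_popped(40) comes out as 0x23, matching the offset in the question. The 40 itself would be the assembled size of the routine.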
Have you seen the FILE_FLAG_DELETE_ON_CLOSE trick *really* work? By that I mean not only does the targeted exe get deleted, but the proxy also. All the code I've seen copies the proxy to a temp folder and then executes it... deleting the target while the proxy remains. I've yet to see a setup where both get deleted. I'm starting to think this trick is a case for Snopes.
Yep. You're right. The FILE_FLAG_DELETE_ON_CLOSE technique doesn't seem to work. I put together a quick sample and the proxy EXE didn't go away automatically. I also tried the sample from here: http://www.catch22.net/tuts/selfdel and had the same result. Looks like FILE_FLAG_DELETE_ON_CLOSE causes the file to be deleted only if the last handle to be closed on that file happens to be the one which was opened with FILE_FLAG_DELETE_ON_CLOSE set. In our case this proves to be pretty useless as the handle on the proxy executable which was opened with FILE_FLAG_DELETE_ON_CLOSE set will certainly not be the last handle to be closed on that file.
Maybe I'll have to update the conclusion section of the article with this new bit of information! Thanks!
Looks slick. I'm trying to accomplish the same thing, but in a 64-bit application. I've seen many examples of using the stack to unwind with parameters to do the work, but in x64 assembler that won't work because function parameters are passed in registers. I'm going to try fiddling with your sample, converting it to 64-bit assembler, and give it a go.
As I started doing a 64-bit conversion, it occurred to me to use a faster approach that works for both 32- and 64-bit flavors of Windows. Start with this 32-bit-only approach, but instead of embedding the code into the target application/process, create a binary resource from it. Then, any 32- or 64-bit host application need only extract and write the resource to disk, call CreateProcess and pass the path of the target exe to delete, then exit. The helper app then deletes the parent, then executes its own 32-bit self-destruct, and poof, all images are gone.
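The host side of that hand-off could look roughly like the sketch below. It rests on several assumptions: the helper has already been extracted to disk, it takes the path of the exe to delete as its only argument, and the names build_helper_cmdline and launch_helper are hypothetical. Resource extraction is left out, and the Windows-only part is guarded so the string-building part compiles anywhere.

```c
#include <stdio.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: builds  "<helper>" "<target>"  into out.
   Both paths are quoted so spaces survive command-line parsing.
   Returns 0 on success, -1 if the result does not fit in cap bytes. */
static int build_helper_cmdline(char *out, size_t cap,
                                const char *helper, const char *target)
{
    int n = snprintf(out, cap, "\"%s\" \"%s\"", helper, target);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

#ifdef _WIN32
/* Hypothetical wrapper: starts the extracted helper and hands it the
   path of the exe to delete. Returns 0 on success, -1 on failure. */
static int launch_helper(const char *helper, const char *target)
{
    char cmd[2 * MAX_PATH + 8];
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;

    if (build_helper_cmdline(cmd, sizeof cmd, helper, target) != 0)
        return -1;

    ZeroMemory(&si, sizeof si);
    si.cb = sizeof si;
    if (!CreateProcessA(NULL, cmd, NULL, NULL, FALSE, 0,
                        NULL, NULL, &si, &pi))
        return -1;

    /* The host does not wait for the helper; drop both handles. */
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return 0;
}
#endif
```

The host would call launch_helper with the helper's path and its own path, then exit right away. The helper probably has to retry the delete for a short while, since the host's image stays locked until the host process has actually ended.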
// Take a snapshot of all running threads
hThreadSnap = CreateToolhelp32Snapshot( TH32CS_SNAPTHREAD, 0 );
if( hThreadSnap == INVALID_HANDLE_VALUE )
{
    printf("error 1: %lu \n", GetLastError()); // GetLastError returns a DWORD
    return( FALSE );
}
// Fill in the size of the structure before using it.
te32.dwSize = sizeof(THREADENTRY32 );
// Retrieve information about the first thread,
// and exit if unsuccessful
if( !Thread32First( hThreadSnap, &te32 ) )
{
    printf("error 2: %lu \n", GetLastError()); // Show cause of failure
    CloseHandle( hThreadSnap ); // Must clean up the snapshot object!
    return( FALSE );
}
// Now walk the thread list of the system,
// and display information about each thread
// associated with the specified process
//if( te32.th32OwnerProcessID == dwOwnerPID )
// printf( "\n\n THREAD ID = 0x%08X",
// te32.th32ThreadID );
// printf( "\n base priority = %d", te32.tpBasePri );
// printf( "\n delta priority = %d", te32.tpDeltaPri );
// printf("\n---- tid = %d, pid=%d ---- ", te32.th32ThreadID, te32.th32OwnerProcessID);