
API Hooking Revealed Part 2 - Useful tips

By xryl669, 10 Mar 2005
 

Introduction

This is the second part of a series on building a thread deadlock detector. Please read the first article, A (working) implementation of API hooking (Part I), to understand the background. The next part, with a working deadlock detector, is Thread Deadlock detector.

The API hooked functions for a thread deadlock detector

To detect a deadlock, the following functions must be intercepted:

  • Thread functions (Create[Remote]Thread, [Suspend/Resume]Thread, ExitThread, TerminateThread, OpenThread).
  • Synchronization functions (WaitFor[Single/Multiple]Object[Ex], SignalObjectAndWait, [Set/Reset/Pulse]Event).
  • Synchronization objects creation (Create/Open[Mutex/Semaphore/Event], DuplicateHandle).
  • Synchronization objects deletion (CloseHandle).

To intercept these calls, I simply added the required functions to the code from the previous article. Please see that article for how to add functions to be hooked. The main idea is to declare a hook structure like this:

typedef struct
{
    char        szDLLName[MAX_PATH]; // The DLL name
    char        szFuncName[MAX_PATH];// The function name
    void *        pNewFunc;          // The new function pointer
    void *        pPrevFunc;         // The previous function pointer
    Flags        flags;              // The flags (hooked or not, etc...)
} HookStruct;

When the function is hooked in all modules, the previous (true) function pointer is saved in pPrevFunc, while the new function pointer (our hooking function) replaces the corresponding entry in each module's IAT (Import Address Table).
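To make the pointer swap concrete, here is a minimal, portable sketch that simulates it. It does not touch a real IAT: the import table is an ordinary array, and every name starting with Sketch is hypothetical and not part of the article's source. All functions share one signature only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*AnyFunc)(int);      /* one shared signature keeps the sketch short */

typedef struct {
    const char *szFuncName;       /* name of the imported function to hook */
    AnyFunc     pNewFunc;         /* replacement installed by the hook */
    AnyFunc     pPrevFunc;        /* original pointer, saved when hooking */
    int         hooked;           /* non-zero once the original has been saved */
} SketchHook;

typedef struct {
    const char *name;             /* imported symbol name */
    AnyFunc     target;           /* address the module calls through */
} SketchImportSlot;

/* Patch every slot whose name matches; returns the number of slots patched. */
int SketchInstallHook(SketchImportSlot *table, size_t count, SketchHook *hook)
{
    int patched = 0;
    size_t i;
    assert(table != NULL && hook != NULL);
    for (i = 0; i < count; ++i) {
        if (strcmp(table[i].name, hook->szFuncName) != 0)
            continue;
        if (table[i].target == hook->pNewFunc)
            continue;                          /* already patched */
        if (!hook->hooked) {
            hook->pPrevFunc = table[i].target; /* remember the true function */
            hook->hooked = 1;
        }
        table[i].target = hook->pNewFunc;      /* redirect future calls */
        ++patched;
    }
    return patched;
}

/* A stand-in "API" function and its replacement. */
int SketchOriginal(int x) { return x + 1; }
int SketchReplacement(int x);

SketchHook g_sketchHook = { "Original", SketchReplacement, NULL, 0 };

/* The replacement forwards to the saved original, then alters the result. */
int SketchReplacement(int x) { return g_sketchHook.pPrevFunc(x) * 10; }
```

After SketchInstallHook runs on a table containing an "Original" slot, calls through that slot reach SketchReplacement, which can still reach the true function through pPrevFunc. Slots with other names are left alone.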

Then in our function, we can simply call the previous function by converting the pPrevFunc pointer to the correct function's pointer signature. I defined a useful macro for that (the source code will be in part III).

#define CallFunction(X)    \
    ((Signature_##X)GetPreviousFunctionAddress(Index_##X))
// with the signature defined like this
#define Signature_CreateThread                   HANDLE (FAR PASCAL *)\
(LPSECURITY_ATTRIBUTES lpThreadAttributes, DWORD dwStackSize, \
LPTHREAD_START_ROUTINE lpStartAddress, LPVOID lpParameter, DWORD \
dwCreationFlags, LPDWORD lpThreadId)
// and the index like this
#define Index_CreateThread             3
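The token-pasting trick can be tried outside Windows with an ordinary function. The sketch below is only an illustration: GetPreviousFunctionAddress is implemented here as a hypothetical lookup into a small array (the article's real implementation is not shown), and AddNumbers, SketchSetPrevious, and SketchHookedAddNumbers are made-up names.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*GenericFunc)(void);   /* type-erased function pointer */

#define SKETCH_MAX_HOOKS 4
static GenericFunc g_previous[SKETCH_MAX_HOOKS];

/* Hypothetical stand-in: returns the saved original for a hook index. */
GenericFunc GetPreviousFunctionAddress(int index)
{
    assert(index >= 0 && index < SKETCH_MAX_HOOKS);
    return g_previous[index];
}

void SketchSetPrevious(int index, GenericFunc f)
{
    assert(index >= 0 && index < SKETCH_MAX_HOOKS);
    g_previous[index] = f;
}

/* Token pasting picks Signature_<name> and Index_<name> from the name. */
#define CallFunction(X) \
    ((Signature_##X)GetPreviousFunctionAddress(Index_##X))

#define Signature_AddNumbers int (*)(int, int)
#define Index_AddNumbers     2

/* The "original" function being hooked in this sketch. */
int AddNumbers(int a, int b) { return a + b; }

/* A hook body: call the saved original with its true signature,
   then change the result so the detour is visible. */
int SketchHookedAddNumbers(int a, int b)
{
    return CallFunction(AddNumbers)(a, b) + 1000;
}
```

Once SketchSetPrevious(Index_AddNumbers, (GenericFunc)AddNumbers) has stored the original, CallFunction(AddNumbers)(a, b) expands to a cast of the stored pointer back to int (*)(int, int) followed by a normal call.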

Okay, we can now call the original function, so let's start the real work.

The communication process

So, how do we inform the server that the "hooked" application is performing a monitored action? The simple answer is to fill a structure with the needed information and send it to the server through a WM_COPYDATA message. The needed information is application-dependent; in our case, it consists of:

  • The current thread handle (who is calling this function).
  • The current thread ID (the handle is not enough, see below).
  • The manipulated object ID/handle (what are we touching).
  • An additional argument (if any).
  • The current command (what function are we calling).
  • The object name, if any (useful for debugging only).
  • The call stack (required to find where the call came from).
  • The current timestamp (needed to check for the deadlocks).

The structure is then defined as:

typedef struct 
{
    void *          lAddress;     // The address pointer in the stack
    unsigned int    lFlags;       // Some flags
} StackPointer; 

typedef struct 
{ 
    HANDLE          hObjectID;    // The manipulated (e.g. waited-on) object handle
    HANDLE          hThread;      // The current thread handle
    DWORD           dwThreadID;   // The current thread ID
    unsigned int    lState;       // The current state, count, etc...
    HANDLE          hFlag;        // An additional argument (if any)
    Commands        Command;      // The current command
    char            sName[256];   // The object name (if any)
    LARGE_INTEGER   llTimestamp;  // The message timestamp
    StackPointer    xPointers[10];// The stack pointers 
                                  // (only 10 pointers are saved)
} CommunicationObject;
// with Commands being an enum like 
// CmdCreateThread = 0, CmdExitThread = 1, etc...
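WM_COPYDATA carries its payload in a COPYDATASTRUCT, which has three fields: dwData (an application-defined value), cbData (the payload size in bytes), and lpData (a pointer to the payload). The payload is only valid while the receiver processes the message, so the server has to copy it out. The sketch below mimics that packing and the server-side check in portable C. SketchCopyData, SketchMessage, SKETCH_TAG, and both functions are hypothetical stand-ins with a reduced field set, not the article's Communicate routine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the three fields of COPYDATASTRUCT (stand-in type). */
typedef struct {
    uintptr_t dwData;   /* application-defined tag */
    uint32_t  cbData;   /* payload size in bytes */
    void     *lpData;   /* payload pointer, valid only during the call */
} SketchCopyData;

/* Reduced stand-in for CommunicationObject. */
typedef struct {
    uint32_t dwThreadID;   /* calling thread ID */
    uint32_t lState;       /* state, count, etc. */
    int32_t  Command;      /* which hooked function was called */
    char     sName[256];   /* object name, if any */
    int64_t  llTimestamp;  /* filled in just before sending */
} SketchMessage;

#define SKETCH_TAG 0x54535059u   /* arbitrary value chosen for this sketch */

/* Client side: stamp the message and describe it for the transport. */
void SketchPack(SketchCopyData *out, SketchMessage *msg, int64_t now)
{
    assert(out != NULL && msg != NULL);
    msg->llTimestamp = now;
    out->dwData = SKETCH_TAG;
    out->cbData = (uint32_t)sizeof(*msg);
    out->lpData = msg;
}

/* Server side: accept only payloads with the expected tag and size,
   and copy them before the sender's buffer goes away.
   Returns 1 on success, 0 if the payload is rejected. */
int SketchUnpack(const SketchCopyData *in, SketchMessage *out)
{
    if (in == NULL || out == NULL || in->lpData == NULL)
        return 0;
    if (in->dwData != SKETCH_TAG || in->cbData != sizeof(*out))
        return 0;
    memcpy(out, in->lpData, sizeof(*out));
    return 1;
}
```

Checking both the tag and the size lets the server ignore WM_COPYDATA traffic from unrelated senders and reject a client built against a different structure layout.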

A hooking function will then look like:

// Hook installed in place of CreateThread
THREADSPY_API HANDLE WINAPI MyCreateThread(LPSECURITY_ATTRIBUTES lpThreadAttributes,
    DWORD dwStackSize, LPTHREAD_START_ROUTINE lpStartAddress,
    LPVOID lpParameter, DWORD dwCreationFlags, LPDWORD lpThreadId)
{
    DWORD dwID = 0;
    // Call the original CreateThread; always request the thread ID,
    // even if the caller passed NULL for lpThreadId
    HANDLE hHandle = CallFunction(CreateThread)(lpThreadAttributes,
        dwStackSize, lpStartAddress, lpParameter, dwCreationFlags, &dwID);
    if (hHandle != NULL)
    {
        CommunicationObject xObj;
        memset(&xObj, 0, sizeof(xObj));
        // Get the true current thread handle
        xObj.hThread    = GetTrueCurrentThread();
        // Get the manipulated handle (the new thread's handle here)
        xObj.hObjectID  = hHandle;
        // Store the new thread's ID as additional data
        xObj.lState     = dwID;
        // Save the caller thread ID
        xObj.dwThreadID = GetCurrentThreadId();
        // Give the caller the thread ID if it asked for it
        if (lpThreadId != NULL)
        {
            *lpThreadId = dwID;
        }
        // Set the command
        xObj.Command = CmdCreateThread;
        // Then send the structure (Communicate fills in the
        // timestamp and call stack first)
        Communicate(&xObj, sizeof(xObj));
        // Send another command if the thread was created suspended
        if (dwCreationFlags & CREATE_SUSPENDED)
        {
            xObj.lState  = 1;
            xObj.Command = CmdSuspendThread;
            CommunicateWithoutTime(&xObj, sizeof(xObj));
        }

        // We don't need the duplicated thread handle anymore
        if (xObj.hThread != NULL) CallFunction(CloseHandle)(xObj.hThread);
    }

    return hHandle;
}
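The hook above relies on GetTrueCurrentThread, whose source is not shown in this part. The conclusion mentions DuplicateHandle, so one plausible way to write it is sketched below. This is an assumption about the approach, not the author's code: SketchGetTrueCurrentThread and SketchReleaseThreadHandle are hypothetical names, and the non-Windows branch is only a stub so that the file compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

typedef HANDLE SketchHandle;

/* Turn the pseudo-handle returned by GetCurrentThread into a real handle
   by duplicating it inside the current process. Returns NULL on failure.
   The caller owns the result and must close it. */
SketchHandle SketchGetTrueCurrentThread(void)
{
    HANDLE hReal = NULL;
    HANDLE hProcess = GetCurrentProcess();
    if (!DuplicateHandle(hProcess, GetCurrentThread(), hProcess,
                         &hReal, 0, FALSE, DUPLICATE_SAME_ACCESS))
        return NULL;
    return hReal;
}

void SketchReleaseThreadHandle(SketchHandle h)
{
    if (h != NULL)
        CloseHandle(h);
}

#else  /* stub so the sketch still compiles on other platforms */

typedef void *SketchHandle;

SketchHandle SketchGetTrueCurrentThread(void) { return NULL; }
void SketchReleaseThreadHandle(SketchHandle h) { (void)h; }

#endif
```

Inside a hooking DLL, the final CloseHandle would have to go through the saved original, as the hook above does with CallFunction(CloseHandle), so that the detector does not report its own bookkeeping.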

The tricky part

Now we simply have to fill in the structure. It looks easy, but it is not, because of some gaps in the Win32 API. For example, GetCurrentThread returns a special pseudo-handle (CURRENT_THREAD_HANDLE) that only means "the calling thread", which is of no use to the server. This is a consequence of how Windows manages a HANDLE: the same kernel object can be referenced by several different handles, in different processes. A HANDLE therefore cannot uniquely identify a thread; we need its ID.

While any program can easily store thread handles and IDs together in a structure, there is no guarantee that the debuggee keeps such a mapping, so we need a way to get the thread ID from its HANDLE. For example, when a thread calls TerminateThread to kill another thread, it passes only the handle of the thread being killed, not its ID. Without the ID, the server could not tell which thread was killed (or it would need some kind of matching algorithm). Windows Server 2003 provides a function called GetThreadId that does this, but because it is only available on Win2K3, it is not useful here.

The solution to this issue is to use the hidden NtQueryInformationThread function from NTDLL.DLL (which is mapped into every process), as the Visual Studio debugger does. This function can return a CLIENT_ID structure that contains the thread ID. For more information, please see the Undocumented NT Internal.
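A minimal sketch of such a lookup follows. Several things here are assumptions: the information class value 0 (ThreadBasicInformation) and the structure layout are taken from commonly published descriptions of this function rather than from this article, and SketchThreadBasicInfo and SketchThreadIdFromHandle are hypothetical names. The handle must allow querying thread information. The non-Windows branch is only a stub so that the file compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>

/* Assumed layout of the ThreadBasicInformation result. The two HANDLE
   fields correspond to CLIENT_ID (UniqueProcess, UniqueThread). */
typedef struct {
    LONG      ExitStatus;
    PVOID     TebBaseAddress;
    HANDLE    UniqueProcess;
    HANDLE    UniqueThread;
    ULONG_PTR AffinityMask;
    LONG      Priority;
    LONG      BasePriority;
} SketchThreadBasicInfo;

typedef LONG (NTAPI *SketchNtQueryInformationThread)(
    HANDLE, int, PVOID, ULONG, PULONG);

/* Returns the thread ID behind hThread, or 0 if the query fails. */
uint32_t SketchThreadIdFromHandle(HANDLE hThread)
{
    SketchNtQueryInformationThread query;
    SketchThreadBasicInfo info;
    HMODULE ntdll = GetModuleHandleA("ntdll.dll");
    if (ntdll == NULL)
        return 0;
    query = (SketchNtQueryInformationThread)
        GetProcAddress(ntdll, "NtQueryInformationThread");
    if (query == NULL)
        return 0;
    /* 0 = ThreadBasicInformation (assumed); negative NTSTATUS = failure */
    if (query(hThread, 0, &info, (ULONG)sizeof(info), NULL) < 0)
        return 0;
    return (uint32_t)(uintptr_t)info.UniqueThread;
}

#else  /* stub so the sketch still compiles on other platforms */

uint32_t SketchThreadIdFromHandle(void *hThread)
{
    (void)hThread;
    return 0;
}

#endif
```

A quick sanity check on Windows is to pass GetCurrentThread() and compare the result with GetCurrentThreadId().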

Now that we can identify the object being manipulated, we need the stack trace. This can be done with the StackWalk function. Usually, StackWalk is used by suspending the debuggee thread, retrieving its context, and then resuming the thread. As we don't want to suspend the thread (doing so would change the scheduling order while debugging), we have to fill in a CONTEXT structure ourselves. The trick is to read the EIP and EBP registers before calling StackWalk, which takes only a few ASM instructions:

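    CONTEXT c;

    _EnterCriticalSection(&mhSection);
    // Ugly but effective: capture EIP and EBP for the stack trace
    __asm
    {
        call GetEIP          // pushes the address of the next instruction
        GetEIP:
        pop eax              // eax = current EIP
        mov c.Eip, eax
        mov c.Ebp, ebp       // frame pointer of the current function
    };
    _LeaveCriticalSection(&mhSection);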
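Why are EIP and EBP enough to start a walk? With the usual x86 prologue (push ebp / mov ebp, esp), the word at [ebp] holds the caller's saved EBP and the word at [ebp+4] holds the return address, so the frames form a linked list. The portable sketch below simulates that chain with ordinary structs to show the idea. It is not a replacement for StackWalk, which handles cases this ignores (for example functions compiled without a frame pointer), and all Sketch names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    void        *lAddress;   /* code address found in the stack */
    unsigned int lFlags;     /* unused in this sketch */
} SketchStackPointer;

/* One simulated stack frame: saved EBP followed by the return address. */
typedef struct SketchFrame {
    struct SketchFrame *savedEbp;
    void               *returnAddress;
} SketchFrame;

/* Record eip, then follow the saved-EBP chain, storing up to maxCount
   addresses in out. Returns the number of entries written. */
size_t SketchWalkFrames(const SketchFrame *ebp, void *eip,
                        SketchStackPointer *out, size_t maxCount)
{
    size_t n = 0;
    assert(out != NULL);
    if (maxCount == 0)
        return 0;
    out[n].lAddress = eip;        /* innermost location: where we are now */
    out[n].lFlags = 0;
    ++n;
    while (ebp != NULL && n < maxCount) {
        if (ebp->returnAddress == NULL)
            break;                /* end of a well-formed chain */
        out[n].lAddress = ebp->returnAddress;
        out[n].lFlags = 0;
        ++n;
        /* The stack grows downward, so a sane parent frame sits at a
           higher address; anything else means a corrupt chain. */
        if (ebp->savedEbp != NULL &&
            (uintptr_t)ebp->savedEbp <= (uintptr_t)ebp)
            break;
        ebp = ebp->savedEbp;
    }
    return n;
}
```

Limiting the walk to maxCount entries matches the article's choice of saving only 10 pointers per message.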

The last trick is to detect when a thread has stopped. Because the injected DLL cannot create a thread of its own without disturbing the program, it supplies a function called CheckRunningThread that has the same signature as a thread start routine: it takes a thread handle valid in the client's address space and returns a DWORD describing the thread state. Using the same method as for injecting the DLL (CreateRemoteThread), the server can suspend the process being debugged, create a remote thread that starts at CheckRunningThread, and read the thread status before resuming the debuggee. This way, the server can tell when a thread has finished and detect objects that were never released.

To conclude

This article answers several questions that are impossible to solve according to MSDN, such as how to:

  • Get a thread ID from its thread handle (Google around, and you'll see it is a real issue).
  • Get the stack trace of a running thread (again it is not a usual practice).
  • Get a real thread handle instead of the default generic value (using DuplicateHandle).
  • Detect when a thread in a remote process has finished.
  • Communicate with a server about any action.
  • Hook any Win32 API function.

The drawback is that it requires an NT-based kernel (such as XP, 2K, or 2K3), but I'm sure that is not an issue nowadays.

That covers the client part. I will provide the source code for both the server and the client in the next part (Part III). We will then see how to get the entire log of synchronization function calls in a process, and how to map call stack values to function names.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

xryl669

France

Comments and Discussions

 
TestApp does not work for me on Windows XP SP2 (Anthony Brenelière, 27-Dec-05 11:25)
I use Windows XP SP2.
 
The demo does not work,
 
So I re-compiled the source code with Visual Studio 8.0, I also re-compiled the driver with Windows XP DDK.
 
Functions are not hooked on my computer. When I launch HookSrv.exe, TestApp displays 'Hello from TestApp' and not the text 'Do you recon this is the original text ?' as it should.
 
Is there another version? Or fixes?
 
Any help is welcome,
 
Cordially
 
Anthony Brenelière
Re: TestApp does not work for me on Windows XP SP2 (Vitoto, 2-Jan-06 8:16)
Update your XP SP2 installation to the latest updates; the problem may come from SP2 security changes.
 

Re: TestApp does not work for me on Windows XP SP2 (xryl669, 23-Feb-06 4:49)
The expected text is "Hello from TestApp".
 
I think you should RTFA again.
 
Whatever your operating system, the expected behaviour of the test app is that it can deadlock.
It is not a guaranteed deadlock, so on your system it may never deadlock because the timing is too tight on one thread (look at the TestApp source code to see what I mean). It is a possible deadlock, the kind that is impossible to explain without such a tool.
 
The idea is to build a deadlock detector (a tool that should be undetectable from the source), not to modify the program's behaviour.

Question: Where is the input focus? (Free to Go, 21-Mar-05 9:39)
Pls help!
 
I am interested in knowing whether there is any way to locate the "keyboard cursor". Finding the topmost window is not difficult, but I have no idea which control within a child window has the keyboard focus, nor where the keyboard cursor is.
 
I wonder whether you have any ideas on
1) which child window has the keyboard focus,
2) which control within this child window has the keyboard focus, and
3) the exact location (x,y) of the input cursor or keyboard cursor
 

 
Regards,
 
Joe

Answer: Re: Where is the input focus? (xryl669, 1-Apr-05 23:27)
I don't see the link with this article.
 
I have no idea how to do this except by hooking WM_SETFOCUS messages and keeping track of the last window selected (but this is done with a Windows message hook, not API hooking).
 
Sincerely,
X-Ryl669
Question: Deadlocks on Timeable Objects? (Blake Miller, 1-Feb-05 10:18)
Can't you catch WAIT_FAILED and WAIT_TIMEOUT on the types of system calls (WaitFor*) discussed in this article?
 
So, if your thread sees WAIT_TIMEOUT then you might have some kind of 'deadlock'.
Maybe I am just 'different' for never using INFINITE in the argument for the wait timeout.
 
I would have thought it most important to catch EnterCriticalSection and LeaveCriticalSection, since you can not specify a timeout for EnterCriticalSection.
 
I still really like the ideas presented here, however.

Answer: Re: Deadlocks on Timeable Objects? (xryl669, 1-Feb-05 22:39)
Blake Miller wrote:
Can't you catch WAIT_FAILED and WAIT_TIMEOUT on the types of system calls (WaitFor*) discussed in this article?
 
So, if your thread sees WAIT_TIMEOUT then you might have some kind of 'deadlock'.
Maybe I am just 'different' for never using INFINITE in the argument for the wait timeout

In almost all my software, I use a timeout when waiting for an object. However, it is sometimes really difficult to avoid infinite loops (what should you do if an object can't be acquired after x attempts, for example in a DirectShow filter waiting for more data from the network, when it is not possible to stop the application?). Anyway, using a timeout is usually far from optimal, as it makes a thread's execution time unpredictable. I believe that a good algorithm should never have to time out a wait.

I would have thought it most important to catch EnterCriticalSection and LeaveCriticalSection, since you can not specify a timeout for EnterCriticalSection.
 
I still really like the ideas presented here, however.

You can use the TryEnterCriticalSection function for critical sections, without needing a timeout. I think, however, that it is a very good idea not to have timeouts on critical sections (because of the code complexity needed to handle them). By design, critical sections are supposed to be rare and quick.

Re: Deadlocks on Timeable Objects? (Blake Miller, 3-Feb-05 4:58)
I believe that a good algorithm could never have to timeout a wait.
In an ideal world, the user never trips over the Ethernet cable, the phone line does not spontaneously disconnect, the PLC does not lose power or reboot, etc. Mostly, I always have timeouts where communications are involved so that I can recover from such situations where ongoing communication is expected, but perhaps not forthcoming. INFINITE is a long time to wait...
I suppose, then, in such cases, the fact that the thread gets some time after a wait times out is 'unexpected' but at least accounted for as part of the design of the system. In our process control software, I have had to rework several designs that previously used INFINITE and led to a lot of customer dissatisfaction as a result. A thread locked at INFINITE that will never receive the signal(s) it is waiting for is a hung thread: not deadlocked, just hung.
Answer: Re: Deadlocks on Timeable Objects? (Bruce Ricker, 14-Dec-05 7:19)
Hi
 
When a CriticalSection blocks (which is the case we are worried about here), the kernel creates an Event and blocks the thread on that Event until the CriticalSection is released by its owner. That means CriticalSections are automatically covered by this solution, if I am not mistaken.
 
C ya
Bruce

Last Updated 10 Mar 2005
Article Copyright 2005 by xryl669