Compiler Internals - How Try/Catch/Throw are Interpreted by the Microsoft Compiler

Yanick Salzmann, 31 Mar 2011
An article on how Try/Catch/Throw are interpreted by the Microsoft Compiler

Introduction

Before I start, I would like to mention something important. Always keep in mind that this article is not (and is not meant to be) a detailed reference on x86 exception handling; it is a first introduction to it. Some things are probably not represented exactly as they are done in the real world, but the article sheds some light on how it is done conceptually. Covering every detail would mean writing an article for years.

A lot of people think that try - catch is a concept that is completely analyzed at compile time and therefore has no big impact at runtime. Most of them come from Java, where the compiler reports an error when an exception is not caught, and they do not realize that this error only exists because the developer listed the possible exceptions in the function declaration. So I thought I might shed some light on how the Microsoft compiler (cl.exe) deals with try, catch and throw. First off, we will look at how a throw statement is interpreted by the compiler. Let's have a look at the following code:

int main()
{
 try
 {
  throw 2;
 }
 catch(...)
 {
 }
}

Throw

For now, we are only interested in the throw 2. When the compiler hits the throw statement, it has no clue whether the exception will be handled by an exception handler (and it doesn't care). The throw statement is converted into a call to _CxxThrowException, which is exported by MSVCR100.dll (or whichever version of the runtime you use). The compiler knows this function intrinsically, but you can also call it yourself if you like ;). Its first parameter is a pointer to the object thrown. It therefore becomes clear that the code above conceptually expands to the following:

int main()
{
 try
 {
  int throwObj = 2;
  throw throwObj;
 }
 catch(...)
 {
 }
}

The second parameter of _CxxThrowException is a pointer to a _ThrowInfo object. _ThrowInfo is also a type built into the compiler. It is a struct holding various information about the type of the exception that was thrown. It looks like this:

typedef const struct _s__ThrowInfo
{
 unsigned int attributes;
 _PMFN pmfnUnwind;
 int (__cdecl*pForwardCompat)(...);
 _CatchableTypeArray *pCatchableTypeArray;
} _ThrowInfo;

The important thing here is the _CatchableTypeArray. It holds runtime type information for every type as which this throw can be caught. In our case, that's pretty simple: the only catchable type is typeid(int). Now let's say you have a class my_exception derived from std::exception. If you throw an object of type my_exception, pCatchableTypeArray has two entries: one is typeid(my_exception) and the other is typeid(std::exception).
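The effect of the catchable types can be observed from plain C++ without touching any compiler internals. The following is only a minimal sketch: the class my_exception follows the example in the text, while the function names are made up for this illustration.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical exception class, named after the example in the text.
class my_exception : public std::exception
{
public:
 const char* what() const noexcept override { return "my_exception"; }
};

// Only a base class handler is offered. It still matches, because
// std::exception is one of the catchable types of my_exception.
std::string catch_with_base_handler()
{
 try
 {
  throw my_exception();
 }
 catch(const std::exception&)
 {
  return "std::exception";
 }
 return "not caught";
}

// An int has exactly one catchable type, so the std::exception handler
// is skipped and the int handler is chosen.
std::string catch_int()
{
 try
 {
  throw 2;
 }
 catch(const std::exception&)
 {
  return "std::exception";
 }
 catch(int)
 {
  return "int";
 }
 return "not caught";
}

// Usage: the handler that receives the exception depends on the catchable types.
void demo_catchable_types()
{
 assert(catch_with_base_handler() == "std::exception");
 assert(catch_int() == "int");
}
```

The sketch says nothing about how the table is laid out in memory; it only shows the observable behaviour that the table makes possible.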

The compiler now emits the _ThrowInfo object (and all the other objects it needs) as global variables. In the above case, this is done in the following way:

_TypeDescriptor tDescInt = typeid(int);
_CatchableType tcatchInt = 
{
 0,
 &tDescInt,
 0,
 0,
 0,
 0,
 NULL,
};
_CatchableTypeArray tcatchArrInt = 
{
 1,
 &tcatchInt,
};
_ThrowInfo tiMain1 = 
{
 0,
 NULL,
 NULL,
 &tcatchArrInt
};

You see that quite a lot of information is stored just for the throw 2. So finally our code above expands to:

_TypeDescriptor tDescInt = typeid(int);
_CatchableType tcatchInt = 
{
 0,
 &tDescInt,
 0,
 0,
 0,
 0,
 NULL,
};
_CatchableTypeArray tcatchArrInt = 
{
 1,
 &tcatchInt,
};
_ThrowInfo tiMain1 = 
{
 0,
 NULL,
 NULL,
 &tcatchArrInt
};
int main()
{
 try
 {
  int throwObj = 2;
  _CxxThrowException(&throwObj, &tiMain1);
 }
 catch(...)
 {
 }
}

Inside _CxxThrowException, the following happens: the necessary parameters are created and then RaiseException is called. The exception code for an exception thrown by _CxxThrowException is 0xE06D7363. Three arguments are passed along to RaiseException: a magic number, the pointer to the object thrown and the pointer to the _ThrowInfo. This results in the following pseudo code:

__declspec(noreturn) void __stdcall _CxxThrowException(void* pObj, _ThrowInfo* pInfo)
{
 struct Params { unsigned int magic; void* object; _ThrowInfo* info; };
 Params throwParams = 
 {
  0x19930520,
  pObj,
  pInfo
 };
 // 1 is EXCEPTION_NONCONTINUABLE, 3 is the number of arguments that follow
 RaiseException(0xE06D7363, 1, 3, (const ULONG_PTR*)&throwParams);
}
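The two constants can be examined in portable C++. The sketch below is a simplified model that assumes the three-argument layout described above; the names kCppExceptionCode, kCppMagicNumber, code_suffix and looks_like_cpp_throw are invented for this illustration and are not part of the runtime.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Values quoted in the article.
const std::uint32_t kCppExceptionCode = 0xE06D7363;
const std::uint32_t kCppMagicNumber = 0x19930520;

// Returns the low three bytes of an exception code as ASCII text.
std::string code_suffix(std::uint32_t code)
{
 std::string text;
 for(int shift = 16; shift >= 0; shift -= 8)
  text += static_cast<char>((code >> shift) & 0xFF);
 return text;
}

// Hypothetical check: does a record (code plus argument array) look like
// the result of a C++ throw as described in the text?
bool looks_like_cpp_throw(std::uint32_t code, const std::uintptr_t* args, unsigned count)
{
 return code == kCppExceptionCode && count == 3 && args[0] == kCppMagicNumber;
}

// Usage: build the argument array the way the pseudo code above does.
void demo_throw_parameters()
{
 int thrown = 2;
 const std::uintptr_t args[3] =
 {
  kCppMagicNumber,
  reinterpret_cast<std::uintptr_t>(&thrown),
  0 // the pointer to the _ThrowInfo would go here
 };
 assert(looks_like_cpp_throw(kCppExceptionCode, args, 3));
 assert(!looks_like_cpp_throw(0xC0000005u, args, 3)); // access violation code
 assert(code_suffix(kCppExceptionCode) == "msc");
}
```

The low three bytes of 0xE06D7363 are the ASCII characters "msc", which makes the code easy to recognize in a debugger.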

Now we basically know how throw is handled by the compiler. We also see that what you end up with looks much like an access violation, because access violations are delivered through the same structured exception mechanism that RaiseException uses.

Catch

OK, if we now go further and inspect try and catch, a bell should be ringing like crazy and yelling "Wait! You say that throw gets transformed into a call to RaiseException, just like what happens for access violations, divisions by zero and so on?! But those cannot be caught with try-catch!". And yes, you are right, they can't (at least not unless you compile with /EHa), and that's why try - catch in fact gets transformed into a __try __except, but in a special form. In code, it would look somewhat like this (it's not real code, just theory):

unsigned long __stdcall mainHandler1(LPEXCEPTION_POINTERS info)
{
 if(info->ExceptionRecord->ExceptionCode != 0xE06D7363)
  return EXCEPTION_CONTINUE_SEARCH;
 if(WeHaveAHandlerForThisTypeSomeWhere(info->ExceptionRecord))
  return EXCEPTION_EXECUTE_HANDLER;
 return EXCEPTION_CONTINUE_SEARCH;
}
/* The stuff with _ThrowInfo comes here, omitted for readability */
int main()
{
 __try
 {
  int throwObj = 2;
  _CxxThrowException(&throwObj, &tiMain1);
 }
 __except(mainHandler1(GetExceptionInformation()))
 {
 }
}

But that's not all! Somewhere we need to store which types of exceptions our catch statement can catch. In fact, the catch statement gets transformed into its own function (actually only a function chunk that the runtime jumps to using jmp, not a real function called with call), which looks like this (now it's really pseudocode, because I cannot translate it to C without adding information that would blow up the whole thing):

_s_FuncInfo* info = mainCatchBlockInfo1;
__asm { mov eax, info } // Used as the argument of the following function and passed through eax
goto CxxFrameHandler3;

The _s_FuncInfo is again a structure that is built into the compiler. It would make the article too big to explain everything like I did for _ThrowInfo. In short, it holds information about every type that can be caught in the current block. Among other things, this consists of the runtime type information for every such type and, for each of them, the address of the actual code inside the catch block.

OK, now what does CxxFrameHandler3 do? This is pretty simple:

  1. It rejects exceptions that don't have 0xE06D7363 (the code for C++ exceptions) as their exception code.
  2. It searches through the _s_FuncInfo structure to find a type which matches one of the types it gets from the exception object's _CatchableTypeArray.
  3. If it gets a match, it indicates that there is a handler ready.
  4. If there is no match, it instructs the OS to continue the search in the next frame.
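The four steps can be modelled in portable C++. This is only a sketch of the idea, not the real handler: HandlerEntry, find_handler and the use of std::type_index in place of the runtime's own type descriptors are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <stdexcept>
#include <typeindex>
#include <typeinfo>
#include <vector>

const std::uint32_t kCppExceptionCode = 0xE06D7363;

// Hypothetical stand-in for one catch clause of a try block.
struct HandlerEntry
{
 std::type_index type; // type named in the catch clause
 int handlerId;        // stands in for the address of the catch code
};

// Returns the id of the first matching handler, or -1 to continue the search.
int find_handler(std::uint32_t code,
                 const std::vector<std::type_index>& catchableTypes,
                 const std::vector<HandlerEntry>& handlers)
{
 if(code != kCppExceptionCode)
  return -1; // step 1: not a C++ exception
 for(const HandlerEntry& handler : handlers) // step 2: handlers in source order
 {
  for(const std::type_index& type : catchableTypes)
  {
   if(type == handler.type)
    return handler.handlerId; // step 3: a handler is ready
  }
 }
 return -1; // step 4: search in the next frame
}

// Usage: one try block with catch(std::exception&) followed by catch(int).
void demo_find_handler()
{
 const std::vector<HandlerEntry> handlers =
 {
  { std::type_index(typeid(std::exception)), 1 },
  { std::type_index(typeid(int)), 2 }
 };
 const std::vector<std::type_index> forInt = { std::type_index(typeid(int)) };
 const std::vector<std::type_index> forRuntimeError =
 {
  std::type_index(typeid(std::runtime_error)),
  std::type_index(typeid(std::exception))
 };
 assert(find_handler(kCppExceptionCode, forInt, handlers) == 2);
 assert(find_handler(kCppExceptionCode, forRuntimeError, handlers) == 1);
 assert(find_handler(0xC0000005u, forInt, handlers) == -1);
}
```

Iterating over the handlers in the outer loop keeps the source order of the catch clauses, which is why the std::runtime_error ends up in the std::exception handler.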

To finish the catch part, all we need now is the actual handler code. This code is also transformed into a function chunk (not a complete function), namely the kind of chunk that ends a function. In code, it would look like this:

// execute handler code
return addressWhereToContinueAfterCatch;

The operating system gets the address it should jump to once it has set up the original context again, and then performs that jump. An example:

catch(...)
{
}
MessageBox(0, L"Ello!", L"", MB_OK);

This gets translated into the following assembler code:
.text:00401088 $LN16:
.text:00401088                 mov     eax, offset $LN9
.text:0040108D                 retn
.text:0040108E ; ------------------------------------------------------------------------
.text:0040108E
.text:0040108E $LN9:                                   ; DATA XREF: _main:$LN16 o
.text:0040108E                 push    0               ; uType
.text:00401090                 push    offset Caption  ; lpCaption
.text:00401095                 push    offset Text     ; "Ello!"
.text:0040109A                 push    0               ; hWnd
.text:0040109C                 call    ds:__imp__MessageBoxW@16 ; MessageBoxW(x,x,x,x)

You see that it returns $LN9 in eax, which is the address of the code that calls MessageBox. $LN16 is the address of the catch block, which is referenced somewhere in the _s_FuncInfo.
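The idea of a chunk that runs the handler code and then hands back the place where execution continues can be modelled with function pointers. This is only a conceptual sketch in portable C++; Continuation, catch_chunk, after_catch and run_handler are hypothetical names, and a real chunk returns a raw code address rather than a function pointer.

```cpp
#include <cassert>
#include <string>

// Records the order in which the pieces run.
std::string g_log;

// Stand-in for "the address where to continue after the catch".
using Continuation = void (*)();

// Models the code that follows the catch block ($LN9 in the listing).
void after_catch()
{
 g_log += "continue;";
}

// Models the catch chunk ($LN16): run the handler body, then return
// the address at which execution should resume.
Continuation catch_chunk()
{
 g_log += "handler;";
 return &after_catch;
}

// Models the caller of the chunk: invoke it, then go to the returned address.
void run_handler(Continuation (*chunk)())
{
 Continuation resume = chunk();
 resume();
}

// Usage: the handler body runs first, then the code after the catch.
void demo_continuation()
{
 g_log.clear();
 run_handler(&catch_chunk);
 assert(g_log == "handler;continue;");
}
```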

Try

All that remains now is the try part. Here the compiler can no longer "decide" how to do things, because the operating system dictates how it works.

Inside the Thread Information Block, the first field (fs:[0]) holds a pointer to a linked list of exception handlers (in our case, the handler is the address of the chunk that jumps to CxxFrameHandler3). What try does is add that handler to the linked list. After the RaiseException call, we arrive in the function KiUserExceptionDispatcher. This function does a lot of work, but in the end the important thing is that it loads the current linked list from the TIB using fs:[0] and loops through it, calling each handler until it finds one that says it can handle the exception. If you want to browse through the currently attached handlers, you can do the following:

struct LinkedExceptionFrame
{
 LinkedExceptionFrame* pPrevious;
 void* pFunction;
};
LinkedExceptionFrame* pCur = NULL;
__asm
{
 mov eax, fs:[0]
 mov pCur, eax
}
while((DWORD)pCur != 0xFFFFFFFF)
{
 std::cout << pCur->pFunction << std::endl;
 pCur = pCur->pPrevious;
}
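The snippet above only works in a 32-bit build with the Microsoft compiler. The registration and the search themselves can be modelled in portable C++, as in the following sketch. Frame, FrameGuard, g_chainHead and dispatch are hypothetical names for this illustration, and the model ends the chain with nullptr instead of 0xFFFFFFFF.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one entry of the handler list.
struct Frame
{
 Frame* previous;                    // next outer entry, nullptr ends the chain here
 bool (*canHandle)(std::uint32_t);   // stand-in for the handler function
 int id;
};

// Models fs:[0]: the head of the chain of the current thread.
thread_local Frame* g_chainHead = nullptr;

// Models what entering and leaving a try block does to the chain.
class FrameGuard
{
public:
 FrameGuard(bool (*canHandle)(std::uint32_t), int id)
  : frame_{ g_chainHead, canHandle, id }
 {
  g_chainHead = &frame_; // push on entry
 }
 ~FrameGuard() { g_chainHead = frame_.previous; } // pop on exit
 FrameGuard(const FrameGuard&) = delete;
 FrameGuard& operator=(const FrameGuard&) = delete;
private:
 Frame frame_;
};

// Walks the chain from the innermost entry outward and returns the id of
// the first entry that accepts the code, or -1 if nobody does.
int dispatch(std::uint32_t code)
{
 for(Frame* cur = g_chainHead; cur != nullptr; cur = cur->previous)
 {
  if(cur->canHandle(code))
   return cur->id;
 }
 return -1;
}

bool only_cpp(std::uint32_t code) { return code == 0xE06D7363u; }
bool accepts_all(std::uint32_t) { return true; }

// Usage: the innermost matching entry wins, others are skipped.
void demo_handler_chain()
{
 assert(dispatch(0xE06D7363u) == -1);
 FrameGuard outer(accepts_all, 1);
 {
  FrameGuard inner(only_cpp, 2);
  assert(dispatch(0xE06D7363u) == 2);
  assert(dispatch(0xC0000005u) == 1);
 }
 assert(dispatch(0xE06D7363u) == 1);
}
```

Because each entry lives on the stack of the function that registers it, pushing and popping costs only a few instructions, but it happens every time the try block is entered, whether or not anything is thrown.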

Now we have all the basic concepts we need to understand that try/catch/throw is not as trivial as most people think and that most things are actually handled at runtime (with a large amount of additional data and function overhead generated to catch the correct type of exception). There is way more we could talk about, for example: what if some parts of our frame are protected by try-catch and others are not, or what if we have more than one try-catch block? But I think the most important things have been said!

Some tips if you like to browse through it using a disassembler and a debugger:

Use a Release build, but disable every kind of code optimization. That way you don't get all the register checks at the beginning and the end of function calls, and your code does not get rearranged by the optimizer, so you can compare it to the source more easily. It's also a good idea to disable Dynamic Base (ASLR) in the linker options (under Advanced).

History

  • 31st March, 2011: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Yanick Salzmann
Student
Switzerland
Article Copyright 2011 by Yanick Salzmann