As a C++ programmer, I usually end up writing code for products that are either add-on applications or third-party DLLs for a main application. So, as a developer, it's important for me to ensure that my code does not degrade the performance of the main application or product. Generally, I use commercial products like IBM Rational Product Suite or BoundsChecker. But at times, these commercial tools are not useful, because the code (DLL) I write runs inside the main application, and the main application cannot be modified for profiling. What I need is the time taken by each function in my code and the number of times that function is called. In short, I want to do an impact analysis of my code so that I get input for performance improvements and can ensure that the overall performance of the main application or product is maintained.
I happened to discuss this problem with a friend who is also an experienced C/C++ programmer. He pointed out that the Visual Studio C++ compiler has two flags which can be used to hook every function: /Gh and /GH. The /Gh flag causes a call to the _penter function at the start of every method or function, and the /GH flag causes a call to the _pexit function at the end of every method or function. So, if I write some code in these functions to find out the caller function, I can gather the stack trace information. Also, if I start a timer in the _penter function and stop the corresponding timer in the _pexit function, I can roughly measure the time taken by the method or function to execute. But it's very important to understand the _penter and _pexit functions before writing any code. MSDN states that the _penter and _pexit functions are not part of any library, and it is up to the developer to provide a definition for _penter and _pexit. So, I decided to write my own DLL which provides the definitions of _penter and _pexit and exports these two functions. The prototypes are as follows:
void __declspec(naked) _cdecl _penter( void);
void __declspec(naked) _cdecl _pexit( void);
These functions are declared as naked, so the compiler generates no prolog or epilog code for them; the implementation itself should push the content of all the registers on entry and pop the unchanged content on exit. The _cdecl keyword only selects the C calling convention. Also, objects cannot be instantiated inside the function body, and only global or static variables can be used there. We can only call global or static methods from inside the function body. Keeping all this in mind, I created a singleton Profiler class which has the necessary methods to collect the time profiling data. I used the C++ inline assembler feature to implement _penter and _pexit. The sample implementation of _penter is as follows:
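extern "C" void __declspec(naked) _cdecl _penter( void )
{
    _asm {
        pushad                      // Prolog: save all registers (8 x 4 = 32 bytes)
        mov  eax, esp
        add  eax, 32                // skip the saved registers to reach the return address
        mov  eax, dword ptr[eax]    // load the return address
        sub  eax, 5                 // step back over the 5-byte call to _penter
        // ... pass eax to enterFunc, then restore the registers (Epilog) and return
    }
}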
The interesting part of the implementation is how a non-naked global function, i.e., enterFunc, is called with the virtual address of the caller function as an argument. The prolog instruction saves the registers on the stack, so the code first skips those 32 bytes (eight 32-bit registers) by adding 32 to the current stack pointer. It then reads the return address stored at that location with a pointer operation. Subtracting 5 bytes, the length of the call instruction that invoked _penter, gives a virtual address inside the body of the caller function. This virtual address is passed as an argument to the global function, which does the further processing. The time keeping part is solved with this arrangement. But what about the function name? How do I get the function name from a virtual address inside a function body?
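The time keeping done behind the two hooks can be sketched in portable C++. This is only a minimal sketch, not the article's Profiler class: the names TimingTable, Frame, FunctionStats, onEnter, and onExit are hypothetical, std::chrono::steady_clock stands in for whatever clock the profiler uses, and a single thread is assumed. The entry hook pushes a frame with a start time; the exit hook pops it, adds the elapsed time to the function's total, and charges the same time to the parent frame as child time.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <vector>

// Assumption: steady_clock is used here for illustration only.
using Clock = std::chrono::steady_clock;

// Hypothetical per-function totals.
struct FunctionStats {
    std::uint64_t calls = 0;               // number of completed calls
    std::chrono::nanoseconds total{0};     // time including child functions
    std::chrono::nanoseconds children{0};  // time spent in child functions
};

// Hypothetical record of one function call that has not returned yet.
struct Frame {
    std::uintptr_t address;                // address handed over by the entry hook
    Clock::time_point start;               // when the function was entered
    std::chrono::nanoseconds children;     // child time accumulated so far
};

class TimingTable {
public:
    // Called from the entry hook: start a timer for this function.
    void onEnter(std::uintptr_t address, Clock::time_point now = Clock::now()) {
        stack_.push_back(Frame{address, now, std::chrono::nanoseconds{0}});
    }

    // Called from the exit hook: stop the timer of the most recent entry.
    void onExit(Clock::time_point now = Clock::now()) {
        if (stack_.empty()) return;        // unmatched exit, nothing to record
        const Frame frame = stack_.back();
        stack_.pop_back();
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::nanoseconds>(now - frame.start);
        FunctionStats& s = stats_[frame.address];
        ++s.calls;
        s.total += elapsed;
        s.children += frame.children;
        if (!stack_.empty()) stack_.back().children += elapsed;  // charge the parent
    }

    const std::map<std::uintptr_t, FunctionStats>& stats() const { return stats_; }

private:
    std::vector<Frame> stack_;             // real code would need one stack per thread
    std::map<std::uintptr_t, FunctionStats> stats_;
};
```

Self time for a function is then total minus children. Note that this sketch counts the time of a recursive function once per nesting level, so recursive totals are overstated.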
I solved the function name problem by using the DIA (Debug Interface Access) SDK. The DIA SDK has a unified model to access or query any symbol and its properties from PDB files. So, to use this profiler, it is mandatory that the DLL or EXE has debugging information (a PDB file).
The getFunc method in the Profiler class finds the function name for a given virtual address. The process is as follows:
- Get the current process handle
- Get all the loaded modules of the current process
- Check if the given virtual address belongs to any module, using the module's load address and address space size
- If the given virtual address belongs to a module, get the module file path from the module handle
- Load the PDB file from the module file path using the loadDataForExe method
- Use the openSession method of IDiaDataSource to get an IDiaSession, and use the put_loadAddress method of IDiaSession to set up the symbol database for the query
- Query the IDiaSession object using the findSymbolByVA method, which returns an IDiaSymbol, i.e., the function having the given virtual address in its body
- Get the function name from the IDiaSymbol object
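The module check in the third step above is plain address arithmetic and can be sketched in portable C++. This is a minimal sketch under stated assumptions: ModuleRange and findModule are hypothetical names, and the list of modules is assumed to have been filled in already from the loaded modules of the process. The DIA calls themselves are not shown.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one loaded module.
struct ModuleRange {
    std::string path;            // module file path (DLL or EXE)
    std::uintptr_t loadAddress;  // address at which the module is loaded
    std::uintptr_t size;         // size of the module's address space in bytes
};

// Returns the module whose range [loadAddress, loadAddress + size) contains
// the given virtual address, or nullptr if no module contains it.
inline const ModuleRange* findModule(const std::vector<ModuleRange>& modules,
                                     std::uintptr_t address) {
    for (const ModuleRange& m : modules) {
        // Comparing the offset against the size avoids overflow in loadAddress + size.
        if (address >= m.loadAddress && address - m.loadAddress < m.size) {
            return &m;
        }
    }
    return nullptr;
}
```

The path of the module found this way is what the later steps would hand to the PDB loading code, and its load address is the value the symbol session needs before it can be queried by virtual address.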
Using the Code
The Profiler described in this article uses the DIA SDK, so if you want to use this profiler, then it's mandatory that the project (LIB/DLL/EXE) to be profiled should generate debugging information.
To generate debugging information for a project, go to the project's General property page under the C/C++ tab and set Debug Information Format to Program Database (/Zi).
Also, on the Debugging property page under the Linker tab, set Generate Debug Info to Yes (/DEBUG). These two settings ensure that a PDB file is created for the project to be profiled.
After setting the project to generate a PDB file, add the /Gh and /GH flags to Additional options on the Command Line property page under the C/C++ tab.
Add profiler.lib (the export library from the profiler project) to Additional Dependencies on the Input property page under the Linker tab. This setting is required for the profiler to work, because profiler.lib/dll provides the definitions of _penter and _pexit.
Provide the path to profiler.lib in Additional Library Directories of the Linker property page. This setting would be developer dependent.
If the user wants to view the profiling result in the form of a CSV file, then set the PROFILER_LOG environment variable with the CSV file path. After the complete run of the main application, the profiling data is saved to the specified CSV file.
The profiler CSV file contains, for each executed function or method, its name, how many times it was called, the total time taken by the function and its child functions in milliseconds, the time taken by the function itself (self time), and the time taken by its child functions.
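How one row of such a file relates its columns can be sketched as follows. This is a minimal sketch, not the profiler's actual writer: ProfileRow, selfTimeMs, and toCsvLine are hypothetical names, and the column order is assumed to follow the description above. Self time is computed as total time minus child time.

```cpp
#include <sstream>
#include <string>

// Hypothetical in-memory form of one CSV row.
struct ProfileRow {
    std::string name;     // function or method name
    unsigned long calls;  // how many times it was called
    double totalMs;       // time of the function plus its child functions, in ms
    double childMs;       // time spent in child functions, in ms
};

// Self time is what remains after removing the child functions' time.
inline double selfTimeMs(const ProfileRow& r) {
    return r.totalMs - r.childMs;
}

// Formats the row as: name,calls,total,self,child (assumed column order).
inline std::string toCsvLine(const ProfileRow& r) {
    std::ostringstream out;
    out << r.name << ',' << r.calls << ',' << r.totalMs << ','
        << selfTimeMs(r) << ',' << r.childMs;
    return out.str();
}
```

Function names that contain commas, such as template instantiations, would need quoting before being written this way.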
This profiler relies on __declspec(naked) and inline assembly, which the Visual C++ compiler supports only for 32-bit (x86) targets, so the current profiler will not be useful for native 64-bit applications.
Scope for Improvement
The Simple Profiler discussed in this article is complete in itself, but it can be further improved. I can think of the following ways to improve it or make it more developer friendly:
- A multimedia timer can be used instead of the default clock to get more precise timings
- A memory profiler can be added
- A Visual Studio add-in or macro can be created for quick and all time profiling for a complete solution
This article was first posted to The Code Project on 15 Dec. 2009.
C/C++ practitioner with more than 5 years of experience in 3D Visualization.