Posted 1 Dec 2005


Performance improvement for VC2005 CRT (x86/IA64)

Updated 6 Dec 2005
The CRT of VC2005 contains a performance decrease; this article describes the changes needed to get a "faster" CRT.



This article is based on a micro-benchmark which has nothing to do with real-world applications! The code shown here only speeds up one small function (_getptd_noexit) in the CRT. The general improvements in the VC2005 compiler / CRT are very good, and I strongly recommend moving to VS2005 as soon as you can!


A micro-benchmark of a single CRT function that uses TLS (Thread Local Storage) shows a performance decrease in VC2005 compared to VC2003. The implementation can be improved, which increases the performance of _getptd() by about 18%.

This article shows the improvements that can be made inside the _getptd_noexit() function, which is called by _getptd(). To use these changes, you need to either link against the static CRT or recompile the whole CRT.

This function plays a major role in TLS (or, more precisely, FLS: Fiber Local Storage) access. FLS is used in many places inside the CRT, for example by all functions that need to store internal data for subsequent calls (like strtok) and by functions that depend on locale settings. The CRT stores a pointer to an internal data structure in the FLS to save all this information separately for each fiber/thread. So it is very important for this function to be very fast!
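
To illustrate why functions like strtok need per-thread data, here is a minimal portable sketch. It does not show the CRT's actual implementation: PerThreadData, perThreadData(), tokenize() and split() are hypothetical names, and standard C++ thread_local is used as a stand-in for the FLS slot that the CRT manages itself.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the CRT's per-thread data block.
struct PerThreadData {
    char* nextToken = nullptr;  // where the next tokenize() call resumes
};

// Returns the calling thread's own instance (assumption: thread_local
// replaces the FLS lookup that _getptd() performs in the real CRT).
PerThreadData& perThreadData() {
    thread_local PerThreadData data;
    return data;
}

// strtok-like tokenizer whose hidden state lives in per-thread storage.
char* tokenize(char* text, const char* delimiters) {
    assert(delimiters != nullptr);
    PerThreadData& ptd = perThreadData();
    char* cursor = text ? text : ptd.nextToken;
    if (!cursor) return nullptr;
    cursor += std::strspn(cursor, delimiters);    // skip leading delimiters
    if (*cursor == '\0') {                        // nothing left to return
        ptd.nextToken = nullptr;
        return nullptr;
    }
    char* token = cursor;
    cursor += std::strcspn(cursor, delimiters);   // find the end of the token
    if (*cursor != '\0') *cursor++ = '\0';        // terminate it, step past
    ptd.nextToken = cursor;                       // remember for the next call
    return token;
}

// Splits a copy of `input` using the tokenizer above.
std::vector<std::string> split(std::string input, const char* delimiters) {
    std::vector<std::string> tokens;
    for (char* t = tokenize(&input[0], delimiters); t != nullptr;
         t = tokenize(nullptr, delimiters)) {
        tokens.push_back(t);
    }
    return tokens;
}
```

Every call to tokenize() has to fetch the per-thread block first, which is the part that corresponds to _getptd() in the CRT and the reason its cost shows up in so many functions.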

If you want to have this performance improvement in a service pack or a future release of VC, then please vote for it!

Changes to the _getptd_noexit() function

After analyzing the call to _getptd_noexit(), I found that TlsGetValue is called twice for no reason. This is what the following code changes improve.
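
The general idea can be modelled without any Windows API. The following is only a sketch under assumptions: lookupFn(), cachedFn, readSlot() and lookupCount are hypothetical names that simulate a TLS slot holding a function pointer, so that the number of lookups per call can be counted.

```cpp
#include <cassert>

// Hypothetical model; none of these names are CRT identifiers.
using GetValueFn = int (*)(int);

int lookupCount = 0;             // counts simulated TlsGetValue-style lookups
GetValueFn cachedFn = nullptr;   // simulated TLS slot with the function pointer

// Stand-in for the real accessor function.
int readSlot(int index) { return index * 2; }

// Each call models one trip through the slot lookup.
GetValueFn lookupFn() {
    ++lookupCount;
    return cachedFn;
}

// Redundant pattern: one lookup to check the slot, a second one to use it.
int getTwice(int index) {
    if (lookupFn() == nullptr) cachedFn = readSlot;
    return lookupFn()(index);
}

// Improved pattern: keep the result of the first lookup in a local variable.
int getOnce(int index) {
    GetValueFn fn = lookupFn();
    if (fn == nullptr) {
        fn = readSlot;
        cachedFn = fn;           // fill the slot for later calls
    }
    assert(fn != nullptr);
    return fn(index);
}
```

Both variants return the same value, but getOnce() performs a single lookup per call, which is the effect the change to _getptd_noexit() aims for.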

Changes to tidtable.c

A few changes need to be made to the file tidtable.c in the CRT source directory (normally located in: %ProgramFiles%\Microsoft Visual Studio 8\VC\crt\src). You need to replace lines 546-553 with the following code:

#ifndef _M_AMD64
  /* x86/IA64: keep the FlsGetValue pointer in a local variable */
  PVOID flsGetValue;
  TL_LastError = GetLastError();
  flsGetValue = FLS_GETVALUE;
  if (!flsGetValue)
  {
     /* not yet stored for this thread: decode it and store it in TLS */
     flsGetValue = _decode_pointer(gpFlsGetValue);
     TlsSetValue(__getvalueindex, flsGetValue);
  }
  if ((ptd = ((PFLS_GETVALUE_FUNCTION)flsGetValue)(__flsindex)) == NULL) {
#else
  /* AMD64 */
  TL_LastError = GetLastError();
  if ( (ptd = FLS_GETVALUE(__flsindex)) == NULL ) {
#endif

In general, you have two options for using this change in your project:

  • Rebuild the CRT
  • Link against the static CRT and just recompile the tidtable.c file.

The recompilation of the CRT is explained in an article by Michael S. Kaplan and will not be discussed here. Instead, I'll explain a simple way to add the tidtable.c file to your project and use your modified version of this file. To do this, follow these steps:

  1. Create a Win32-console application.
  2. Change the project settings to use the static CRT (not the DLL version!).
  3. Remove "Precompiled-Headers" from your project.
  4. Change the linker settings to use no default libraries.
  5. Add libcmt(d).lib to the additional libs.
  6. Copy tidtable.c from the CRT-source directory into your project directory and add it to your project.
  7. Make the modification to tidtable.c as explained above.
  8. Add the following at the top of the tidtable.c file:
    #define _CRTBLD
  9. If you have a UNICODE-build then you also need to replace "LoadLibrary" with "LoadLibraryA" and "GetModuleHandle" with "GetModuleHandleA" in the whole file.
  10. Right-click on the tidtable.c file and select Properties.
  11. Add an additional include path to this file: "$(DevEnvDir)\..\..\VC\crt\src".
  12. Apply the above changes to both the Debug and Release project settings.
  13. Rebuild all.

You can download the sample project where all these steps are done for you (except for steps 6-9, due to copyright restrictions).

I hope this (or a similar) modification finds a place in the next service pack of VS8...

Origin of this article

Starting from a German newsgroup thread ("Schleifenlaufzeit VS2003 vs. VS2005", i.e. "Loop run time VS2003 vs. VS2005", Mon, 28 Nov 2005 17:01:02 +0100, from Kai Huebner), I had to dig deeper to find the reason why the following (micro-benchmark) code was slower when compiled with VC2005 than with VC2003:

double d = 0; 
for (int i=0; i<5000000; i++) 
 d += rand();

The complete analysis can be found here.
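
To reproduce such a measurement yourself, the loop can be wrapped in a small timing helper. This is only a sketch: sumRand(), timeSumRand() and the Timing struct are hypothetical names introduced here, and std::chrono::steady_clock is my choice of timer, not necessarily what the original analysis used.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>

// Sums `iterations` calls to rand(), as in the loop above.
double sumRand(int iterations) {
    double d = 0;
    for (int i = 0; i < iterations; i++)
        d += std::rand();
    return d;
}

// Result of one timed run.
struct Timing {
    double sum;           // returned so the loop cannot be optimized away
    double milliseconds;  // wall-clock time of the loop
};

// Runs the loop once and measures it with a monotonic clock.
Timing timeSumRand(int iterations) {
    assert(iterations >= 0);
    auto start = std::chrono::steady_clock::now();
    double sum = sumRand(iterations);
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {sum, elapsed.count()};
}
```

Calling timeSumRand(5000000) with both compiler versions and comparing the milliseconds field gives a rough comparison; repeating the run several times and taking the minimum reduces noise.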

The origin of this solution

After finding out what was going "wrong" and why the implementation had changed, I tried to improve the code by reducing the number of calls to TlsGetValue and also the number of instructions. My first version was improved (better "inlined") by "Ted". The resulting code is the subject of this article.


History

  • 2005-12-01
    • First public release.
  • 2005-12-06
    • Added a link to the lady-bug entry.


This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



About the Author

Jochen Kalmbach [MVP VC++]
Software Developer (Senior)
Germany
1982: My first computer (VC20)
1984: Finished building my first own computer (Z80)
1993: Mission-Volunteer in Papua New Guinea
1998: Dipl. Inform. (FH)
... working, working, working....
