
What Every Computer Programmer Should Know About Windows API, CRT, and the Standard C++ Library

By Alex Blekhman, 22 Aug 2008
 

1. The Purpose

The purpose of this article is to clarify the essential points about the Windows API, the C Runtime Library (CRT), and the Standard C++ Library (often called STL). Even experienced developers are often confused or hold misconceptions about how these parts relate to each other. If you have ever wondered what is implemented on top of what and never had the time to figure it out, keep reading.

2. Basics

The following diagram represents the relationship between WinAPI, CRT, and STL.

Diagram #1: The relationship between Windows API, CRT, and the C++ Standard Library


Adjacent blocks can communicate with each other. What does it mean? Let's go from the bottom to the top.

2.2 Hardware

Each hardware part exposes its own set of commands that enables the Operating System to control and communicate with it. The number and complexity of the commands vary from part to part. Often, different vendors of the same part provide additional commands beyond the requirements of a common standard. Communicating directly with countless hardware devices, each with its own variety of commands, would be an enormous burden for software writers. Here, the Operating System comes to the rescue.

2.3 Operating System

The purpose of the OS is to encapsulate all the intricacies of the underlying hardware and provide a unified access interface to the computer's parts. No application can access the hardware directly. Only the OS can access the hardware. The part of the OS that accesses the hardware is said to run in kernel mode.

Older OSs like MS-DOS, for example, allowed programs to access hardware resources directly. Though it enabled software writers to make certain performance gains, in the long run, this technique often made the software very brittle, and incompatible with newer hardware parts.

2.4 Application Programming Interface

The OS exposes the underlying machine resources by means of an Application Programming Interface (API). An API is a uniform set of functions that enables software developers to abstract from hardware peculiarities and focus on their own goals. An application cannot bypass the OS and access hardware resources directly. It is commonly said that applications run in user mode. MS Windows provides an API as a set of C functions. The C language is chosen as the lowest common denominator for software development under the Windows platform.

2.4.1 Platform Software Development Kit

MS distributes a free Platform Software Development Kit (Platform SDK or PSDK), which enables software developers to write Windows programs. The PSDK contains:

  1. Header files with API function declarations
  2. Import lib files to link with (where calls to API functions are redirected to the relevant DLLs)
  3. Documentation
  4. Various binary helper tools

For example, to open or create a file, we call the CreateFile function, which is declared in the "WinBase.h" header file and requires the "Kernel32.lib" library to link with.

The names of Windows API functions follow the upper camel case (Pascal case) naming convention, for example CreateFile, which usually makes them easy to recognize. Names of macros and constants are conventionally in uppercase. Each function has a "Requirements" section on its documentation page where the necessary headers, import libraries, and supported OS versions are specified.

A Windows application can call any API function, provided the application follows the function's signature and links with the appropriate import library (or gets the function's address directly from the implementing DLL with the GetProcAddress call).

2.5 C Runtime Library

On top of the OS API functions, software vendors implement the C Runtime Library (CRT). The CRT is a standardized set of header files and C functions that implement common tasks such as string operations, some math functions, basic input/output, etc. Usually, the same vendor that makes the C compiler also provides the CRT implementation. The International Organization for Standardization (ISO) is responsible for the C language standard and its runtime library.

2.5.1 Standards and Extensions

Theoretically, by using only standard C functions, the developer can ensure that the same code can be built and run on any platform where a decent C compiler and CRT implementation exist. In practice, however, software vendors include many useful extensions to the standard library functions, which make developers' lives easier, but at the price of portability.

The names of CRT functions are in lower case. The names of macros and constants are in uppercase. The names of extensions begin with the underscore character; for example, the _mkdir function. Each function always has a "Requirements" section on its documentation page where its header is specified.
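
To make the portability cost concrete, here is a minimal sketch of a case-insensitive comparison. `_stricmp` is a Microsoft CRT extension (note the leading underscore); `strcasecmp` is assumed here to be the POSIX counterpart available on non-Windows targets. The wrapper name `compare_ignore_case` is invented for this illustration only.

```cpp
#include <string.h>      // standard string functions; also declares _stricmp in the Microsoft CRT
#ifndef _WIN32
#include <strings.h>     // strcasecmp (POSIX; assumed to be available on non-Windows targets)
#endif

// Hypothetical helper: hides the vendor-specific function behind one portable name.
// Returns 0 when both strings are equal ignoring case, like strcmp otherwise.
int compare_ignore_case(const char* left, const char* right)
{
#ifdef _WIN32
    return _stricmp(left, right);    // Microsoft extension, not part of standard C
#else
    return strcasecmp(left, right);  // POSIX function, not part of standard C either
#endif
}
```

Code that calls `_stricmp` directly builds only with a CRT that provides this extension; code that calls the wrapper needs just one conditional block to be maintained.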

2.6 Unicode Awareness

2.6.1 Platform SDK is Already Unicode Aware

Actually, the Win32 API names mentioned above are not the real names. They are mere macros that are defined in the PSDK header files. So, when the PSDK documentation mentions a function, for example CreateFile, a developer should be aware that CreateFile is a macro. The true names of the CreateFile function are CreateFileA and CreateFileW. Yes, there are two versions, rather than one, of many Win32 API functions. The version that ends with 'A' accepts ANSI character strings, i.e., strings of regular char elements. The other version ends with 'W' (the so-called "wide" version) and accepts Unicode character strings, i.e., strings of wchar_t elements. Both versions are implemented within the kernel32.dll module. The CreateFile macro expands into the CreateFileW name if the UNICODE symbol is defined for a project, and into the CreateFileA name otherwise.
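
The macro mechanism can be imitated in a few lines. The sketch below does not use the real PSDK headers; `DemoTextLengthA`, `DemoTextLengthW`, and the `DemoTextLength` macro are hypothetical names that only mirror the 'A'/'W' naming scheme described above.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical pair of functions that follows the 'A'/'W' naming scheme.
std::size_t DemoTextLengthA(const char* text)    { return std::strlen(text); }
std::size_t DemoTextLengthW(const wchar_t* text) { return std::wcslen(text); }

// The generic name is only a macro; the preprocessor selects the real function.
#ifdef UNICODE
#define DemoTextLength DemoTextLengthW
#else
#define DemoTextLength DemoTextLengthA
#endif

// Calls the generic name; the literal type has to match the selected version.
std::size_t generic_hello_length()
{
#ifdef UNICODE
    return DemoTextLength(L"Hello");
#else
    return DemoTextLength("Hello");
#endif
}
```

Because the selection happens in the preprocessor, a debugger or a linker map never shows the generic name, only the 'A' or 'W' function that was actually called.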

There are three families of Windows OS: MS-DOS/9x-based, Windows CE, and Windows NT.

  1. The MS-DOS/9x-based family, which includes Windows 1.0-3.11, 95, 98, and Windows ME, is based on the MS-DOS OS. The earlier versions of Windows, 1.0-2.0, are true 16-bit OSs. The newer versions, 3.0, 95, 98, and ME, are the so-called hybrid 16/32-bit OSs. They are 16-bit at the low level, but capable of running 32-bit programs with certain limitations. One of these limitations is that only the ANSI version of the Win32 API functions exists on this platform. Currently, the MS-DOS/9x-based family is extinct and unsupported by Microsoft.
  2. The Windows NT family started with Windows NT 3.1 in the early 1990s and includes Windows NT 4, Windows 2000, Windows XP, Windows Vista, and the Server flavors of these OSs. The Windows NT family is true 32-bit. It supports both the ANSI and Unicode versions of the Win32 API. The Windows NT family operates with Unicode strings internally. The ANSI version of a Win32 API function is a mere wrapper around the real worker, the Unicode version of the function.
  3. The Windows CE family is intended for mobile and embedded devices. It is true 32-bit. Windows CE supports only the Unicode version of the Win32 API.

2.6.2 PSDK Solution: TCHARs

In order to avoid multiple PSDKs for different Windows families, Microsoft implemented generic text characters or TCHARs. TCHAR and other relevant macros are defined in the WinNT.h header file. The main idea is that the developer never uses the char or wchar_t types explicitly, but uses the TCHAR macro instead. The TCHAR macro will expand into the appropriate character type depending on whether the UNICODE symbol is defined for a build. In the same manner, instead of calling the 'A' or 'W' version of a Win32 API function, the developer calls a generic macro version, which will accommodate the actual character type at compile time.

// Generic code
//
LPCTSTR psz = TEXT("Hello World!");
TCHAR szDir[MAX_PATH] = { 0 };
GetCurrentDirectory(MAX_PATH, szDir);

// What actually happens if UNICODE symbol is NOT defined for a build
//
const char* psz = "Hello World!";
char szDir[MAX_PATH] = { 0 };
GetCurrentDirectoryA(MAX_PATH, szDir);

// GetCurrentDirectoryA is a wrapper. It does the following:
// 1. Allocates temporary wchar_t buffer of given size.
// 2. Calls real worker: GetCurrentDirectoryW.
// 3. Calls WideCharToMultiByte in order to convert wchar_t string into
//    char string according to the active code page for a calling thread.
//    If some character cannot be converted, then it will be replaced with the '?' symbol.

// What actually happens if UNICODE symbol is defined for a build
//
const wchar_t* psz = L"Hello World!";
wchar_t szDir[MAX_PATH] = { 0 };
GetCurrentDirectoryW(MAX_PATH, szDir);
// direct call to real worker; no wrappers in the middle

Using TCHARs allows a developer to maintain a single code base for both ANSI and Unicode builds. Nowadays, if you do not intend to target the old Windows 9x/Me platforms, you can safely forget about TCHARs, use Unicode strings everywhere, and make Unicode-only builds. As an added bonus, Unicode applications can forget about the code page hassle and use the same logic for all strings.
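
The lossy conversion step performed by an ANSI wrapper can be illustrated with a simplified stand-in. The function below is not WideCharToMultiByte; it is a hypothetical helper that assumes only 7-bit ASCII characters are representable in the target code page and replaces everything else with '?', which is the visible effect described in the code comments above.

```cpp
#include <string>

// Simplified stand-in for the wide-to-narrow conversion step.
// Assumption: only characters below 0x80 can be represented in the target code page.
std::string narrow_with_question_marks(const std::wstring& wide)
{
    std::string narrow;
    narrow.reserve(wide.size());
    for (std::wstring::size_type i = 0; i < wide.size(); ++i) {
        const unsigned long code = static_cast<unsigned long>(wide[i]);
        // Characters that cannot be represented are replaced, and the original value is lost.
        narrow.push_back(code < 0x80 ? static_cast<char>(code) : '?');
    }
    return narrow;
}
```

With this sketch, a wide string that contains one Cyrillic letter between `a` and `z` comes back as `a?z`. A real code page can represent more than ASCII, but the principle is the same: whatever falls outside the active code page is irreversibly replaced.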

An easy way to remember PSDK string declarations is to say them out loud:

            L P C T STR = const TCHAR* 
            ^ ^ ^ ^ ^ 
            | | | | | 
Long -------+ | | | | 
Pointer to ---+ | | | 
Constant -------+ | | 
TCHAR ------------+ | 
STRing -------------+

Sometimes L - "Long" is omitted, since long and short pointers are obsolete for the Win32 platform. So, typedef can look like PTSTR = "pointer to TCHAR string", which is just TCHAR*.
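
The naming scheme can be reproduced with ordinary typedefs. The sketch below deliberately uses hypothetical `DEMO_` names instead of the real PSDK ones, so it compiles without the Windows headers; it assumes a C++11 compiler for `static_assert` and `<type_traits>`.

```cpp
#include <type_traits>

// Hypothetical equivalent of TCHAR: the character type depends on the UNICODE symbol.
#ifdef UNICODE
typedef wchar_t DEMO_TCHAR;
#else
typedef char DEMO_TCHAR;
#endif

typedef const DEMO_TCHAR* DEMO_LPCTSTR; // Long Pointer to Constant TCHAR STRing
typedef DEMO_TCHAR*       DEMO_LPTSTR;  // Long Pointer to TCHAR STRing
typedef DEMO_TCHAR*       DEMO_PTSTR;   // Pointer to TCHAR STRing ("Long" omitted)

// "Reading the name out loud" gives exactly the underlying type.
static_assert(std::is_same<DEMO_LPCTSTR, const DEMO_TCHAR*>::value,
              "LPCTSTR is a pointer to constant TCHAR");

// On a flat 32-bit address space, long and short pointers are the same type.
bool long_and_short_pointers_match()
{
    return std::is_same<DEMO_LPTSTR, DEMO_PTSTR>::value;
}
```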

Here are two screenshots of the same program. The first screenshot is taken when the program is built as ANSI. The second screenshot demonstrates the Unicode build of the program.

Naive ANSI program from the 20th century. All non-English characters are converted into illegible '?' symbols.

A modern Unicode program is aware of other languages.

2.6.3 CRT Solution: _TCHARs

Following the Platform SDK logic, Microsoft introduced generic text mapping into its C runtime library. The CRT uses an additional header file, "tchar.h", to define the generic character macros. In order to comply with the requirements of the C language standard, all non-standard names start with an underscore. Also, the CRT uses the shorter _T() macro for literal strings instead of the longer TEXT() macro, which is defined in "WinNT.h". The CRT authors decided to take the generic text notion even further, and as a result, the CRT distinguishes three modes for text characters:

  • SBCS - The Single Byte Character Set. The classic char is used for strings. One ASCII character fits within one char element. No symbol has to be defined for a project. This is the traditional C language approach that has survived from the 1970s to the present day. English characters are represented with the values 0x00 - 0x7F; non-English characters are represented with the values 0x80 - 0xFF. The actual meaning of non-English characters is interpreted according to the currently active code page.
  • _MBCS - The Multi-Byte Character Set. The classic char is used for strings. One multi-byte symbol may require one or two char elements. The _MBCS symbol has to be defined for a project. _MBCS is backward compatible with the SBCS mode, and was the default choice for new projects in MS Visual C++ until version 8.0 (2005). _MBCS was commonly used for East Asian languages like Japanese, Korean, and Chinese, and it was the only feasible option for handling these languages on Windows 9x/Me platforms. Now, _MBCS has been mostly ousted by Unicode.
  • _UNICODE - The Unicode Character Set. The wchar_t type is used for strings. One Unicode symbol occupies one wchar_t element, which is 16-bit on the Windows platform and can represent 65,536 different values. This is the default mode for new projects starting from version 8.0 (2005) of MS Visual C++.

CRT uses the _MBCS and _UNICODE symbols definition in order to distinguish between multi-byte and Unicode builds.
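
The difference between bytes and characters in a multi-byte string can be sketched as follows. This is an illustration rather than the CRT implementation: it hard-codes the lead-byte ranges of Shift-JIS (code page 932) as an assumed example encoding, and the function names are hypothetical. The Microsoft CRT provides its own routines for this purpose, such as `_ismbblead` and `_mbslen`.

```cpp
#include <cstddef>

// Lead-byte ranges of Shift-JIS (code page 932), used here as an assumed example encoding.
bool is_lead_byte(unsigned char byte)
{
    return (byte >= 0x81 && byte <= 0x9F) || (byte >= 0xE0 && byte <= 0xFC);
}

// Counts characters, not bytes, in a zero-terminated double-byte string.
std::size_t count_characters(const char* text)
{
    std::size_t count = 0;
    while (*text != '\0') {
        const unsigned char byte = static_cast<unsigned char>(*text);
        // A lead byte followed by a trail byte forms one two-byte character.
        text += (is_lead_byte(byte) && text[1] != '\0') ? 2 : 1;
        ++count;
    }
    return count;
}
```

For a string made of `A`, a two-byte sequence starting with the lead byte 0x82, and `B`, `strlen` reports four bytes while `count_characters` reports three characters. This is why SBCS routines such as `strlen` cannot simply be reused for _MBCS text.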

Diagram #2: The Generic Text Mapping in CRT

Generic-text data type or name | SBCS (_UNICODE, _MBCS not defined) | _MBCS defined     | _UNICODE defined
-------------------------------+------------------------------------+-------------------+------------------
_TCHAR                         | char                               | char              | wchar_t
_T("Hello, World!")            | "Hello, World!"                    | "Hello, World!"   | L"Hello, World!"
Function name prefix: _tcs     | str, _str                          | _mbs              | wcs, _wcs
Example: _tcscat, _tcsicmp     | strcat, _stricmp                   | _mbscat, _mbsicmp | wcscat, _wcsicmp

// Generic code; names are not standard, hence the leading underscore.
//
_TCHAR message[128] = _T("The time is: ");
_TCHAR* now = _tasctime(&tm);
_tcscat(message, now);
_putts(message);

// What happens if no symbol is defined at all (SBCS).
//
char message[128] = "The time is: ";
char* now = asctime(&tm);
strcat(message, now);
puts(message);

// What happens if _MBCS symbol is defined (Multi-byte Character Set);
// non-standard names are with the leading underscore.
//
char message[128] = "The time is: ";
char* now = asctime(&tm);
_mbscat(message, now);
puts(message);

// What happens if _UNICODE symbol is defined (Unicode Character Set);
// non-standard names are with the leading underscore.
//
wchar_t message[128] = L"The time is: ";
wchar_t* now = _wasctime(&tm);
wcscat(message, now);
_putws(message);
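
The fragments above leave out the declaration of the time structure. A complete version of the SBCS variant might look like the sketch below; the function name `build_time_message` is invented for this example, and it uses standard names only. A C++11 compiler is assumed for `nullptr`.

```cpp
#include <cstring>
#include <ctime>
#include <string>

// Builds "The time is: <date and time>" for the given time value (SBCS variant).
std::string build_time_message(std::time_t when)
{
    char message[128] = "The time is: ";
    const std::tm* parts = std::localtime(&when);   // convert to local calendar time
    if (parts != nullptr) {
        const char* now = std::asctime(parts);      // fixed-format text ending with '\n'
        if (now != nullptr) {
            // Bounded concatenation, so the buffer can never overflow.
            std::strncat(message, now, sizeof(message) - std::strlen(message) - 1);
        }
    }
    return message;
}
```

A caller would typically pass `std::time(nullptr)` to get a message for the current moment.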

2.7 C++ Standard Library

The C++ programming language defines its own standard library. The C++ Standard Library specifies a set of classes and functions that facilitate common programming tasks.

Often, the C++ Standard Library is referred to as STL. This abbreviation dates from pre-standard times and stands for Standard Template Library. In the latest revision of the C++ standard, STL became a subset of the C++ Standard Library. However, the term STL is still ubiquitous and is used as a synonym for the C++ Standard Library.

The International Organization for Standardization (ISO) is responsible for the C++ language standard and its library.

2.7.1 Contents of the C++ Standard Library

The C++ Standard Library may be divided into the following major parts:

  1. Containers, where common data structures are defined, such as vector, set, list, map etc.
  2. Iterators, which provide a uniform way to operate over standard containers.
  3. Algorithms, which implement common useful algorithms. Algorithms use iterators instead of working directly with containers. That's why the same implementation of an algorithm can be used with different standard containers.
  4. Allocators, which handle memory storage allocation/deallocation for elements in containers.
  5. Function Objects and Utilities, which are helpers to algorithms and containers.
  6. Streams, which provide a uniform object oriented way of input/output.
  7. C Runtime Library. Due to the backward compatibility of C++ with the C language, CRT is incorporated into the Standard C++ Library.
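
The way these parts cooperate can be seen in a short sketch. The function names below are invented for the illustration; everything else comes from the Standard C++ Library.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Algorithm + iterators: std::count sees only a pair of iterators,
// so the same code works for a vector, a list, or any other standard container.
template <typename Container>
std::size_t count_matches(const Container& items, int value)
{
    return static_cast<std::size_t>(std::count(items.begin(), items.end(), value));
}

// Container + algorithm + function object + stream working together.
std::string sorted_descending(std::vector<int> values)
{
    // std::greater<int> is a function object that customizes the sort order.
    std::sort(values.begin(), values.end(), std::greater<int>());

    std::ostringstream out;  // stream that writes into a string
    for (std::vector<int>::const_iterator it = values.begin(); it != values.end(); ++it) {
        if (it != values.begin()) {
            out << ' ';
        }
        out << *it;
    }
    return out.str();
}
```

Calling `count_matches` with a `std::vector<int>` and with a `std::list<int>` that hold the same elements gives the same answer, although the two containers store their elements very differently.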

2.8 Cross-platform Development

Sometimes a software program is required to run on several computer platforms. The developer may choose to develop as many separate code bases as there are target platforms. However, this approach is tedious and error prone. It is also an inefficient use of development resources, since the same functionality must be implemented and maintained over and over again.

The common approach is to develop a single code base for all platforms and restrict the usage of platform-dependent API functions and vendor-specific standard library extensions. It makes development harder; however, in the long run, all platforms benefit from new features and bug fixes.
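
One common way to restrict platform-dependent calls is to confine them to a single function. The sketch below is an assumption about how this could be done for the current directory: it uses `GetCurrentDirectoryA` on Windows and the POSIX `getcwd` function elsewhere. The name `current_directory` is hypothetical.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>   // GetCurrentDirectoryA, MAX_PATH
#else
#include <limits.h>    // PATH_MAX (assumed to be defined on the POSIX target)
#include <unistd.h>    // getcwd
#endif

// The only place in the program that knows which platform it runs on.
// Returns an empty string if the directory cannot be retrieved.
std::string current_directory()
{
#ifdef _WIN32
    char buffer[MAX_PATH] = { 0 };
    const DWORD length = GetCurrentDirectoryA(MAX_PATH, buffer);
    return (length > 0 && length < MAX_PATH) ? std::string(buffer, length) : std::string();
#else
    char buffer[PATH_MAX] = { 0 };
    return getcwd(buffer, sizeof(buffer)) != NULL ? std::string(buffer) : std::string();
#endif
}
```

The rest of the code base calls `current_directory()` and stays free of conditional compilation, so a fix or a new platform touches one function only.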

3. Code Reuse

There are two ways to incorporate the CRT and/or the C++ Library code into a program: static linking and dynamic linking. In the following discussion, I will use solely the CRT term to save typing; however, these concepts are relevant both to CRT and the C++ Standard Library.

3.1 Linking Statically

When the CRT/C++ Library is linked statically, then all its code is embedded into the resulting executable image. This technique has both advantages and disadvantages.

Advantages:

  1. Simple deployment. It is enough to copy a program to the destination computer to make it run. No need to worry about complicated scenarios of CRT/C++ Library deployment.
  2. No additional files. It can be very convenient for small utility applications to consist of just one executable file. Such self-contained applications can be easily downloaded and redistributed without the risk of breaking their integrity.

Disadvantages:

  1. Not serviceable. New versions of a library and fixes of old versions are invisible for statically linked programs.
  2. Domino effect of static linking. In the modern world, a program can rarely do everything by itself. Nowadays, software programs are complex and rely heavily on third-party components and libraries. Also, a software program itself is often divided into several loosely coupled modules. Linking statically to the CRT in one of them greatly reduces interoperability between modules and forces developers to fall back on the lowest common denominator, i.e., the C interface with explicit methods for the acquisition and release of resources. The following section discusses the issue in more detail.

3.1.1 CRT as a Black Box

The problem is that internal CRT objects cannot be shared with other CRT instances. Memory allocated in one instance of the CRT must be freed in the same instance, a file opened in one instance of the CRT must be operated on and closed by functions from the same instance, etc. This is because the CRT tracks the acquired resources internally. Any attempt to free a memory chunk or read from a file via a FILE* that came from another CRT instance will corrupt the internal CRT state and will most likely lead to a crash.

That's why linking to the CRT statically obligates the developer of a module to provide additional functions that release allocated resources, and obligates the user of the module to remember to call these functions in order to prevent resource leaks. No STL containers or C++ objects that use allocations internally can be shared across modules that link to the CRT statically. The following diagram illustrates the usage of a memory buffer allocated via a call to malloc.

Diagram #3: Using memory allocated by malloc from different modules

In the above diagram, Module 1 is linked to the CRT statically, while Modules 2 and 3 are linked to the CRT dynamically. Modules 2 and 3 can pass CRT-owned objects between them freely. For example, a memory chunk allocated with malloc in Module 3 can be freed in Module 2 with free. This is because both the malloc and free calls end up in the same instance of the CRT.

On the other hand, Module 1 cannot let other modules free its resources. Everything allocated in Module 1 must be freed by Module 1. This is because only Module 1 has access to its statically linked instance of the CRT. In the above sample, Module 2 must remember to call a function from Module 1 in order to properly release the acquired memory.
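
The interface that a statically linked module has to expose can be sketched as a pair of C functions. The names `Module1_CreateGreeting` and `Module1_FreeGreeting` are hypothetical, and in a real project they would be exported from the DLL; here they are ordinary functions, so the sketch stays self-contained.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical export of "Module 1": allocates with the module's own CRT instance.
extern "C" char* Module1_CreateGreeting(const char* name)
{
    const char prefix[] = "Hello, ";
    const std::size_t length = (sizeof(prefix) - 1) + std::strlen(name);
    char* text = static_cast<char*>(std::malloc(length + 1)); // +1 for the terminating zero
    if (text != NULL) {
        std::strcpy(text, prefix);
        std::strcat(text, name);
    }
    return text;
}

// Matching release function: frees with the same CRT instance that allocated.
// Callers in other modules must use this function instead of calling free() themselves.
extern "C" void Module1_FreeGreeting(char* text)
{
    std::free(text);
}
```

The caller's side of the contract is to pair every `Module1_CreateGreeting` call with a `Module1_FreeGreeting` call and never hand the pointer to its own `free`.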

3.2 Linking Dynamically

When the CRT/C++ Library is linked dynamically, only small import libraries are linked with the resulting executable image. Import libraries contain instructions for where to find the actual implementation of the CRT/C++ Library functions. On a program's start, the system loader reads these instructions and loads the appropriate DLLs into the process' address space.

Advantages:

  1. Improved Modularity. As described in previous sections, the overall modularity of a program can benefit from dynamic linking. A program can be divided into several modules while being able to pass relatively high-level objects between them.
  2. Faster start. CRT DLLs are preloaded by the system on start. Then, when a program needs to load a CRT module, no actual load occurs. It enables the system to save physical memory and reduce page swapping.

Disadvantages:

  1. Complicated deployment. CRT libraries must be redistributed and properly installed in order for a program to work. This requires writing an additional setup program and thinking through a deployment strategy.

4. Summary

The article described the relationships and dependencies between the Windows API, the C Runtime Library, and the Standard C++ Library. The Windows API is the lowest operational level for user mode programs. On top of the Windows API, there is the C Runtime Library, which encapsulates and hides the Operating System differences. The Standard C++ Library provides much more functionality and also includes the CRT as an integral part. Using only standardized functions and classes makes it possible to write cross-platform applications. Such applications only need to be rebuilt in order to run on a new platform; no code change is required.

Both the C Runtime Library and the Standard C++ Library can be linked to statically or dynamically, depending on the application's needs. Each method has its own advantages and drawbacks.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Alex Blekhman
Software Developer
Australia
More than ten years of C++ native development, and counting.
 
Comments and Discussions

 
General: My vote of 5 (member qinhaihong, 24 Jan '13 - 5:48)
Very impressive.
Question: Excellent article, but I respectfully disagree on static v. dynamic linking (member Steve Wolf, 2 May '12 - 6:46)
As a veteran of many software projects, over 30 years of Windows projects, I am anti-DLL. The promises of being fixable, and flexible, and low-memory use, are all well and good - but ultimately create myriad issues with incompatibility - some small bug that your software relied upon and now stops working because a DLL that you rely upon has been modified. DLL hell is a fine name for what DLLs bring to the mix.
 
Statically linking an entire project and all libraries means that what is tested stays tested. Yes, bugs aren't fixed automagically, but then bugs aren't created automatically for you either. And if all parts are statically linked, then they all share the same CRT as well - so no issues with memory management.
 
It also guards against rogue software - malware, viruses, etc., which try to charade as another DLL and get your software to load them where they can do harm. Plus, many bugs stem from folks not understanding the rules as to how to cleanly shut down a DLL - so they try to use threads that have been terminated on DLL unload = hang. And DLLs load all of their guts into RAM - so if they support 852 functions, then all 852 functions come along when you need to use two of them! Statically linked libraries only give you the exact number of functions you need - so your EXE is smaller, and total RAM consumed is less, and load times are *often better.*
 
DLLs were a great idea back in Windows 3.x - pre 32 bit MMC. Now they're a horrible legacy with folks repeating the same nonsense about their virtues when their virtues are so tainted and riddled with problems and baggage.
 
If you're building a Win32 executable, statically link the whole kit & kaboodle!
Steve Wolf

General: My vote of 5 (member pulkitg, 19 Feb '12 - 22:15)
Essential
General: My vote of 5 (member gndnet, 22 Dec '10 - 17:24)
Good job
Question: CRT resource sharing and compiler versions (member Member 2594301, 15 May '10 - 18:22)
Does linking dynamically to the CRT and sharing resources across modules work if the modules were generated with different compiler versions (e.g., VS 2008 and VS 2010)? Will the modules end up using the same CRT dll, or different ones?
 
Thanks for a good article.
Answer: Re: CRT resource sharing and compiler versions (member Alex Blekhman, 15 May '10 - 18:46)
Thanks for the feedback,
 
Modules will use different CRT DLL's because they are built with different compilers. VS 2008 module will use VC++ 9.0 CRT DLL's and VS 2010 module will use VC++ 10.0 CRT DLL's. The process will end up with two versions of CRT DLL's loaded in its address space.
 
However, all modules that are built with the same compiler will reuse the same CRT DLL's. But, naturally you cannot pass any resources between modules built with different compiler because CRT versions will be different, too.
 
You can pass process-wide resources though. Like file HANDLE's or memory HGLOBAL's between modules.
 
Alex
Question: Re: CRT resource sharing and compiler versions (member nr4christ, 16 May '10 - 11:08)
Thanks for the quick reply!
 
If you don't mind, let me ask you one more question. Same scenario as above (one module built with VS 2008 and another with VS 2010), but this time memory in the heap is allocated with CoTaskMemAlloc() in one module and freed with CoTaskMemFree() in another module. Since those are used for COM, and also for C# interop, I imagine there should be no problem using different compiler versions with those. Is that correct?
Answer: Re: CRT resource sharing and compiler versions (member Alex Blekhman, 16 May '10 - 13:11)
Hi,

Yes, it is safe to use CoTaskMemAlloc/CoTaskMemFree across modules because these functions use the COM allocator internally. The COM allocator guarantees the validity of calls from different apartments, so you can call these functions even if you developed the modules in different programming languages, for instance, C++ and VB.
 
Alex
General: link error (member liaohaiwen, 8 Jan '10 - 3:10)
I am working on a project. UGS gives me an entry point, ITK_user_main. I added user32.lib to create a window UI, and I get the following link error:

itk_main.obj : error LNK2019: 无法解析的外部符号 _ITK_user_main,该符号在函数 _main 中被引用 (unresolved external symbol _ITK_user_main referenced in function _main)
 
the link file:
 
$LINK = "link";
 
$SYSLIBS = "wsock32.lib " .
"advapi32.lib " .
"msvcrt.lib " .
"oldnames.lib " .
"kernel32.lib " .
"comdlg32.lib " .
"comctl32.lib " .
"shell32.lib " .
"winmm.lib ";
 

 
$ENV{TC_LIBRARY}\\itk_main.obj ".
"$ENV{TC_LIBRARY}\\libsyss.lib ".
"$ENV{TC_LIBRARY}\\libpom.lib ".
"$ENV{TC_LIBRARY}\\libae.lib ".
"$ENV{TC_LIBRARY}\\libappr.lib ".
"$ENV{TC_LIBRARY}\\libarchive.lib ".
General: Re: link error (member Alex Blekhman, 8 Jan '10 - 4:55)
I don't know Chinese, but linker error LNK2019 means that the _ITK_user_main symbol is undefined. In order to use the ITK_user_main function, you need to link with its implementation. The UGS library should supply the appropriate .LIB files to link with, or the source code where the implementation of ITK_user_main can be found.
General: error LNK2005: _DllMain@12 already defined in dllmain.obj (member liaohaiwen, 14 Dec '09 - 21:10)
I created an empty project that exports a DLL; the entry point is DllMain. I want to use MFC DLL functionality, for example the CString class and AfxWinInit, so I included afx.h and afxwin.h in my project. I get a linker error with VC 6.0, but not when I follow the same steps with VC 9.0.
 

 
error LNK2005: _DllMain@12 already defined in dllmain.obj
 

thanks

 
haiwen liao
General: Re: error LNK2005: _DllMain@12 already defined in dllmain.obj (member Alex Blekhman, 14 Dec '09 - 22:31)
Well, using MFC in DLLs imposes some additional requirements on the DLL project. First of all, read this article about the kinds of DLLs: Kinds of DLLs.

DLLs that use MFC are called Extension DLLs. Here's the info about creating and using extension DLLs: Extension DLLs.

Also, this KB article may help: KB148652 - A LNK2005 error occurs when the CRT library and MFC libraries are linked in the wrong order in Visual C++.
General: memory wrong (member liaohaiwen, 2 Dec '09 - 3:28)
// buildstruct.cpp : Defines the entry point for the console application.
//
 
#include "stdafx.h"
#include "malloc.h"
 
static int print_info( char c , int num )
{
while( num-- )
{
char *p = ( char* )malloc( (num ) * sizeof( char ) );
for ( int i = 0 ; i < num ; i++ )
{
p[i] = 'a';
}
 
//p[num+1] = '\0';
printf("%s" , p );
 
//printf("%d\n" , 1);
 
free( p );
p = NULL;
}
return 0 ;
}
int main(int argc, char* argv[])
{
//printf("Hello World!\n");
print_info( 'a' , 11 );
return 0;
}
 

What went wrong here?
General: Re: memory wrong (member Alex Blekhman, 2 Dec '09 - 4:14)
Your `p' string is not zero-terminated. Strings in C/C++ must be zero-terminated.
 
Alex
General: Excellent article! (staff Gennady Tabachnik, 22 Sep '09 - 4:07)
Great job buddy! It's great to see your article here!

Gena Tabachnik :)
General: Re: Excellent article! (member Alex Blekhman, 22 Sep '09 - 5:26)
Thanks. :)
General: Poor copy of MSDN (member kilt, 30 Jan '09 - 13:21)
What's the goal in copying and assembling parts of the MSDN documentation?
General: CRT implemented partially in PSDK (member hatcat, 26 Aug '08 - 1:05)
Nice article. It's worth pointing out that, if you are deploying solely on Windows, the SDK offers some implementations of the CRT which are much speedier and rely on platform knowledge. For example, CopyMemory is a drop-in replacement (bar the return type) for memcpy. This article could be improved with a list of the CRT replacements implemented in the SDK (there's quite a few if I recall correctly). Indeed, although it is beyond the scope of this article, you can write C programs without actually linking to any CRT Library code. Doing this with C++ is rather harder (exception handling and static init are problematic).
General: Re: CRT implemented partially in PSDK (member Alex Blekhman, 26 Aug '08 - 1:23)
hatcat wrote:
the SDK offers some implementations of the CRT which are much speedier and rely on platform knowledge

 
Actually, for several PSDK versions already, CopyMemory et al. have been mere macros that map to memcpy. Just press F12 in the VS IDE when the caret is on the CopyMemory token.

Regarding CRT-less programs: yes, it is possible, but now it is harder to make such a program/DLL than a couple of years ago. Starting from VC++ 2005, the compiler injects CRT calls into the generated code (like memset or _chkstk, for example) regardless of whether you link with the CRT or not. So you will be required to implement these routines if you don't link with the CRT.
General: Good article. Ask for help of a question. (member _Chen_Jun_, 19 Aug '08 - 16:52)
As we know, Microsoft provides the source code of the CRT in the Visual Studio suite and in some versions of the Platform SDK. But there have been so many versions of MSVCRTD.DLL over time; how can we easily get the source code of a specific CRT version so that we can debug into it?
General: Re: Good article. Ask for help of a question. (member Alex Blekhman, 20 Aug '08 - 1:07)
We cannot get the CRT source code easily. The only source code that is available is the one shipped with Visual Studio. Usually, updates for VS update the CRT/MFC source code as well. However, if a process uses a different version of a CRT DLL, then you are out of luck. All you can do is go to the VS debugger settings and uncheck the option that requires strict correspondence between a binary image and its source code.
General: Very nice! (member wtwhite, 19 Aug '08 - 5:32)
Very nice high-level summary -- definitely would have saved me a lot of time back when I was figuring this stuff out myself...
 
One minor correction: I think your description of _MBCS and _UNICODE is a little unclear. To specify ANSI "mode" (i.e. to cause all the CRT _tcs...() functions to consider characters to be exactly one byte long), just leave _UNICODE undefined; defining the _MBCS symbol actually turns on a third mode of operation in which characters can be either 1 or 2 bytes long, depending on locale settings.
 
WTJW
General: Re: Very nice! (member Alex Blekhman, 19 Aug '08 - 6:15)
Yes, you're correct. _MBCS will cause different routine mappings for some functions. However, the code above will remain exactly the same both for _MBCS and plain ANSI build. I meant to keep it simple so I didn't want to delve into `char' vs `multibyte char sequence' discussion. Anyway, in order to prevent any misunderstanding I'll fix the paragraph very soon. Thanks for pointing this out.
GeneralTCHAR'smemberMAXS72U18 Aug '08 - 21:27 
Great article Alex,
one more thing about chapter 2.6.2 PSDK: TCHAR's:
I think the declared variable in this sample is not sized correctly; it should be as below:
 
// Generic code
//
LPCTSTR psz = TEXT("Hello World!");
TCHAR szDir[sizeof(TCHAR)*MAX_PATH] = { 0 };
GetCurrentDirectory(sizeof(TCHAR)*MAX_PATH, szDir);

 
With the sizeof(TCHAR) added you will get '1' (1 byte for 1 char) if UNICODE is undefined, otherwise you will get '2' (2 bytes for 1 wchar_t) if UNICODE is defined.
In this way you'll get the correct length of the szDir var.
 
Thank you
 
"Take time to think, it is the source of power"

GeneralRe: TCHAR'smemberAlex Blekhman18 Aug '08 - 22:08 
No, the code is correct. The `nBufferLength' parameter of `GetCurrentDirectory' function is the length of the buffer for the current directory string in TCHARs, not in bytes. No need to multiply it by sizeof(TCHAR).
GeneralRe: TCHAR'smemberMAXS72U18 Aug '08 - 22:42 
ok, maybe the sizeof(TCHAR) on 'GetCurrentDirectory' call is wrong, but I think that the var declaration should be modified because the resulting buffers (with UNICODE or not) have different sizes.
 
Thank you
 
"Take time to think, it is the source of power"

GeneralRe: TCHAR'smemberAlex Blekhman18 Aug '08 - 22:46 
No, the buffer is perfectly OK. It is because `GetCurrentDirectory' expects the buffer length in characters, not in bytes. So, the number passed to the function correctly reflects the size of the buffer in characters (be it `char' or `wchar_t' buffer).
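The difference between a length in characters and a size in bytes can be shown with a small portable sketch. `sketch_get_directory` is a hypothetical stand-in for an API that, like `GetCurrentDirectory`, takes the buffer length in characters; it is not the Windows function, and `kSketchMaxPath` simply reuses the value of `MAX_PATH` (260).

```cpp
#include <cstddef>

const std::size_t kSketchMaxPath = 260; // same value as MAX_PATH in the Windows headers

// Number of elements (characters) in an array, independent of character width.
template <typename CharT, std::size_t N>
constexpr std::size_t length_in_chars(const CharT (&)[N])
{
    return N;
}

// Hypothetical stand-in for an API that takes the buffer length in characters.
// Copies `path` plus its terminator into `buffer` and returns the number of
// characters copied (terminator excluded), or 0 if the buffer is too small.
template <typename CharT>
std::size_t sketch_get_directory(const CharT* path, std::size_t lengthInChars, CharT* buffer)
{
    std::size_t n = 0;
    while (path[n] != CharT()) ++n;          // length of the source string
    if (n + 1 > lengthInChars) return 0;     // no room for text plus terminator
    for (std::size_t i = 0; i <= n; ++i) buffer[i] = path[i];
    return n;
}
```

A `wchar_t` buffer declared with `kSketchMaxPath` elements occupies more bytes than a `char` buffer of the same declaration, but both hold the same number of characters, and that number is what the function expects.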
GeneralRe: TCHAR'smemberMAXS72U18 Aug '08 - 22:49 
ok, thank you for the correction, sorry.
 
"Take time to think, it is the source of power"

GeneralRe: TCHAR'smemberAlex Blekhman18 Aug '08 - 23:55 
You are welcome. :)
GeneralCRT resource sharing a bad ideamemberdgendreau15 Aug '08 - 6:40 
Great intro to the Windows platform. I do have one minor nit to pick, however.
 
In section 3.1.1 "CRT As A Black Box", you advocate linking to the CRT dynamically because it allows (for example) module 3 to allocate and return a resource and module 2 to free it. While there are many reasons to decide between static and dynamic linking, this is not a good example.
 
In my experience, sharing resource ownership like this is a bad idea in practice because it leads to unnecessary coupling between your modules. Whenever I write a module that allocates a resource of some sort, that same module should also expose a way to free it.
 
For example, what would happen if you needed to compile your module 3 in debug mode but leave module 2 in release mode? Module 2 would crash when attempting to free the object returned by module 3 (allocated by the debug CRT). If your suggestion is carried too far, all modules of an entire project would have to be recompiled in debug mode in order to debug a single module.
 
Your example of module 1 is the best practice implementation in my opinion.
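The practice described in this comment, where the module that allocates a resource also exposes a way to free it, might look like the following sketch. The function names are hypothetical, and the `__declspec(dllexport)` that a real DLL would need is left out to keep the example portable.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical exports of "module 3".
// Allocates and returns a greeting; the caller must release it with
// mod3_free_greeting, never with its own free().
extern "C" char* mod3_create_greeting(const char* name)
{
    const char prefix[] = "Hello, ";
    std::size_t len = sizeof(prefix) - 1 + std::strlen(name);
    char* text = static_cast<char*>(std::malloc(len + 1));
    if (text == nullptr) return nullptr;
    std::strcpy(text, prefix);
    std::strcat(text, name);
    return text;
}

// Releases memory with the same CRT instance that performed the malloc,
// no matter which CRT the calling module is linked against.
extern "C" void mod3_free_greeting(char* text)
{
    std::free(text);
}
```

Because allocation and release both happen inside the same module, it no longer matters whether the caller is a debug or a release build.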
GeneralRe: CRT resource sharing a bad ideamemberAlex Blekhman15 Aug '08 - 7:28 
Of course, there is no single right answer to that. It depends on the requirements of a project and on the development practices adopted by the dev department.
 
Also, module interface makes a great difference. Consider:
 
1. A module exposes C functions only, providing both acquiring and freeing functions.
 
In that case you can link as you wish, because a user of the module is not affected in any way whether you link statically or dynamically.
 
2. A module exposes C++ classes.
 
In that case you lose your freedom. Even though you still may call it a DLL module, in practice you must perceive it as a static library. Now, linking a module that exposes C++ classes with the CRT statically won't gain you much. The module and its user are already tightly coupled. They must use the same version of the compiler and the same DLL version of the CRT. Otherwise there will be big trouble at run time.
 
However, a developer still may benefit by breaking big application into several DLL modules that expose C++ classes. Development process is more convenient, since you need to rebuild only relevant DLL module instead of whole program. Also, overall modularity of a project is improved.
 
3. A module exposes only pure virtual C++ classes and provides a factory function to create instances. Instances are destroyed with explicit function call.
 
This COM-like approach may work if you restrict methods of exposed classes to built-in types and POD's. It is not much different from the first approach. In the same manner, a user does not care how you link since he/she is not affected by this.
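The third, COM-like option can be sketched as follows. All names (`ICounter`, `CreateCounter`, `DestroyCounter`) are hypothetical and only illustrate the shape of such an interface: pure virtual methods restricted to built-in types, a factory function, and an explicit destroy function.

```cpp
// Interface visible to users of the module: pure virtual, built-in types only.
struct ICounter
{
    virtual int Increment() = 0;
    virtual int Value() const = 0;
protected:
    ~ICounter() {}  // not public: callers cannot `delete` through the interface
};

namespace {
// Implementation, private to the module. `final` documents that nothing
// derives from it, so deleting through CounterImpl* is safe.
class CounterImpl final : public ICounter
{
public:
    CounterImpl() : value_(0) {}
    int Increment() override { return ++value_; }
    int Value() const override { return value_; }
private:
    int value_;
};
}

// Factory: the object is allocated by the module's own CRT.
extern "C" ICounter* CreateCounter()
{
    return new CounterImpl();
}

// Explicit destruction: the object is released by the same CRT that allocated it.
extern "C" void DestroyCounter(ICounter* counter)
{
    delete static_cast<CounterImpl*>(counter);
}
```

A user of the module only ever holds an `ICounter*`, so the way the module links with the CRT stays invisible to the caller.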
GeneralGreat!mvpHans Dietrich14 Aug '08 - 20:10 
Very nice, tidy explanation.
 
A few minor points: instead of just calling it the "Operating System", you may as well just call it what Microsoft does: Win32.
 
Also, in section 2.6.2, you say "In order to avoid multiple PSDK's for different Windows families...". While a true statement, within the system DLLs there are two versions of each function, as you note elsewhere. Within each function, use of TCHAR is irrelevant - it's always either ANSI or UNICODE. So the PSDK is a "combined" PSDK, with macros choosing the function.
 
Very informative. Are you going to do an article on the .Net architecture?
 

GeneralRe: Great!memberAlex Blekhman14 Aug '08 - 20:45 
Thanks for the feedback. Actually, I have never thought of including .Net in this overview. Maybe it deserves a section after all.
 
Yes, there is only one PSDK of course. What I want to say is that instead of working with two separate sets of functions for each Windows family, one can use generic macro names.
GeneralGood stuffmember emilio_grv 16 Jan '08 - 21:38 
Thanks for this article; for me, it rates a 5.
 
But there's one point that should require a better qualification.
Your assertion
"Both C Runtime Library and Standard C++ Library can be linked to statically or dynamically" may sound inaccurate, since most of the C++ Library is a set of templates (it is, in fact, a set of headers with no binaries that implement inline C++ templates, which themselves call the old flat CRT APIs).
 
Templates, in particular, due to their "lazy instantiation" nature cannot be "static or dynamically linked".
They aren't linkable at all, and are expanded in all the translation units that use them, unless someone provides libraries containing explicit instantiations for particular parameter values.
 
In other words, it is a subject that should be more expanded.
 

2 bugs found.
> recompile ...
65534 bugs found.
D'Oh!


GeneralRe: Good stuffmemberAlex Blekhman17 Jan '08 - 2:21 
You are correct in general about template instantiation. However, the MS C++ Standard Library has its own DLL with some of the containers and streams instantiated for the `char' and `wchar_t' types. When a project is built with the `_STATIC_CPPLIB' definition specified, the standard C++ templates are instantiated from the header files instead of being imported from MSVCPxx.dll.
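The mechanism behind this can be sketched with standard C++11 explicit instantiation. `SketchBuffer` is a hypothetical template used only for illustration; the real library headers use their own macros together with `__declspec(dllimport)`, which are not reproduced here. Both the "header" part and the "library source" part are shown in one listing so that it compiles on its own.

```cpp
// --- header part: what every user of the library sees ---
template <typename CharT>
class SketchBuffer
{
public:
    SketchBuffer() : size_(0) {}
    void Append(CharT c) { if (size_ < kCapacity) data_[size_++] = c; }
    unsigned Size() const { return size_; }
private:
    static const unsigned kCapacity = 16;
    CharT data_[kCapacity];
    unsigned size_;
};

// Explicit instantiation declarations: user translation units are told that
// these two specializations already exist elsewhere and need not be generated.
extern template class SketchBuffer<char>;
extern template class SketchBuffer<wchar_t>;

// --- one source file inside the library ---
// Explicit instantiation definitions: the code for `char` and `wchar_t`
// is generated exactly once, here, and can be exported from a binary.
template class SketchBuffer<char>;
template class SketchBuffer<wchar_t>;
```

Any other parameter type, for example `SketchBuffer<int>`, is still instantiated from the header in the usual lazy way.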
Questionnew/delete instead of malloc/freememberAndromeda Shun14 Jan '08 - 20:52 
Hi!
First of all, I liked your article!
 
When you talked about memory allocation using dynamically linked CRT instances I wondered if the same applies to new/delete. If I create an object in a module linked with an older version of the CRT and delete the object in a module linked with a newer version, and I have both versions of the CRT on my machine, what happens?
 
For example let's assume I have CRT v1.0 and v2.0 that both contain a class named theFoo. Now I have two modules m1 and m2, m1 using CRT v1.0, m2 using CRT v2.0. When I create an object of theFoo in m1, the constructor of theFoo defined in CRT v1.0 is called. Now when I delete the object in m2 what happens? I assume, that the destructor of theFoo defined in CRT v2.0 is called, which is not good.
 
So what we do here when working with CRT objects across module boundaries is that we wrap the CRT class. So we have a wrapper fooWrapper defined in m1. When m2 deletes the object the destructor of fooWrapper is called which calls the destructor of theFoo defined in CRT v1.0, since m1 is linked with v1.0. This is more work to do, but afaik it's safer when the CRT implementation changes.
 
Best regards,
Torsten
GeneralRe: new/delete instead of malloc/freememberAlex Blekhman17 Jan '08 - 2:23 
You're correct about passing objects across modules boundaries. Using pure abstract interfaces is a common approach for C++ objects to communicate across modules. This is how COM interfaces work, too.
GeneralRe: new/delete instead of malloc/freememberliaohaiwen14 Aug '08 - 23:01 
Can you give an example project? Then readers will be quite clear about your discussion.
 
Thanks
haiwen liao
GeneralRe: new/delete instead of malloc/freememberAlex Blekhman14 Aug '08 - 23:22 
I think this topic deserves its own article. Putting too much information in one single article will confuse many readers. Thanks for the idea, though. I think I'll write about it soon.
GeneralExcellentmemberMatthew Faithfull14 Jan '08 - 4:03 
A really good and concise article, thanks Alex. I've been trying to write basically the same article myself for a while now and after several attempts it was still only half completed, now you've done it for me and with nice diagrams as well :-D I hope you don't mind if I reference it heavily if I ever get around to writing the next bit I've got planned. I think the C Library is very little understood by many developers (I used to be one of them) and there is a lot to be gained, especially in terms of writing cross platform code, by studying it.
 
Nothing is exactly what it seems but everything with seems can be unpicked.

GeneralRe: ExcellentmemberAlex Blekhman14 Jan '08 - 4:29 
Thanks for your response. :) Actually, I plan to add a couple of topics in the future as other posters suggested (like manifests, for example).
GeneralRe: ExcellentmemberMatthew Faithfull14 Jan '08 - 5:21 
Good, because I know next to nothing about Manifests. I've been taking the Microsoft C Runtime and the GNU C Library glibc apart for a year or so now and looking at what they do and how. I'd like of course to write my own in C++, compatible with both and completely cross platform. I can't see it happening any time soon though. :)
 
Nothing is exactly what it seems but everything with seems can be unpicked.

GeneralGood Article!memberNeil Li12 Jan '08 - 2:08 
;) ;) ;) ;) ;)
GeneralRe: Good Article!memberAlex Blekhman12 Jan '08 - 2:27 
Thanks!
QuestionIncomplete sectionmemberValentin Ivanov11 Jan '08 - 10:59 
It looks like section 2.6 is not complete.
 
It ends with
"International Organization for Standardization ("
 

Regards,
Valentin.
GeneralRe: Incomplete sectionmemberAlex Blekhman11 Jan '08 - 11:32 
Thanks for the comment. It seems that there is some bug in CodeProject's formatting script. It often mangles whole sections if there is a URL in parentheses.
GeneralGoodmemberzengkun10010 Jan '08 - 3:05 
Good article! :) Especially the CRT part. Many programmers don't know that different CRT versions can live in one process.
 
A Chinese VC++ programmer

GeneralRe: GoodmemberAlex Blekhman10 Jan '08 - 20:42 
Thanks!
GeneralRe: GoodmemberMohammed Hossny18 Aug '08 - 15:11 
I agreeeeeee. This article is just beautiful!!! I give it a 10 ;).
 
Yours, M. Hossny
A programmer from Egypt
 

QuestionManifests?memberIvan Kolev7 Jan '08 - 23:08 
First, a typo:
"Names if CRT functions are in lower case": if -> in
 
Second, the article is a good overview of the topic, but it could become even more useful if you cover the new "manifests" which Visual Studio 2005 introduced for Windows.
And maybe you should mention that modules which link dynamically to the CRT will experience the same problems as statically linked modules, if they link to *different versions* of the CRT (e.g. VC6, VC7, VC71, VC8+ with manifests, etc.). Manifests are supposed to fix this for VC users by forcing all manifest-enabled modules within the process to link to the newest MS CRT version installed on the machine, however, there could be older non-manifest modules which link to older CRT's... And this unfortunately happens quite often. My most recent example is the "Oracle C++ Call Interface" library which is built with VC8 and uses manifests, but depends on the C-based "Oracle Call Interface" library which is built with VC71 and uses MSVCRT 7.1, which causes strange defects occasionally...
 
It's a great idea to write such an article, it is much needed indeed. It would be nice to update it with new details as they arise.
 
Regards,
Ivan

Last Updated 22 Aug 2008
Article Copyright 2008 by Alex Blekhman