
What Every Computer Programmer Should Know About Windows API, CRT and Standard C++ Library

By Alex Blekhman

The article explains relationships and dependencies between Windows API, CRT and Standard C++ Library.

Posted: 3 Jan 2008
Updated: 22 Aug 2008

1. The Purpose

The purpose of this article is to clarify the essential points about the Windows API, the C Runtime Library (CRT) and the Standard C++ Library (STL). Even experienced developers are often confused about how these parts relate to each other and hold on to misconceptions about them. If you have ever wondered what is implemented on top of what and never had the time to figure it out, keep reading.

2. Basics

The following diagram represents the relationship between WinAPI, CRT and STL.

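Diagram #1: The relationship between Windows API, CRT and C++ Standard Library

  +--------------------------+
  | Application              |  User Mode
  +--------------------------+
  | Standard C++ Library     |
  +--------------------------+
  | C Runtime Library        |
  +--------------------------+
  | Operating System         |  Kernel Mode
  +--------------------------+
  | Hardware                 |
  +--------------------------+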

Adjacent blocks can communicate with each other. What does this mean? Let's go from the bottom to the top.

2.2 Hardware

Each hardware part exposes its own set of commands that enables the operating system to control and communicate with it. The number and complexity of commands vary from part to part. Different vendors of the same kind of part often provide additional commands beyond the requirements of a common standard. Communicating with countless hardware devices, each with its own variety of commands, would be an enormous toil for software writers if they had to access the devices directly. Here the operating system comes to the rescue.

2.3 Operating System

The purpose of the OS is to encapsulate all the intricacies of the underlying hardware and to provide a unified interface to the computer's parts. No application can access hardware directly; only the OS can. The part of the OS that accesses hardware is said to run in kernel mode.

Older OSes, such as MS-DOS, allowed programs to access hardware resources directly. Although this enabled software writers to make certain performance gains, in the long run it often made software very brittle and incompatible with newer hardware.

2.4 Application Programming Interface

The OS exposes the underlying machine resources by means of an Application Programming Interface (API). An API is a uniform set of functions that enables software developers to abstract from hardware peculiarities and focus on their own goals. An application cannot bypass the OS and access hardware resources directly. It is commonly said that applications run in user mode. MS Windows provides its API as a set of C functions. The C language was chosen as the lowest common denominator for software development on the Windows platform.

2.4.1 Platform Software Development Kit

Microsoft distributes the Platform Software Development Kit (Platform SDK or PSDK) for free. It enables software developers to write Windows programs. The PSDK contains:

  1. Header files with API function declarations
  2. Import library (.lib) files to link with (they redirect calls to API functions to the relevant DLLs)
  3. Documentation
  4. Various binary helper tools

For example, to open or create a file one calls the CreateFile function, which is declared in the "WinBase.h" header file and requires linking with the "Kernel32.lib" library.

Names of Windows API functions follow the upper camel case (Pascal case) naming convention, for example CreateFile, and are usually easy to recognize by it. Names of macros and constants are conventionally in uppercase. The documentation page of each function has a "Requirements" section that specifies the necessary headers, import libraries and supported OS versions.

A Windows application can call any API function, provided that the application follows the function's signature and links with the appropriate import library (or gets the function's address directly from the implementing DLL with a GetProcAddress call).

2.5 C Runtime Library

On top of the OS API functions software vendors implement the C Runtime Library (CRT). The CRT is a standardized set of header files and C functions that implement common tasks, such as string operations, some math functions, basic input/output etc. Usually the vendor that makes a C compiler also provides the CRT implementation. The International Organization for Standardization is responsible for the C language standard and its runtime library.

2.5.1 Standards and Extensions

Theoretically, by using only standard C functions a developer can ensure that the same code can be built and run on any platform where a decent C compiler and CRT implementation exist. In practice, however, software vendors include many useful extensions to the standard library, which make developers' lives easier but at the price of portability.

Names of CRT functions are in lowercase. Names of macros and constants are in uppercase. Names of extensions begin with an underscore character, for example the _mkdir function. The documentation page of each function has a "Requirements" section that specifies its header.
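
The portability cost of extensions can be illustrated with a small sketch. The helper name compare_no_case is hypothetical and not part of any library; it hides the Microsoft CRT extension _stricmp and its POSIX counterpart strcasecmp behind one function, so the rest of the code does not depend on either vendor-specific name.

```cpp
#include <cstring>        // on the Microsoft CRT, <string.h> also declares _stricmp
#ifndef _WIN32
#include <strings.h>      // strcasecmp (POSIX, not part of standard C)
#endif

// Hypothetical helper: case-insensitive comparison of two C strings.
// Returns 0 when the strings are equal ignoring case.
int compare_no_case(const char* left, const char* right) {
#ifdef _WIN32
    return _stricmp(left, right);    // vendor extension, note the leading underscore
#else
    return strcasecmp(left, right);  // the equivalent function on POSIX systems
#endif
}
```

Code that calls compare_no_case builds on both kinds of platforms, while code that calls _stricmp directly builds only where that extension exists.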

2.6 Unicode Awareness

2.6.1 Platform SDK Is Already Unicode Aware

Actually, the Win32 API names mentioned above are not the real names. They are mere macros defined in the PSDK header files. So, when the PSDK documentation mentions a function, for example CreateFile, a developer should be aware that CreateFile is a macro. The true names of the CreateFile function are CreateFileA and CreateFileW. Yes, there are two versions, rather than one, of many Win32 API functions. The version that ends with 'A' accepts ANSI character strings, i.e. strings of regular char's. The other version ends with 'W' (the so-called "wide" version) and accepts Unicode character strings, i.e. strings of wchar_t's. Both versions are implemented within the kernel32.dll module. The CreateFile macro expands into the CreateFileW name if the UNICODE symbol is defined for a project and into CreateFileA otherwise.
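
One way to see the true names is to ask the DLL itself. The following is a minimal sketch, not production code: the helper name kernel32_exports is hypothetical, while GetModuleHandleW and GetProcAddress are the real Win32 calls. On a non-Windows build the helper simply reports false so that the snippet still compiles.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: reports whether kernel32.dll exports a function
// with exactly this name. Always returns false when built off Windows.
bool kernel32_exports(const char* name) {
#ifdef _WIN32
    // kernel32.dll is already mapped into every Win32 process,
    // so a module handle can be obtained without loading anything.
    HMODULE kernel32 = GetModuleHandleW(L"kernel32.dll");
    return kernel32 != nullptr && GetProcAddress(kernel32, name) != nullptr;
#else
    (void)name;  // unused on other platforms
    return false;
#endif
}
```

On Windows, kernel32_exports("CreateFileW") and kernel32_exports("CreateFileA") should succeed, whereas the lookup for plain "CreateFile" is expected to fail, because that name exists only as a macro in the headers.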

There are three families of Windows OS: MS-DOS/9x-based, Windows NT and Windows CE.
  1. The MS-DOS/9x-based family, which includes Windows 1.0-3.11, 95, 98 and Windows ME, is based on the MS-DOS OS. The earlier versions of Windows, 1.0-2.0, are true 16-bit OSes. The newer versions, 3.0, 95, 98 and ME, are so-called hybrid 16/32-bit OSes. They are 16-bit at the low level, but capable of running 32-bit programs with certain limitations. One of these limitations is that only the ANSI versions of Win32 API functions exist on this platform. Currently, the MS-DOS/9x-based family is extinct and unsupported by Microsoft.
  2. The Windows NT family started with Windows NT 3.1 in the early 90's and includes Windows NT 4, Windows 2000, Windows XP, Windows Vista and the Server flavors of these OSes. The Windows NT family is a true 32-bit OS. It supports both the ANSI and the Unicode versions of the Win32 API. The Windows NT family operates with Unicode strings internally. The ANSI version of a Win32 API function is a mere wrapper around the real worker, the Unicode version of the function.
  3. The Windows CE family is intended for mobile and embedded devices. It is a true 32-bit OS. Windows CE supports only the Unicode version of the Win32 API.
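
The combination of a name-selecting macro and a narrow wrapper around a wide worker can be sketched in portable C++. Everything here is hypothetical and only mimics the pattern: CountCharsW, CountCharsA, CountChars and the MY_UNICODE symbol are illustrative names, and the standard mbstowcs function stands in for the conversion that the real 'A' functions perform according to the active code page.

```cpp
#include <cstddef>
#include <cstdlib>   // std::mbstowcs
#include <cstring>   // std::strlen
#include <cwchar>    // std::wcslen
#include <vector>

// Hypothetical "real worker": operates on wide strings only.
std::size_t CountCharsW(const wchar_t* text) {
    return std::wcslen(text);
}

// Hypothetical narrow wrapper: converts to a temporary wide buffer
// and forwards the call to the worker.
std::size_t CountCharsA(const char* text) {
    // A wide string never needs more elements than the narrow one has bytes.
    std::vector<wchar_t> wide(std::strlen(text) + 1);
    std::size_t converted = std::mbstowcs(wide.data(), text, wide.size());
    if (converted == static_cast<std::size_t>(-1)) {
        return 0;  // the input was not valid in the current locale
    }
    return CountCharsW(wide.data());
}

// The documented name is only a macro that picks one of the two.
#ifdef MY_UNICODE
#define CountChars CountCharsW
#else
#define CountChars CountCharsA
#endif
```

Without MY_UNICODE defined, a call such as CountChars("Hello") compiles into a call to CountCharsA, which pays for an extra conversion before reaching the worker.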

2.6.2 PSDK Solution: TCHAR's

In order to avoid multiple PSDKs for different Windows families, Microsoft implemented generic text characters, or TCHAR's. TCHAR and the other relevant names are defined in the WinNT.h header file. The main idea is that a developer never uses the char or wchar_t types explicitly, but uses TCHAR instead. TCHAR resolves to the appropriate character type depending on whether the UNICODE symbol is defined for a build. In the same manner, instead of calling the 'A' or 'W' version of a Win32 API function, a developer calls the generic macro version, which adapts to the actual character type at compile time.

// Generic code
//
LPCTSTR psz = TEXT("Hello World!");
TCHAR szDir[MAX_PATH] = { 0 };
GetCurrentDirectory(MAX_PATH, szDir);

// What actually happens if UNICODE symbol is NOT defined for a build
//
const char* psz = "Hello World!";
char szDir[MAX_PATH] = { 0 };
GetCurrentDirectoryA(MAX_PATH, szDir);

// GetCurrentDirectoryA is a wrapper. It does the following:
// 1. Allocates temporary wchar_t buffer of given size.
// 2. Calls real worker: GetCurrentDirectoryW.
// 3. Calls WideCharToMultiByte in order to convert wchar_t string into
//    char string according to the active code page for a calling thread.
//    If some character cannot be converted, then it will be replaced with the '?' symbol.

// What actually happens if UNICODE symbol is defined for a build
//
const wchar_t* psz = L"Hello World!";
wchar_t szDir[MAX_PATH] = { 0 };
GetCurrentDirectoryW(MAX_PATH, szDir); // direct call to real worker; no wrappers in the middle

Using TCHAR's allows a developer to maintain a single code base for both ANSI and Unicode builds. Nowadays, if you do not intend to target the old Windows 9x/Me platforms, you can safely forget about TCHAR's, use Unicode strings everywhere and make Unicode-only builds. As an added bonus, a Unicode application can forget about the code page hassle and use the same logic for all strings.

The easy way to remember PSDK string declarations is to say them out loud:
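
How such a generic character type and literal macro can be built is shown in the following portable sketch. It is an imitation of the idea, not the PSDK's actual header text: MyTChar, MY_TEXT, MY_UNICODE, TextLength and Greeting are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <string>   // std::char_traits

#ifdef MY_UNICODE
typedef wchar_t MyTChar;               // wide build
#define MY_TEXT(literal) L##literal    // pastes the L prefix onto the literal
#else
typedef char MyTChar;                  // narrow build
#define MY_TEXT(literal) literal       // leaves the literal unchanged
#endif

// Generic code: written once, compiled for either character type.
std::size_t TextLength(const MyTChar* text) {
    return std::char_traits<MyTChar>::length(text);
}

const MyTChar* Greeting() {
    return MY_TEXT("Hello World!");
}
```

The same source yields a char-based build by default and a wchar_t-based build when MY_UNICODE is defined on the compiler command line; TextLength(Greeting()) gives the same character count in both.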
            L P C T STR = const TCHAR* 
            ^ ^ ^ ^ ^ 
            | | | | | 
Long -------+ | | | | 
Pointer to ---+ | | | 
Constant -------+ | | 
TCHAR ------------+ | 
STRing -------------+ 
Sometimes the L ("Long") is omitted, since long and short pointers are obsolete on the Win32 platform. So a typedef can look like PTSTR = "pointer to TCHAR string", which is just TCHAR*.

Here are two screenshots of the same program. The first screenshot is taken when the program is built as ANSI. The second screenshot demonstrates the Unicode build of the program.

Naive ANSI program from the 20th century. All non-English characters are converted into illegible '?' symbols.

Modern Unicode program is aware of other languages.

2.6.3 CRT Solution: _TCHAR's

Following the Platform SDK logic, Microsoft introduced generic text mapping into its C runtime library. The CRT uses an additional header file, "tchar.h", to define the generic character macros. In order to comply with the requirements of the C language standard, all non-standard names start with an underscore. Also, the CRT uses the shorter _T() macro for string literals instead of the longer TEXT() macro, which is defined in "WinNT.h". The CRT authors decided to take the generic text notion even further, and as a result the CRT distinguishes three modes for text characters:

  • SBCS - Single Byte Character Set. The classic char is used for strings. One ASCII character fits within one char element. No symbol has to be defined for a project. This is the traditional C language approach that has survived from the 1970's to the present day. English characters are represented with the values 0x00 - 0x7F; non-English characters are represented with the values 0x80 - 0xFF. The actual meaning of the non-English characters is interpreted according to the currently active code page.
  • _MBCS - Multi-Byte Character Set. The classic char is used for strings. One multi-byte symbol may require one or two char elements. The _MBCS symbol has to be defined for a project. _MBCS is backward compatible with the SBCS mode and was the default choice for new projects in MS Visual C++ until version 8.0 (2005). _MBCS was commonly used for East Asian languages like Japanese, Korean and Chinese. Now _MBCS is mostly being ousted by Unicode. Using _MBCS was the only feasible option for handling East Asian languages on the Windows 9x/Me platform.
  • _UNICODE - Unicode Character Set. The wchar_t type is used for strings. A wchar_t element is 16 bits wide on the Windows platform and can hold 65,536 different values, so one element represents any character of the Basic Multilingual Plane; characters outside that range occupy two elements (a surrogate pair). This is the default mode for new projects starting from version 8.0 (2005) of MS Visual C++.

The CRT uses the definitions of the _MBCS and _UNICODE symbols to distinguish between multi-byte and Unicode builds.

Diagram #2: The Generic Text Mapping in CRT

Generic-text data type or name   SBCS (_UNICODE, _MBCS not defined)   _MBCS defined       _UNICODE defined
_TCHAR                           char                                 char                wchar_t
_T("Hello, World!")              "Hello, World!"                      "Hello, World!"     L"Hello, World!"
Function name prefix: _tcs       str, _str                            _mbs                wcs, _wcs
Examples: _tcscat, _tcsicmp      strcat, _stricmp                     _mbscat, _mbsicmp   wcscat, _wcsicmp


// Generic code; names are not standard, hence the leading underscore.
//
_TCHAR message[128] = _T("The time is: ");
_TCHAR* now = _tasctime(&tm);
_tcscat(message, now);
_putts(message);

// What happens if no symbol is defined at all (SBCS).
//
char message[128] = "The time is: ";
char* now = asctime(&tm);
strcat(message, now);
puts(message);

// What happens if _MBCS symbol is defined (Multi-byte Character Set);
// non-standard names are with the leading underscore.
//
char message[128] = "The time is: ";
char* now = asctime(&tm);
_mbscat((unsigned char*)message, (const unsigned char*)now); // _mbs functions take unsigned char strings
puts(message);

// What happens if _UNICODE symbol is defined (Unicode Character Set);
// non-standard names are with the leading underscore.
//
wchar_t message[128] = L"The time is: ";
wchar_t* now = _wasctime(&tm); // _wasctime returns a wide string
wcscat(message, now);
_putws(message);

2.7 C++ Standard Library

The C++ programming language defines its own standard library. The C++ Standard Library specifies a set of classes and functions that facilitate common programming tasks.

The C++ Standard Library is often referred to as the STL. This abbreviation dates back to pre-standard times and stands for Standard Template Library. In the C++ standard, the STL became a subset of the C++ Standard Library. However, the term STL is still ubiquitous and is used as a synonym for the C++ Standard Library.

The International Organization for Standardization is responsible for the C++ language standard and its library.

2.7.1 Contents of C++ Standard Library

The C++ Standard Library may be divided into the following major parts:

  1. Containers, where common data structures are defined, such as vector, set, list, map etc.
  2. Iterators, which provide a uniform way to operate over standard containers.
  3. Algorithms, which implement common useful algorithms. Algorithms use iterators instead of working directly with containers. That is why the same implementation of an algorithm can be used with different standard containers.
  4. Allocators, which handle memory allocation and deallocation for the elements in containers.
  5. Function Objects and Utilities, which are helpers for algorithms and containers.
  6. Streams, which provide a uniform object-oriented way of input/output.
  7. C Runtime Library. For backward compatibility of C++ with the C language, the CRT is incorporated into the Standard C++ Library.
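
The point about algorithms working through iterators can be shown with a short sketch. The helper names contains and count_even are hypothetical; std::find and std::count_if are the standard algorithms. The same two helpers work unchanged with a vector, a list or a set, because they touch the container only through its iterators.

```cpp
#include <algorithm>  // std::find, std::count_if
#include <cstddef>    // std::ptrdiff_t
#include <list>
#include <set>
#include <vector>

// Hypothetical helper: true if the container of ints holds the value.
template <typename Container>
bool contains(const Container& items, int value) {
    return std::find(items.begin(), items.end(), value) != items.end();
}

// Hypothetical helper: number of even elements in a container of ints.
template <typename Container>
std::ptrdiff_t count_even(const Container& items) {
    return std::count_if(items.begin(), items.end(),
                         [](int v) { return v % 2 == 0; });
}
```

For example, count_even can be called on a std::vector<int> and contains on a std::list<int> or std::set<int> without any container-specific code.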

2.8 Cross-platform Development

Sometimes a software program is required to run on several computer platforms. A developer may choose to develop as many separate code bases as there are target platforms. However, this approach is tedious and error prone. It also wastes development resources, since the same functionality must be implemented and maintained over and over again.

The common approach is to develop a single code base for all platforms and to restrict the usage of platform-dependent API functions and vendor-specific extensions of the standard libraries. This makes development harder; however, in the long run all platforms benefit from new features and bug fixes.

3. Code Reuse

There are two ways to incorporate CRT and/or C++ Library code into a program: 1) static linking and 2) dynamic linking. In the following discussion I will use only the term CRT to save typing; however, these concepts are relevant to both the CRT and the C++ Standard Library.

3.1 Linking Statically

When the CRT/C++ Library is linked statically, all its code is embedded into the resulting executable image. This technique has both advantages and disadvantages.

Advantages:

  1. Simple deployment. It is enough to copy a program to the destination computer to make it run. There is no need to worry about complicated scenarios of CRT/C++ Library deployment.
  2. No additional files. It can be very convenient for a small utility application to consist of just one executable file. Such a self-contained application can be easily downloaded and redistributed without the risk of breaking its integrity.

Disadvantages:

  1. Not serviceable. New versions of a library and fixes to an old version are invisible to statically linked programs.
  2. Domino effect of static linking. In the modern world a program can rarely do everything by itself. Nowadays software programs are complex and rely heavily on 3rd party components and libraries. Also, a software program itself is often divided into several loosely coupled modules. Linking statically to the CRT in one of them greatly reduces interoperability between modules and forces the developer to fall back on the lowest common denominator, i.e. a C interface with explicit functions for acquisition and release of resources. The following section discusses the issue in more detail.

3.1.1 CRT As A Black Box

The problem is that internal CRT objects cannot be shared with other CRT instances. Memory allocated in one instance of the CRT must be freed in the same instance, a file opened in one instance of the CRT must be operated on and closed by functions from the same instance, etc. This happens because the CRT tracks acquired resources internally. Any attempt to free a memory chunk or to read from a file via a FILE* that came from another CRT instance will corrupt the internal CRT state and most likely lead to a crash.

That is why linking the CRT statically obligates the developer of a module to provide additional functions that release allocated resources, and the user of the module to remember to call these functions in order to prevent resource leaks. No STL containers or C++ objects that use allocations internally can be shared across modules that link to the CRT statically. The following diagram illustrates the usage of a memory buffer allocated via a call to malloc.

Diagram #3: Using memory allocated by malloc from different modules

In the above diagram Module 1 is linked to the CRT statically, while Modules 2 and 3 are linked to the CRT dynamically. Modules 2 and 3 can pass CRT-owned objects between them freely. For example, a memory chunk allocated with malloc in Module 3 can be freed in Module 2 with free, because both the malloc and the free call end up in the same instance of the CRT.

On the other hand, Module 1 cannot let other modules free its resources. Everything allocated in Module 1 must be freed by Module 1, because only Module 1 has access to its statically linked instance of the CRT. In the above sample Module 2 must remember to call a function from Module 1 in order to properly release the acquired memory.
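
The pairing of an allocating function with a releasing function from the same module can be sketched as follows. All names are hypothetical: module1_create_message and module1_free_message stand for the functions Module 1 would export, and the two sides are shown in one file only to keep the sketch self-contained. The caller wraps the pointer in a std::unique_ptr with a custom deleter so that the release always goes back through Module 1.

```cpp
#include <cstddef>
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::strlen, std::memcpy
#include <memory>    // std::unique_ptr

// --- Would live inside Module 1 (statically linked CRT) ---

// Allocates a copy of the text with Module 1's own malloc.
extern "C" char* module1_create_message(const char* text) {
    std::size_t size = std::strlen(text) + 1;
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy != nullptr) {
        std::memcpy(copy, text, size);
    }
    return copy;
}

// Releases the buffer with the matching free of the same CRT instance.
extern "C" void module1_free_message(char* message) {
    std::free(message);
}

// --- Would live in the calling module ---

// Deleter that routes the release back to Module 1 instead of calling free here.
struct Module1Deleter {
    void operator()(char* message) const { module1_free_message(message); }
};
typedef std::unique_ptr<char, Module1Deleter> Module1Message;

std::size_t message_length_via_module1(const char* text) {
    Module1Message message(module1_create_message(text));
    return message ? std::strlen(message.get()) : 0;
}   // message is released here through module1_free_message
```

The smart pointer removes the need to remember the release call by hand, but the interface between the modules stays a plain C one, which is the lowest common denominator mentioned earlier.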

3.2 Linking Dynamically

When the CRT/C++ Library is linked dynamically, only small import libraries are linked into the resulting executable image. Import libraries contain instructions on where to find the actual implementation of the CRT/C++ Library functions. When the program starts, the system loader reads these instructions and loads the appropriate DLLs into the process' address space.

Advantages:

  1. Improved modularity. As described in the previous sections, the overall modularity of a program can benefit from dynamic linking. A program can be divided into several modules while being able to pass relatively high-level objects between them.
  2. Faster start. CRT DLLs are preloaded by the system on start. Then, when a program needs to load a CRT module, no actual load occurs. This enables the system to save physical memory and reduce page swapping.

Disadvantages:

  1. Complicated deployment. CRT libraries must be redistributed and properly installed in order for the program to work. This requires writing an additional setup program and thinking out a deployment strategy.

4. Summary

The article described the relationships and dependencies between the Windows API, the C Runtime Library and the Standard C++ Library. The Windows API is the lowest operational level for user mode programs. On top of the Windows API there is the C Runtime Library, which encapsulates and hides operating system differences. The Standard C++ Library provides much more functionality and also includes the CRT as an integral part. Using only standardized functions and classes makes it possible to write cross-platform applications. Such applications require only a rebuild in order to run on a new platform; no code change is required.

Both the C Runtime Library and the Standard C++ Library can be linked statically or dynamically, depending on the application's needs. Each method has its own advantages and drawbacks.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Alex Blekhman



Occupation: Software Developer
Location: Israel
