While preparing your application to support localization, probably the most tedious step consists of extracting the string literals from the source code and adding the code that loads these strings from the resource string table.
Adding the code to extract the strings is not difficult. But since it must be done hundreds of times in an application, we need to simplify the code as much as possible. That can save us hours of boring work. The resulting code should also be as readable as possible, which gives us a second good reason to use the simplest possible code.
This article features two small yet very helpful classes that do just that: load string literals from the String table with calling code that is as short and unobtrusive as possible.
It also covers a few string formatting considerations related to localization requirements.
Why should I use the string table?
While preparing for localization and translation of your app, all translatable items (such as strings) must be stored in the resources instead of in the source code, in order to keep translators away from your source code. Localization tools such as appTranslator can easily manipulate your application resources, but you don't want them to play with your source code!
Therefore, all the strings that need to be translated must be stored in the resources. And the String table is the appropriate location.
Note: You may want to store strings in the Message table, which is an alternative resource that looks much like the String table. Although originally designed for better localization support, it is less often used. In addition, Visual Studio does not provide a custom editor for the Message table, making it much less appealing. Daniel Lohmann wrote an excellent article about the Message table.
The LoadString() API: Not straightforward enough
Programmatically extracting a string from the String table involves calling the LoadString() API. Using the raw API is extremely tedious and boring. And since most serious apps contain hundreds or thousands of string literals, one absolutely needs a wrapper that's as easy and straightforward to use as possible.
Note: In addition, the LoadString() API suffers from a design limitation: it cannot give you the size of the string that you want to load, making it difficult to properly allocate a buffer to load the string.
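To make the limitation concrete, here is a minimal portable sketch of the usual workaround: retry with a larger buffer until the returned length shows that nothing was truncated. The function fake_load_string() is a hypothetical stand-in that only mimics how LoadString() fills a buffer; it is not the real API, and load_whole_string() is a name made up for this illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for LoadString(), so the sketch runs anywhere:
// copies at most size-1 characters plus a terminator and returns the
// number of characters copied (0 if the ID is unknown).
int fake_load_string(unsigned id, char* buffer, int size) {
    const char* text =
        (id == 1) ? "A fairly long title that does not fit a tiny buffer" : "";
    if (size <= 0) return 0;
    std::size_t n = std::strlen(text);
    if (n > static_cast<std::size_t>(size - 1)) n = static_cast<std::size_t>(size - 1);
    std::memcpy(buffer, text, n);
    buffer[n] = '\0';
    return static_cast<int>(n);
}

// Grow the buffer until the returned length is clearly below its capacity,
// which is the only sign that the string was not truncated.
std::string load_whole_string(unsigned id) {
    std::vector<char> buffer(8);
    for (;;) {
        int copied = fake_load_string(id, buffer.data(),
                                      static_cast<int>(buffer.size()));
        if (copied < static_cast<int>(buffer.size()) - 1) {
            return std::string(buffer.data(), static_cast<std::size_t>(copied));
        }
        buffer.resize(buffer.size() * 2);  // possibly truncated: retry
    }
}
```

Calling load_whole_string(1) goes through several retries before the whole text fits, which is exactly the kind of plumbing nobody wants to repeat for every string literal.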
CString is helpful in this regard: it provides its own LoadString() wrapper that somewhat simplifies the process. The sample code below loads a string from the String table and uses it to set the current window text:
CString strMyTitle;
strMyTitle.LoadString(IDS_MYTITLE);
SetWindowText( strMyTitle );
It's better than the raw API version. But it's still too long!
CMsg: A one-liner!
The CMsg class is a simple CString wrapper that takes a string ID in its constructor. The additional trick consists of using the temporary object concept of C++. (The object is created, used and destroyed within the scope of the function call.) That way, you don't even need to write an additional line of code to declare the object. After all, you only need the string in order to pass it to the function (here: SetWindowText()). You need it neither before nor after the function call!
SetWindowText( CMsg(IDS_MYTITLE) );
That's all you need to know to use CMsg! Since it's a CString, it has an LPCTSTR operator, which means it can be used wherever you usually use a string literal, a pointer to a raw string or a CString.
Of course, if the use of a string is not limited to a function call, you can create an explicit object for it:
CMsg csTitle(IDS_MYTITLE);
SetWindowText( csTitle );
Luis Barreira mentioned (in the comments section) that CString has a constructor that does the same thing as CMsg; it is simply not well documented.
Good catch, Luis! However, bear with me because the second class I introduce (below) is even more useful.
By the way, why such a cryptic name as CMsg?
Why not CStringEx? Or anything more meaningful? Well, again, the idea is that you are going to use this class hundreds or maybe thousands of times in your app. You probably want a short name, which means less typing. Of course, if you don't like it (or if it creates a name collision with some of your code), you can easily rename the class.
What if the string ID is incorrect?
In such a case, the CMsg object contains the string "???" and debug builds ASSERT.
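The idea behind CMsg can be sketched in portable C++. This is not the actual class from Msg.h: the names Msg, string_table() and title_length() are hypothetical, std::string stands in for CString, and a std::map stands in for the resource String table. The debug-build ASSERT is left out so that the fallback can be observed.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the resource String table.
const std::map<unsigned, std::string>& string_table() {
    static const std::map<unsigned, std::string> table = {
        {100, "My application"},
        {101, "Operation failed."},
    };
    return table;
}

// Msg plays the role of CMsg: a string that loads itself from an ID.
// It derives from std::string the way CMsg builds on CString, so it can be
// passed wherever a string is expected.
class Msg : public std::string {
public:
    explicit Msg(unsigned id) : std::string(lookup(id)) {}

private:
    static std::string lookup(unsigned id) {
        auto it = string_table().find(id);
        // Unknown ID: fall back to "???" (no ASSERT in this sketch).
        return it != string_table().end() ? it->second : std::string("???");
    }
};

// Any function taking a string can receive a temporary Msg directly.
std::size_t title_length(const std::string& title) { return title.size(); }
```

A call such as title_length(Msg(100)) creates the object, uses it and destroys it within the call, with no extra declaration line.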
CFMsg: A one-liner sprintf() wrapper
Very often, one needs to format a message before passing it to a function. And since we're speaking of translatable strings, the format string must be loaded from the String table as well. Let's re-use our example above and set a more elaborate window text:
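CString strMyTitle, strMyTitleFormat;
strMyTitleFormat.LoadString(IDS_MYTITLE);
strMyTitle.Format(strMyTitleFormat, LPCTSTR(strCity));
SetWindowText( strMyTitle );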
Pfeeew... four lines instead of one because of a variable parameter in the string :-(
CMsg has a sister that will help us with formatted messages: CFMsg.
CFMsg is very similar to CMsg. Its constructor can take a variable list of arguments to format the string (a la sprintf()):
SetWindowText( CFMsg(IDS_MYTITLE, LPCTSTR(strCity)) );
Now, if you think this is too condensed and would rather have the message built outside of the SetWindowText() statement, you can of course split the code:
CFMsg csTitle(IDS_MYTITLE, LPCTSTR(strCity));
SetWindowText( csTitle );
Actually, the paragraph title above is misleading. CFMsg is not a sprintf wrapper. It is rather a FormatMessage() wrapper. This slightly affects the way formatting specifiers are written, as we will see below.
Localization requires formatted string arguments to be numbered
A FormatMessage() format string looks much like a sprintf one but is more localization friendly: it adds argument numbering to the formatting string. This is important to ensure that translated strings are formatted correctly, because words are often ordered differently in different languages.
E.g.: In French, adjectives usually come after nouns, as opposed to English.
English: The quick brown fox jumped over the lazy dog
French:  Le rapide renard brun sauta au-dessus du chien paresseux
A formatting string with variables for color and name of the animal can be translated only by using such an argument numbering:
English: The quick %1 %2 jumped over the lazy dog
French:  Le rapide %2 %1 sauta au-dessus du chien paresseux
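The effect of argument numbering can be tried out with a small portable sketch. The function format_numbered() below is a hypothetical, simplified stand-in for what FormatMessage() does with %1 to %9: it handles string arguments only and ignores the !spec! syntax.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace %1..%9 with args[0]..args[8]; "%%" yields a literal '%'.
// Simplified sketch: string arguments only, no "!spec!" handling.
std::string format_numbered(const std::string& format,
                            const std::vector<std::string>& args) {
    std::string out;
    for (std::size_t i = 0; i < format.size(); ++i) {
        char c = format[i];
        if (c == '%' && i + 1 < format.size()) {
            char next = format[i + 1];
            if (next == '%') { out += '%'; ++i; continue; }
            if (next >= '1' && next <= '9') {
                std::size_t index = static_cast<std::size_t>(next - '1');
                if (index < args.size()) out += args[index];
                ++i;  // skip the digit
                continue;
            }
        }
        out += c;
    }
    return out;
}
```

In both languages the calling code passes the color first and the animal second, for instance {"brown", "fox"} with the English format and {"brun", "renard"} with the French one. Only the format string decides where each argument lands, so the translator can reorder the words without touching the source code.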
Format specification : FormatMessage/CFMsg vs. sprintf
If you're used to sprintf format specifiers (and you surely are!), don't worry about the syntax change in CFMsg. The new specifiers are really simple:
%1: The first parameter (string).
%2: The second parameter (string).
%n: The nth parameter (string).
%1!d!: The first parameter (decimal integer). Use the notation %n!spec! for the nth argument where you would use %spec with sprintf.
FormatMessage() arguments are not limited to strings!
People often believe that FormatMessage() arguments are always strings. This is wrong. You can use any sprintf-like specifier; the specifier simply must be enclosed in exclamation marks, e.g. %1!d! (replace 1 with the correct argument number).
Update: Well, not _every_ format specifier is supported: floating-point specifiers (e, E, f, and g) are not supported.
How to adapt your existing strings to take advantage of CFMsg?
That's simple. Number the arguments in their current order. If an argument is not %s, enclose the specifier in exclamation marks. For example, %s is %u years old becomes %1 is %2!u! years old.
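If you have many existing format strings, the renumbering rule above is mechanical enough to automate. The following is only a sketch under simplifying assumptions: number_arguments() is a hypothetical helper, it expects well-formed sprintf-style input without '*' widths, and it does not check whether the conversion letter is one that FormatMessage() actually supports.

```cpp
#include <cstddef>
#include <string>

// Convert a sprintf-style format to FormatMessage-style numbering:
// "%s" becomes "%N" and any other "%spec" becomes "%N!spec!".
std::string number_arguments(const std::string& format) {
    std::string out;
    int next_index = 1;
    std::size_t i = 0;
    while (i < format.size()) {
        if (format[i] != '%') { out += format[i++]; continue; }
        if (i + 1 < format.size() && format[i + 1] == '%') {
            out += "%%";  // literal percent sign, kept as is
            i += 2;
            continue;
        }
        // Skip flags, width, precision and length modifiers; the first
        // character that is none of those is taken as the conversion letter.
        std::size_t end = format.find_first_not_of("-+ #0123456789.hlzjt", i + 1);
        if (end == std::string::npos) { out += format.substr(i); break; }
        std::string spec = format.substr(i + 1, end - i);  // e.g. "s", "u", "05d"
        out += '%' + std::to_string(next_index++);
        if (spec != "s") out += '!' + spec + '!';
        i = end + 1;
    }
    return out;
}
```

Converted strings are still worth a manual review, since the translator may then want to reorder the numbered arguments.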
CFMsg is your friend even if you don't use the String table
We've seen that the first argument of the CFMsg() constructor is the ID of the formatting string in the String table. Actually, there are two versions of this constructor: one that takes a string ID and one that takes a string literal. It means that you can use CFMsg even if you don't intend to store the string in the String table.
SetWindowText(CFMsg(_T("The weather today in %1"), LPCTSTR(strCity)));
Using the code
Simply add Msg.h and Msg.cpp to your project.
Include Msg.h in the .cpp files where you use the classes. I recommend including Msg.h in stdafx.h, as you will most likely use CMsg and CFMsg in a lot of .cpp files.
Store every string literal in the String table (using the string editor) and replace it in the source code with CMsg(x), where x is the ID of the string you have just created.
The demo project
You can see CMsg and CFMsg in action in the demo dialog-based project. The top part of the dialog demonstrates the use of CMsg. It loads and displays a string whose number is specified in the ComboBox. The bottom part of the dialog demonstrates the use of CFMsg. User inputs are the arguments of the formatted string.
The zip file contains VC6 and VC7 project files. The compiled EXE enclosed in the zip file was compiled using VC .NET 2003, hence requires MFC71.dll. If you are using VC6, you may need to re-compile the project.
You don't always need CMsg
Be aware that some MFC functions or class members, such as AfxMessageBox(), exist in two flavors: one that takes an LPCTSTR argument and one that takes a string ID. In such a case, you don't even need CMsg:
AfxMessageBox(_T("Operation Failed."), MB_ICONERROR);
After extraction of the string, the same call simply passes the string ID instead of the literal.
However, you may want to keep using CFMsg if the string has to be formatted:
AfxMessageBox( CFMsg(IDS_ERROR, LPCTSTR(strError)), MB_ICONERROR);
There are a lot of string literals used in a program. Most of them are used only once.
Common practice leads us to think that strings exported to the String table should have a symbolic identifier rather than simply a raw numerical ID. However, this practice hits a limit when used with the String table: there are so many strings that finding an easy-to-use, self-explanatory and unique identifier for each string quickly becomes a nightmare. Developers have no other choice than to create identifiers that are more or less copies of the string literal, with adapted syntax (such as underscores instead of spaces). These identifiers are (very) long and particularly difficult to manipulate.
I suggest dropping identifiers for such strings (except for the ones used several times throughout the source code!). Instead of a symbol, use the raw numerical string ID in the source code and append a copy of the string as a comment at the end of the line. Take this call, for example:
pWnd->SetWindowText("Please enter your name");
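After extraction of the text to the String table, the call passes the raw numerical ID to CMsg instead of the literal, and the line ends with a comment that repeats the original text.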
I successfully used this method with thousands of string literals. It's easier and much quicker to code, yet more readable. Experience shows that teammates can very easily read each other's code when they consistently use this technique.
Of course, your own coding style and practice may vary.
There is no magic in CMsg and CFMsg. They are just a few lines of code. But they dramatically increase productivity when the time comes to extract string literals from the source code and store them in the String table, which is a mandatory task to prepare for localization of your app.