
Introduction
Some countries and languages standardize on number and date formats that don't translate smoothly between cultures. C++/Windows developers need strategies and techniques to handle this challenge, along with others presented by diverging sets of localization API functions. The CtrSynch sample app illustrates how to keep the Windows API locale in synch with the C-runtime (CRT) locale so that functions like LoadString are in step with conversion routines like _tprintf.
Background
Have you ever encountered a situation where you need to read a double/floating point value from text formatted in another locale? For example, the number 1023.54 displays in English-US as 1,023.54 and in German-Germany as 1.023,54. This problem comes up often when sharing text-based information generated in Europe (Germany, France, Spain) and consumed in the US. The reverse is also true.
Say a German company exports a tab-separated values (TSV) file from a spreadsheet on a workstation running in the German-Germany locale. The file is emailed to an American firm, where values like 1.023,54 import as a decimal number between 1 and 2 rather than as 1023.54. This is a very common scenario.
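To make the failure concrete, here is a minimal sketch in portable C++. It assumes the reader's numeric locale behaves like the CRT's "C" locale, which uses '.' as the decimal separator just as English-US does. parseWithDotDecimal is a hypothetical helper, not part of the sample.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses text the way a reader whose decimal
// separator is '.' would, then restores the previous numeric locale.
double parseWithDotDecimal(const char* text)
{
    // Passing nullptr as the name only queries the current setting.
    const char* current = std::setlocale(LC_NUMERIC, nullptr);
    const std::string saved = current ? current : "C";

    std::setlocale(LC_NUMERIC, "C");
    const double value = std::strtod(text, nullptr);
    std::setlocale(LC_NUMERIC, saved.c_str());
    return value;
}
```

Called with "1.023,54", the function returns 1.023: strtod stops at the first character it cannot interpret, which here is the comma.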
The first step in properly transferring double values (or dates formatted by locale defaults) is to include a locale identifier in the data. This can be accomplished using a file header, an LCID field in each data row, embedded logic in the file name, and so on. In my simple example, I just wrote a method to pack an LCID onto the end of the string containing the number. Conversely, I wrote a routine to parse it back out before reading the number. The final issue is the actual conversion of the text to doubles. My first instinct was to run SetThreadLocale, run the _tcstod function on the text, then return to the previous thread locale.
It doesn't work! I spent a lot of time trying to figure this out, and I hope to save you the effort!
It turns out that the C-runtime routine _tcstod (strtod in ANSI, wcstod in UNICODE) gets its locale context from the C-runtime function setlocale. SetThreadLocale does not talk to setlocale. Therefore, calling SetThreadLocale without calling setlocale puts you in a situation where LoadString will load from the current thread locale, but _tprintf will still format in whatever locale the CRT is currently set to (the one in effect when the application started, unless setlocale has been called since). So, should you just call setlocale at the same time you call SetThreadLocale?
Well, I wish it were that simple! Here is what must happen in your code to keep the thread locale in step with the CRT's locale:
SetThreadLocale(1033);
setlocale(LC_ALL, "English_USA.1252");
You probably see the problem: the two functions consume very different input parameters. After struggling with this, I found the solution is actually quite simple. It just required digging in the Windows API a bit. setlocale takes two parameters, and the second is a string made of three tokens in the form Language_Country.CodePage: a language, a country or region, and a code page identifier. It turns out these three values are readily acquired through the Windows API function GetLocaleInfo. Therefore, given an LCID value, one may call GetLocaleInfo to find its language name (in English), its region (in English), and its code page. A snippet:
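LPCTSTR CCrtLocaleSwitch::loadLocaleId(LCID lcid, _bstr_t& bstrRetBuf)
{
    // Builds the "Language_Country.CodePage" string that setlocale expects.
    TCHAR arcBuf[128];

    // Token 1: the language name in English, e.g. "German".
    memset(arcBuf, 0, sizeof(arcBuf));
    GetLocaleInfo( lcid, LOCALE_SENGLANGUAGE, arcBuf, 127);
    bstrRetBuf = arcBuf;

    // Token 2: the country or region name in English, e.g. "Germany".
    memset(arcBuf, 0, sizeof(arcBuf));
    GetLocaleInfo( lcid, LOCALE_SENGCOUNTRY, arcBuf, 127);
    if( *arcBuf )
    {
        bstrRetBuf += TEXT("_");
        bstrRetBuf += arcBuf;
    }

    // Token 3: the ANSI code page, falling back to the OEM code page.
    memset(arcBuf, 0, sizeof(arcBuf));
    if( (GetLocaleInfo( lcid, LOCALE_IDEFAULTANSICODEPAGE, arcBuf, 127)
        || GetLocaleInfo( lcid, LOCALE_IDEFAULTCODEPAGE, arcBuf, 127))
        && *arcBuf )
    {
        bstrRetBuf += TEXT(".");
        bstrRetBuf += arcBuf;
    }

    // _bstr_t converts implicitly to a const character pointer.
    return bstrRetBuf;
}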
The function above creates a string that is acceptable to setlocale. This allows you to keep the C-runtime's locale in synch with the Windows API locale state.
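With such a string in hand, a conversion might look like the sketch below. parseDoubleInLocale is a hypothetical helper, not part of the sample; it takes the setlocale string as a parameter, so on Windows it would be passed something like "German_Germany.1252". It checks the return value of setlocale, which is null when the CRT does not recognize the name. Note that strtod honors the locale's decimal separator but does not skip digit-grouping separators, so text such as 1.023,54 would need its grouping characters removed first.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts text to a double under the given CRT
// numeric locale, restoring the previous locale before returning.
// Returns false if the locale name is rejected or the text is not
// consumed completely.
bool parseDoubleInLocale(const std::string& text,
                         const char* crtLocaleName,
                         double& value)
{
    const char* current = std::setlocale(LC_NUMERIC, nullptr);
    const std::string saved = current ? current : "C";

    if (std::setlocale(LC_NUMERIC, crtLocaleName) == nullptr)
        return false;  // name not supported; the locale is unchanged

    char* end = nullptr;
    const double parsed = std::strtod(text.c_str(), &end);
    std::setlocale(LC_NUMERIC, saved.c_str());

    // Reject empty input and trailing characters strtod could not read.
    if (end == text.c_str() || *end != '\0')
        return false;

    value = parsed;
    return true;
}
```

Only LC_NUMERIC is switched here because number parsing is all that is needed; the article's own example uses LC_ALL.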
One final note regarding the sample application: the sample classes are designed to restore state when they go out of scope. Regardless of how you exit a function, whether by a normal return or by an exception, the previous locale will be restored. For brevity, I did not always check return values when calling Windows API or CRT functions, so please bear with my laziness!
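The restore-on-scope-exit idea can be sketched for the CRT side alone as follows. ScopedCrtLocale is a hypothetical name, not the sample's CTempLocale, and a fuller version would save and restore the thread locale in the same constructor and destructor.

```cpp
#include <clocale>
#include <string>

// Hypothetical RAII guard: switches one CRT locale category in the
// constructor and puts the previous setting back in the destructor,
// which also runs during exception unwinding.
class ScopedCrtLocale
{
public:
    ScopedCrtLocale(int category, const char* name)
        : category_(category), changed_(false)
    {
        // Copy the current name; the CRT may reuse the returned buffer.
        const char* current = std::setlocale(category, nullptr);
        if (current)
            saved_ = current;
        changed_ = std::setlocale(category, name) != nullptr;
    }

    ~ScopedCrtLocale()
    {
        if (!saved_.empty())
            std::setlocale(category_, saved_.c_str());
    }

    // True if the CRT accepted the requested locale name.
    bool changed() const { return changed_; }

    ScopedCrtLocale(const ScopedCrtLocale&) = delete;
    ScopedCrtLocale& operator=(const ScopedCrtLocale&) = delete;

private:
    int category_;
    std::string saved_;
    bool changed_;
};
```

Declaring a guard at the top of a block, for example `ScopedCrtLocale guard(LC_NUMERIC, "C");`, keeps the switch and the restore in one place, and changed() offers a way to check the result that the sample skips for brevity.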
Using the code
The sample application was written in Visual C++ 7.1. The two main reusable classes, CTempLocale and CSmartBuf, should be compatible with other compilers. The simplest way to use these classes is to put them in a folder in your header file search path, then add the following to your stdafx.h file:
#include <comdef.h>
#include <TempLocale.h>
The application itself is rather useless, but it illustrates keeping the CRT in synch with the Windows thread locale. When you select a new culture on the left, the window caption changes to "Hello World" in the selected language. Since MFC's CString internally calls LoadString, this functionality gets its locale from the Windows SetThreadLocale function. At the time the caption changes, the number in the lower left is reformatted per the selected locale, and that formatting gets its locale from the last call to the CRT setlocale function. On the right, you can select a target culture to translate the displayed number into, and it is displayed in the lower right. This lower-right number illustrates one possible way to attach LCID info to text containing a decimal number.
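One possible way to attach an LCID to number text is sketched below. The '|' separator and the names appendLcid and splitLcid are assumptions made for illustration; the sample's own packing format may differ.

```cpp
#include <cstdlib>
#include <string>

// Assumed separator: '|' is not a digit, sign, decimal separator, or
// grouping character in common number formats.
const char kLcidSeparator = '|';

// Appends the LCID, in decimal, to the end of the number text.
std::string appendLcid(const std::string& numberText, unsigned long lcid)
{
    return numberText + kLcidSeparator + std::to_string(lcid);
}

// Splits packed text back into the number text and the LCID.
// Returns false if no well-formed LCID suffix is present.
bool splitLcid(const std::string& packed,
               std::string& numberText,
               unsigned long& lcid)
{
    const std::string::size_type pos = packed.rfind(kLcidSeparator);
    if (pos == std::string::npos || pos + 1 == packed.size())
        return false;

    // The suffix must consist of decimal digits only.
    const std::string tail = packed.substr(pos + 1);
    if (tail.find_first_not_of("0123456789") != std::string::npos)
        return false;

    numberText = packed.substr(0, pos);
    lcid = std::strtoul(tail.c_str(), nullptr, 10);
    return true;
}
```

For example, appendLcid("1.023,54", 1031) yields "1.023,54|1031", where 1031 is the LCID for German-Germany. The receiving side can call splitLcid first and then choose the locale for the conversion from the recovered LCID.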
Conclusion
The Windows API provides routines to load resources in the current thread's locale. The CRT provides routines to convert numbers to text and back again. The two sets of APIs don't share a locale status; therefore, C++ developers must build a way to keep them in synch. This article demonstrated a way to handle this task.
History