The Microsoft SDK provides two functions to convert between character encodings: MultiByteToWideChar and WideCharToMultiByte.
To simplify the code of your app, you should build it as a Unicode application (which is the default with recent Visual Studio versions).
Use MultiByteToWideChar to convert a UTF-8 string to wide chars. To print the result to the console, it may be necessary to convert the string to the encoding used by the console (call GetConsoleOutputCP to query it). If the code page used by the console cannot represent your Japanese characters, you may change the code page using SetConsoleOutputCP. In all cases you must ensure that the font used by the console contains the characters you want to print.
When writing to a file, you are free to use any encoding. The choice depends mainly on the applications that should open the file.
[EDIT according to the comment posted above]
You may have a look at the tip Handling simple text files in C/C++ for an example.
The general process is:
- Get the size of the UTF-8 file
- Allocate a buffer for the UTF-8 text
- Open the file, read the content into the buffer, close the file
- Call MultiByteToWideChar with CP_UTF8, lpMultiByteStr = input buffer, cbMultiByte = file size, lpWideCharStr = NULL, and cchWideChar = 0 to get the required length of the wide char buffer
- Allocate the wide char buffer using the value returned by the above call
- Call MultiByteToWideChar again, now passing the output buffer and its size
- Do something with the wide string, like printing it to the console
- Delete the buffers if no longer needed
If you also want to use the UTF-8 file content for other purposes, you must allocate one byte more and set that byte to zero so that the buffer is a valid null-terminated string. This is not necessary when only using MultiByteToWideChar and passing the correct size.
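
As a small illustration of the extra byte, a reading routine could allocate the buffer one byte larger than the file and leave that last byte at zero. The function name ReadFileZeroTerminated is hypothetical, and reading through std::ifstream is just one way to do it.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: reads a whole file into buffer and appends a zero byte,
// so that buffer.data() can be passed to functions expecting a C string.
// The file content itself occupies buffer.size() - 1 bytes.
bool ReadFileZeroTerminated(const char* path, std::vector<char>& buffer)
{
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file)
        return false;

    const std::streamoff size = file.tellg(); // opened at end: position == size
    if (size < 0)
        return false;

    // One byte more than the file size; every byte starts out as zero,
    // so the byte after the content is already the terminator.
    buffer.assign(static_cast<std::size_t>(size) + 1, '\0');

    file.seekg(0, std::ios::beg);
    if (size > 0 && !file.read(buffer.data(), size))
        return false;
    return true;
}
```

When the buffer is later passed to MultiByteToWideChar, the size to pass is still the file size (buffer.size() - 1), not the size including the terminator, unless the terminator should be converted as well.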