I have created a text file with UTF-8 encoding and written some Japanese characters in it. Now I want to read this file, display its content on the console, and store the data in another file.
Comments
Jochen Arndt 13-Nov-14 4:53am    
The answer to this question is OS-dependent, because you need to call system or external library functions to convert between the encoding of the file and the encodings used by your application and by the console.
Member 10168792 13-Nov-14 5:41am    
Thanks,
but I am not familiar with MultiByteToWideChar or WideCharToMultiByte. Can you show their use with a small example?
Member 10168792 13-Nov-14 4:56am    
Currently I am using Windows 7 Enterprise.

1 solution

The Microsoft SDK provides two functions to convert between character encodings: MultiByteToWideChar and WideCharToMultiByte.

To simplify the code of your app, you should build it as a Unicode application (which is the default with recent Visual Studio versions).

Use MultiByteToWideChar to convert a UTF-8 string to wide chars. To print this to the console, it may be necessary to convert the string to the encoding used by the console (query it with GetConsoleOutputCP). If the code page used by the console cannot represent your Japanese characters, you can change it with SetConsoleOutputCP. In all cases you must ensure that the font used by the console contains the characters you want to print.
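
As a hedged alternative to the code page route described above, WriteConsoleW accepts wide characters directly, so no conversion to the console code page is needed. The sketch below assumes the text is already held in a std::wstring; PrintWideToConsole is a name made up for this illustration. The font requirement still applies, and the #ifdef _WIN32 guard only keeps the sketch from breaking a non-Windows build.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>

// Writes UTF-16 text straight to the console, bypassing the output code page.
// Returns false when standard output is not a console (for example when it is
// redirected to a file) or when the write itself fails.
// PrintWideToConsole is a hypothetical helper name, not an SDK function.
bool PrintWideToConsole(const std::wstring& text)
{
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    if (out == NULL || out == INVALID_HANDLE_VALUE || !GetConsoleMode(out, &mode))
        return false;   // not attached to a console

    DWORD written = 0;
    return WriteConsoleW(out, text.c_str(), static_cast<DWORD>(text.size()),
                         &written, NULL) != 0;
}
#endif
```

When the function returns false because the output is redirected, converting the string to a byte encoding and writing the bytes is the usual fallback.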

When writing to a file, you are free to use any encoding. The choice depends mainly on the applications that will open the file.
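
For the file output, here is a minimal sketch of the reverse conversion with WideCharToMultiByte, assuming the text is held as UTF-16 in a std::wstring. WideToBytes and SaveText are names invented for this illustration, and the code page passed in (CP_UTF8, or 932 for Shift-JIS) is your choice. The #ifdef _WIN32 guard only keeps the sketch from breaking a non-Windows build.

```cpp
#include <fstream>
#include <string>
#ifdef _WIN32
#include <windows.h>

// Converts UTF-16 text to the given code page (e.g. CP_UTF8).
// Returns an empty string for empty input or on failure.
// WideToBytes is a hypothetical helper name.
std::string WideToBytes(const std::wstring& wide, UINT codePage)
{
    if (wide.empty()) return std::string();
    const int len = static_cast<int>(wide.size());

    // First call: no output buffer, so the return value is the byte count needed.
    const int needed = WideCharToMultiByte(codePage, 0, wide.data(), len,
                                           NULL, 0, NULL, NULL);
    if (needed <= 0) return std::string();

    // Second call: same input, now with the output buffer and its size.
    std::string bytes(needed, '\0');
    const int written = WideCharToMultiByte(codePage, 0, wide.data(), len,
                                            &bytes[0], needed, NULL, NULL);
    if (written != needed) return std::string();
    return bytes;
}

// Hypothetical helper: stores the text in 'path' using the chosen code page.
bool SaveText(const char* path, const std::wstring& text, UINT codePage)
{
    const std::string bytes = WideToBytes(text, codePage);
    if (bytes.empty() && !text.empty()) return false;   // conversion failed
    std::ofstream out(path, std::ios::binary);           // binary: no newline translation
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return out.good();
}
#endif
```

A call such as SaveText("out.txt", text, CP_UTF8) would write the text back as UTF-8 without a byte order mark.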

[EDIT according to the comment posted above]
You may have a look at the tip "Handling simple text files in C/C++" for an example.
The general process is:

  • Get the size of the UTF-8 file
  • Allocate a buffer for the UTF-8 text
  • Open the file, read the content into the buffer, close the file
  • Call MultiByteToWideChar with CP_UTF8, lpMultiByteStr = input buffer, cbMultiByte = file size, lpWideCharStr = NULL, cchWideChar = 0 to get the required length of the wide char buffer
  • Allocate the wide char buffer using the value returned by the above call
  • Call MultiByteToWideChar again, now passing the output buffer and its size
  • Do something with the wide string like printing to console
  • Delete the buffers if no longer needed
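
The two MultiByteToWideChar calls from the list above can be sketched as follows, assuming the file content has already been read into a std::string. Utf8ToWide is a name made up for this illustration; the byte order mark check is an extra step not listed above, added because some editors write one at the start of UTF-8 files and the conversion would otherwise keep it as a character. The #ifdef _WIN32 guard only keeps the sketch from breaking a non-Windows build.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>

// Converts UTF-8 bytes to UTF-16. Returns an empty string for empty input
// or on failure. Utf8ToWide is a hypothetical helper name.
std::wstring Utf8ToWide(const std::string& utf8)
{
    const char* data = utf8.data();
    int size = static_cast<int>(utf8.size());

    // Skip a UTF-8 byte order mark (EF BB BF) if the editor wrote one.
    if (size >= 3 && utf8.compare(0, 3, "\xEF\xBB\xBF") == 0) {
        data += 3;
        size -= 3;
    }
    if (size == 0) return std::wstring();

    // First call: lpWideCharStr = NULL and cchWideChar = 0, so the return
    // value is the required length in wide characters.
    const int needed = MultiByteToWideChar(CP_UTF8, 0, data, size, NULL, 0);
    if (needed <= 0) return std::wstring();

    // Second call: same input, now with the output buffer and its size.
    std::wstring wide(needed, L'\0');
    const int written = MultiByteToWideChar(CP_UTF8, 0, data, size,
                                            &wide[0], needed);
    if (written != needed) return std::wstring();
    return wide;
}
#endif
```

Because the input size is passed explicitly, the input does not have to be zero-terminated, and the result contains no terminating character of its own. Using std::string and std::wstring also takes care of the last step, since the buffers are released automatically.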

If you also want to use the UTF-8 file content for other purposes (for example, as a C string), you must allocate one extra byte and set it to zero. This is not necessary when you only call MultiByteToWideChar and pass the correct size.
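
A minimal sketch of reading the file with that extra zero byte, using only the standard C++ library. ReadFileZeroTerminated is a name invented for this illustration, and returning an empty vector on failure is just one possible convention.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Reads a whole file into a buffer that is one byte larger than the file,
// with the extra byte set to zero. Returns an empty vector if the file
// cannot be opened or read. ReadFileZeroTerminated is a hypothetical name.
std::vector<char> ReadFileZeroTerminated(const char* path)
{
    // Open in binary mode, positioned at the end, so tellg() gives the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return std::vector<char>();

    const std::streamoff size = in.tellg();
    if (size < 0) return std::vector<char>();

    // One byte more than the file size; every byte starts out as zero.
    std::vector<char> buffer(static_cast<std::size_t>(size) + 1, '\0');

    in.seekg(0, std::ios::beg);
    if (size > 0 && !in.read(buffer.data(), size))
        return std::vector<char>();

    return buffer;   // buffer.size() - 1 is the number of UTF-8 bytes
}
```

The returned buffer.data() can then be handed to functions that expect a zero-terminated string, while buffer.size() - 1 is the byte count to pass as cbMultiByte.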
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


