Posted 16 May 2008

UTF16 to UTF8 to UTF16 simple CString based conversion

Use CString to convert between UTF8 and UTF16.


To convert strings between UTF8 and UTF16 (as well as other encodings), Microsoft provides the MultiByteToWideChar and WideCharToMultiByte functions. These functions work on raw, null-terminated char and wchar_t buffers. Working with such buffers requires manual memory management, and if you use the functions extensively, your code can quickly become cluttered. That's why I decided to wrap these two functions for use with the more coder-friendly CString types.

The conversion functions


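CStringA UTF16toUTF8(const CStringW& utf16)
{
   CStringA utf8;
   // With a source length of -1, the returned size includes the terminating null.
   int len = WideCharToMultiByte(CP_UTF8, 0, utf16, -1, NULL, 0, 0, 0);
   if (len>1)
   {
      // GetBuffer(len-1) reserves len-1 characters plus room for the terminator.
      char *ptr = utf8.GetBuffer(len-1);
      if (ptr) WideCharToMultiByte(CP_UTF8, 0, utf16, -1, ptr, len, 0, 0);
      utf8.ReleaseBuffer(); // lets CString recompute its length
   }
   return utf8;
}

CStringW UTF8toUTF16(const CStringA& utf8)
{
   CStringW utf16;
   int len = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
   if (len>1)
   {
      wchar_t *ptr = utf16.GetBuffer(len-1);
      if (ptr) MultiByteToWideChar(CP_UTF8, 0, utf8, -1, ptr, len);
      utf16.ReleaseBuffer(); // lets CString recompute its length
   }
   return utf16;
}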

Using the code

Using the two helper functions is straightforward. Note, however, that they are only useful if your project is set to use the UNICODE character set. They also require Visual Studio 7.1 or later: Visual Studio 6.0 does not provide CStringA and CStringW, so the code will not compile there. The following snippet shows a usage example:

CStringW utf16(L"òèçùà12345"); // wide literal, so no ANSI code page conversion at run time
CStringA utf8 = UTF16toUTF8(utf16);
CStringW utf16_2 = UTF8toUTF16(utf8);
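
As a cross-check on what the round trip above should yield, here is a minimal portable sketch that performs the same two conversions by hand, without CString or the Windows API. The names EncodeUtf8, DecodeUtf8 and CheckSample are hypothetical and not part of the article's code. The sketch assumes well-formed input and substitutes U+FFFD for unpaired surrogates or stray bytes, which may differ from what WideCharToMultiByte and MultiByteToWideChar do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode UTF-16 code units as UTF-8 bytes.
std::string EncodeUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; // unpaired surrogate (assumption of this sketch)
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Continuation bytes are not validated; the input is assumed well formed.
std::u16string DecodeUtf8(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes that follow
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out += static_cast<char16_t>(0xFFFD); ++i; continue; }
        if (i + extra >= in.size()) { // truncated sequence at the end
            out += static_cast<char16_t>(0xFFFD);
            break;
        }
        for (std::size_t k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        i += extra + 1;
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            cp -= 0x10000; // split into a surrogate pair
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}

// Round trip on the article's sample string, written with escapes so the
// result does not depend on the encoding of the source file.
void CheckSample()
{
    const std::u16string sample = u"\u00F2\u00E8\u00E7\u00F9\u00E012345";
    const std::string utf8 = EncodeUtf8(sample);
    assert(utf8.size() == 15);
    assert(DecodeUtf8(utf8) == sample);
}
```

On the sample string, the ten UTF16 code units become 15 UTF8 bytes: two bytes for each of the five accented letters and one for each digit.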


After a comment by Ivo Beltchev, I decided to change the functions as he suggested. Initially, I designed the functions like this:

CStringA UTF16toUTF8(const CStringW& utf16)
{
  LPSTR pszUtf8 = NULL;
  CStringA utf8("");

  if (utf16.IsEmpty()) 
    return utf8; //empty input string

  size_t nLen16 = utf16.GetLength();
  size_t nLen8 = 0;

  if ((nLen8 = WideCharToMultiByte (CP_UTF8, 0, utf16, nLen16, 
                                    NULL, 0, 0, 0) + 2) == 2)
    return utf8; //conversion error!

  pszUtf8 = new char [nLen8];
  if (pszUtf8)
  {
    memset (pszUtf8, 0x00, nLen8);
    WideCharToMultiByte(CP_UTF8, 0, utf16, nLen16, pszUtf8, nLen8, 0, 0);
    utf8 = CStringA(pszUtf8);
  }

  delete [] pszUtf8;
  return utf8; //utf8 encoded string
}

CStringW UTF8toUTF16(const CStringA& utf8)
{
  LPWSTR pszUtf16 = NULL;
  CStringW utf16("");
  if (utf8.IsEmpty()) 
    return utf16; //empty input string

  size_t nLen8 = utf8.GetLength();
  size_t nLen16 = 0;

  if ((nLen16 = MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, NULL, 0)) == 0)
    return utf16; //conversion error!

  pszUtf16 = new wchar_t[nLen16 + 1]; //one extra element for the terminating null
  if (pszUtf16)
  {
    wmemset (pszUtf16, 0x00, nLen16 + 1);
    MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, pszUtf16, nLen16);
    utf16 = CStringW(pszUtf16);
  }

  delete [] pszUtf16; //free the temporary buffer, not the CString
  return utf16; //utf16 encoded string
}

These functions work just as well, but the current versions shown at the top of the article are shorter and slightly more efficient. Thanks to Ivo for the observation!


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

John Paul Pirau
Software Developer (Senior)
Romania
No Biography provided

Comments and Discussions

Question: There is no need to adjust len for GetBuffer!
Theo Buys, 15-Mar-17 4:02
Bug: Small bug ...
Tomice, 14-Jan-15 23:26
Question: Consider using CA2T and CT2A
kanalbrummer, 24-Nov-13 4:11
Answer: Re: Consider using CA2T and CT2A
Theo Buys, 13-Apr-15 6:15
Question: Traditional Chinese characters aren’t being read from network stream
Member 8648508, 22-Mar-12 18:53
General: My vote of 3
Dezhi Zhao, 13-Jan-11 5:55
General: Even more elegant!
Elmue, 23-Aug-08 11:37
General: Re: Even more elegant! [modified]
John Paul Pirau, 4-Sep-08 3:25
First of all, thank you very much for taking the time to respond to my article. But please .. mr Expert.. double check what you post before you do it. Your post has many flaws coming from both erroneous thinking and, most of all, lack of knowledge.

by the way .. UTF = Unicode Transformation Format

Elmue wrote:
You know in advance that the UTF string will NEVER be longer than 4 times the Unicode string

..both are Unicode. You should use UTF16 and UTF8 respectively to avoid confusion

Elmue wrote:
You know in advance that the UTF string will NEVER be longer than 4 times the Unicode string:

The conversion is between UTF16 and UTF8.. a UTF16 code unit is always 2 bytes (a character outside the Basic Multilingual Plane takes two of them), while a UTF8 character has a variable length of 1 to 4 bytes. Which brings us to the next point..

Elmue wrote:
It is not necessary to let the Windows API do the entire conversion twice.

You are wrong again.. Please read the documentation on WideCharToMultiByte found on MSDN. But maybe you don't have time.. so let me help you with a quote :

    [in] Size, in bytes, of the buffer indicated by lpMultiByteStr. If this parameter is set to 0, the function returns the required buffer size for lpMultiByteStr and makes no use of the output parameter itself.

Even though it's not very intuitive and definitely not optimal, you must call this function twice: first to find out the length, and then to perform the conversion.
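
To make the byte counts concrete, here is a small portable sketch of what a sizing pass has to compute for a UTF16 string. It is not the Windows implementation; Utf8SizeOf and CheckSizes are hypothetical names, the count excludes the terminating null, and an unpaired surrogate is counted as 3 bytes, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-8 bytes needed to encode a UTF-16
// string, not counting a terminating null.
std::size_t Utf8SizeOf(const std::u16string& utf16)
{
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        const char16_t u = utf16[i];
        if (u < 0x80) {
            bytes += 1;                      // ASCII
        } else if (u < 0x800) {
            bytes += 2;                      // e.g. accented Latin letters
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < utf16.size() &&
                   utf16[i + 1] >= 0xDC00 && utf16[i + 1] <= 0xDFFF) {
            bytes += 4;                      // surrogate pair, one character
            ++i;                             // skip the low surrogate
        } else {
            bytes += 3;                      // rest of the BMP
        }
    }
    return bytes;
}

// A few spot checks covering each encoded length.
void CheckSizes()
{
    assert(Utf8SizeOf(u"abc") == 3);
    assert(Utf8SizeOf(u"\u00F2") == 2);      // U+00F2, two bytes
    assert(Utf8SizeOf(u"\u20AC") == 3);      // U+20AC, three bytes
    assert(Utf8SizeOf(u"\U0001F600") == 4);  // U+1F600, four bytes
}
```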

And please.. before posting any code please ensure that at least it compiles first:

CStringA UTF16toUTF8(const CStringW& utf16)
{
   CStringA utf8;
   char *ptr = utf8.GetBuffer(utf16.GetLength()*4);
   if (ptr) WideCharToMultiByte(CP_UTF8, 0, utf16, -1, ptr, len, 0, 0);
   return utf8;
}

where is len defined here?

I don't want to make references to your programming skill.. you've proved enough by yourself :) . And please be more careful when posting a reply next time.

Thank you!

modified on Thursday, September 4, 2008 8:31 AM

General: Yes it is more elegant and it works!
Elmue, 9-Sep-08 17:05
General: Even more elegant?
Theo Buys, 15-Mar-17 6:40

Last Updated 16 May 2008
Article Copyright 2008 by John Paul Pirau
Everything else Copyright © CodeProject, 1999-2017