Hi,
I have a sequence of UTF-16 code units, written as hex digits, which I need to convert into characters.
Example: ‘004204240031’ converts to “BФ1” (0042=’B’, 0424=’Ф’, and 0031=’1’), which I need to display in, say, a MessageBox.
Do I need to split the sequence into four-digit parts and convert them one by one, or is there an API I can use to convert the whole sequence at once?
A code snippet would be of great help. Thanks in advance.
P.S.
I use MFC in VS 6.0.
Comments
Code-o-mat 21-May-12 15:40pm    
Do you mean you have a string representation of the byte values, or the byte values themselves? So basically, you have a UTF-16 string and you want to convert it to ASCII or some such?
Sergey Alexandrovich Kryukov 21-May-12 17:40pm    
What do you mean by "characters"? UTF-16 text is a sequence of 16-bit words, each either representing a single character on its own or (rarely) forming one half of a two-word surrogate pair that represents a single character with a code point above the BMP (Basic Multilingual Plane). That's it.
--SA
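
To make the surrogate-pair rule concrete, here is a minimal sketch in portable C++ that turns 16-bit code units into code points. The function name `decodeUtf16` is hypothetical, and replacing an unpaired surrogate with U+FFFD is an assumption of this sketch, not something stated in the thread.

```cpp
#include <cstddef>
#include <vector>

// Decode UTF-16 code units into Unicode code points.
// Assumption of this sketch: an unpaired surrogate becomes U+FFFD.
std::vector<unsigned long> decodeUtf16(const std::vector<unsigned short>& units)
{
    std::vector<unsigned long> codePoints;
    for (std::size_t i = 0; i < units.size(); ++i) {
        unsigned long u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 < units.size() &&
                units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
                unsigned long low = units[i + 1];
                codePoints.push_back(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00));
                ++i; // the low surrogate has been consumed
            } else {
                codePoints.push_back(0xFFFD);
            }
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            codePoints.push_back(0xFFFD);
        } else {
            // Ordinary BMP character: the code unit is the code point.
            codePoints.push_back(u);
        }
    }
    return codePoints;
}
```

For the strings in this question every code unit is below 0xD800, so each unit maps to exactly one character.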

I agree that using any of these tables needs more memory, but so what?
In return, you save considerable time when searching for and fetching the desired sub-string.
For example, a hash table lookup is O(1) on average. Is that bad?
http://en.wikipedia.org/wiki/Search_data_structure#Asymptotic_amortized_worst-case_analysis
For find() on a map, the complexity is O(log n). I think that is a good enough result, isn't it?

You can certainly use a database and/or an XML or TXT file.
But I am not sure that would be a good solution.

Good luck.

Alex.
 
 
You can use the library function WideCharToMultiByte(), which converts a UTF-16 string to a multi-byte encoding. I also wrote a Tip about this that may help you.
 
 
You don't need to convert anything. You need a function that displays it correctly, for instance MessageBoxW:
C++
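// UTF-16 code units for "BФ1", terminated by 0x0000.
const unsigned short s[] = { 0x0042, 0x0424, 0x0031, 0x0000 };

// The W variant of MessageBox takes UTF-16 text directly.
MessageBoxW(NULL, (LPCWSTR) s, L"Test", MB_OK);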
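
If the input arrives as a text string of hex digits, as in the question, the four-digit groups still have to be turned into 16-bit values first. The following is a minimal sketch of that step in portable C++; the names `hexValue` and `parseHexUtf16` are hypothetical, and it assumes the string contains only hex digits with no delimiters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Convert one hex digit to its value, or return -1 if it is not a hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parse a string such as "004204240031" into UTF-16 code units and append
// a terminating 0x0000. Returns false (and leaves 'out' empty) if the length
// is not a multiple of four or a non-hex character is found.
bool parseHexUtf16(const std::string& hex, std::vector<unsigned short>& out)
{
    out.clear();
    if (hex.size() % 4 != 0) return false;
    for (std::size_t i = 0; i < hex.size(); i += 4) {
        unsigned int unit = 0;
        for (std::size_t j = 0; j < 4; ++j) {
            int v = hexValue(hex[i + j]);
            if (v < 0) { out.clear(); return false; }
            unit = unit * 16 + v;
        }
        out.push_back(static_cast<unsigned short>(unit));
    }
    out.push_back(0); // null terminator so the buffer can be used as a C string
    return true;
}
```

On Windows, where wide characters are 16 bits, the resulting buffer could then be passed to MessageBoxW as `(LPCWSTR) &out[0]`, in the same way as the array `s` above.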
 
 
Hi Josip.
1) To solve your problem, you can use a lookup table.
Please see here:
http://en.wikipedia.org/wiki/Lookup_table

http://stackoverflow.com/questions/1751696/look-up-tables-in-c

and here
http://martin-bell.suite101.com/ascii-codes-for-common-keyboard-characters-a127559

If the string is long, split it into four-digit groups and look up the symbol for each group in the table. At the end you will have to "glue" these symbols together.
Ok?

2) Another solution is to use a hash table.
The STL has a map, which offers the same kind of key-to-value lookup.
Note that map is usually implemented as a red-black tree rather than a hash table, so lookups are O(log n).

For example :
C++
#include <map>
#include <string>

// Maps a code unit value to the string it stands for.
std::map<long, std::string> myMap;
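
As a rough sketch of how such a map might be filled and queried: the names `CodeTable`, `makeTable` and `lookup` are hypothetical, the three entries are only illustrative, and storing the values as UTF-8 bytes is an assumption made here so that they fit in a narrow string.

```cpp
#include <map>
#include <string>

typedef std::map<long, std::string> CodeTable;

// Build a tiny illustrative table. A real table would need one entry
// per supported character, filled in by hand or loaded from a file.
CodeTable makeTable()
{
    CodeTable t;
    t[0x0042] = "B";
    t[0x0424] = "\xD0\xA4"; // U+0424 (Cyrillic Ef) encoded as UTF-8
    t[0x0031] = "1";
    return t;
}

// Look up one code unit; find() on std::map is O(log n).
// Returns 'fallback' when the code is not in the table.
std::string lookup(const CodeTable& table, long code, const std::string& fallback)
{
    CodeTable::const_iterator it = table.find(code);
    return it != table.end() ? it->second : fallback;
}
```

A call such as `lookup(makeTable(), 0x0042, "?")` yields "B", while a code that is missing from the table yields the fallback.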



Regards,
Alex.

P/S. Don't forget to vote if it helps you
 
 
Comments
josip cagalj 22-May-12 2:47am    
Thanks for your input.
To clarify, I have a string “0449011100320110” which I need to convert from hex (two bytes represent each character) into text -> “щđ2Đ”.
See here: http://textmechanic.com/ASCII-Hex-Unicode-Base64-Converter.html if you put “0449011100320110” (watch for delimiters) into bottom box and click on ‘Decode Hex into ASCII’ button the upper box will display converted string ‘щđ2Đ’. That is what I need.
If I use a lookup table I would need to populate it myself first (hardcode I presume, and look how many there are: http://www.columbia.edu/kermit/ucs2.html ) so I could seek for match.
Any further suggestion?

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


