See more: C++ C++/CLI C
Hello,

I have a dataset that contains data in the Greek character set, e.g.:
Πριγκηποννήσου
Κομνηνών
Komninon
Διαγόρα
Diagora
Καλλιόπης
Kalliopis
Πάροδος Νίκαιας
Parodos Nikaias

These get converted to ??????? when saved from a C# string object to unsigned char in C.

The same issue occurs with Russian.

Please suggest some thoughts on this.

Thanks,
Sachin
Posted 27 Dec '12 - 20:38

Comments
Sergey Alexandrovich Kryukov - 28 Dec '12 - 2:53
There is no difference between languages in this respect. .NET itself uses Unicode. It never loses any text data; you do. There are many ways to screw things up, including Unicode text. You need to show a code sample manifesting your problem if you need help. In the meantime, I tell you: use only Unicode and Unicode UTFs for encoding (typically UTF-8), and nothing else. This is the key. —SA
Sergey Alexandrovich Kryukov - 28 Dec '12 - 3:02
I see. Just use a tiny bit of logic... —SA

1 solution

I just noticed: you say that you save the text in C unsigned char. What do you want then?!
 
Unicode code points in the subset covering Greek and Russian require up to 16 bits per character. The Unicode subset of the first 2^16 (65,536) code points is called the Basic Multilingual Plane. Other code points require more than 16 bits, which is also supported by Unicode. All Unicode UTFs support the full set of code points; usually it's UTF-8, UTF-16 and, rarely, UTF-32. (No, UTF-8 does not mean 8 bits per character; UTF-16 does not mean 16 bits per character; in both, the encoded size of a character is variable.) Internally, .NET represents strings as UTF-16LE.
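To make the variable size concrete, here is a minimal sketch in C of encoding a single code point as UTF-8. The function name utf8_encode is hypothetical, chosen for illustration only; it is not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   Writes 1 to 4 bytes into out (which must have room for 4) and returns
   the number of bytes written, or 0 if cp is not a valid scalar value
   (a surrogate, or anything above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* ASCII: 1 byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* includes Greek and Cyrillic: 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)      /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                    /* rest of the BMP: 3 bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* beyond the BMP: 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

With this sketch, the Greek capital Π (U+03A0) comes out as the two bytes CE A0, while a Latin 'A' stays a single byte. That is why one unsigned char per character cannot work for Greek or Russian text.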
 
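C char is only 8 bits on practically every platform, so a single unsigned char cannot hold a Greek or Cyrillic character. No way. Converting to a legacy 8-bit code page that lacks these characters typically replaces each of them with '?'. Do something more reasonable: keep the text in 16-bit UTF-16 code units, or store it as a sequence of UTF-8 bytes. Forget about your legacy 8-bit encodings; they are gone.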
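If the C side really must use unsigned char, one workable option is to treat the buffer as UTF-8 bytes rather than as one character per element. The following is a rough sketch, not a definitive implementation. It assumes the managed side hands the string over as UTF-16 code units (for example, a P/Invoke declaration with CharSet.Unicode), and the function name utf16_to_utf8 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n UTF-16 code units to NUL-terminated UTF-8.
   Returns the number of bytes written (not counting the NUL), or (size_t)-1
   if the input has an unpaired surrogate or dst (capacity cap) is too small. */
size_t utf16_to_utf8(const uint16_t *src, size_t n, unsigned char *dst, size_t cap)
{
    size_t o = 0;
    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return (size_t)-1;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {          /* high surrogate */
            uint32_t lo;
            if (i + 1 >= n)
                return (size_t)-1;
            lo = src[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            i++;                                     /* consumed two units */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {   /* stray low surrogate */
            return (size_t)-1;
        }
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= cap)                         /* keep room for the NUL */
            return (size_t)-1;
        if (need == 1) {
            dst[o++] = (unsigned char)cp;
        } else if (need == 2) {
            dst[o++] = (unsigned char)(0xC0 | (cp >> 6));
            dst[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[o++] = (unsigned char)(0xE0 | (cp >> 12));
            dst[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            dst[o++] = (unsigned char)(0xF0 | (cp >> 18));
            dst[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            dst[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    dst[o] = '\0';
    return o;
}
```

Note that the output buffer needs up to three bytes per input code unit plus one for the terminator, and that anything reading the buffer later has to interpret it as UTF-8, not as a single-byte code page.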
 
If you have a particular problem, feel free to ask a follow-up question.
 
—SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 28 Dec 2012
Copyright © CodeProject, 1999-2013
All Rights Reserved.