Hi Everyone,

Greetings of the day!

I would like to know how to convert text from one character set to another in C#.

For example: I have a UTF-8 text file containing characters such as ߣ, Ј and ◙, and I would like to convert them into hexadecimal entities. I am reading the text from the file with StreamReader.

Please advise.

Every comment is helpful and appreciated, so please do comment.

Thanks in advance!

Warm regards,
HK
Comments
Richard MacCutchan 14-Apr-15 4:04am    
Just take the value and format using the hex format type.
Himanshu Kimni 14-Apr-15 5:10am    
Thanks for your comment, Richard. Could you please provide some sample code?
Richard MacCutchan 14-Apr-15 5:21am    
This is not exactly difficult. Read the file as a byte array, iterate through the array displaying each byte in hex by using the appropriate format type (X), as described in the documentation.
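The steps described above could be sketched roughly as follows. This is only a minimal illustration, not the commenter's own code: the class name `HexDump`, the helper `ToHex`, and the temporary file standing in for the poster's text file are all assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

class HexDump
{
    // Hypothetical helper: formats each byte as two uppercase hex digits ("X2"),
    // separated by spaces.
    public static string ToHex(byte[] bytes)
    {
        return string.Join(" ", bytes.Select(b => b.ToString("X2")));
    }

    static void Main()
    {
        // Assumption: a temp file stands in for the poster's UTF-8 text file.
        // The escapes are the three sample characters from the question.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "\u07E3 \u0408 \u25D9", new UTF8Encoding(false));

        // Read the file as a byte array and show every byte in hex.
        byte[] bytes = File.ReadAllBytes(path);
        Console.WriteLine(ToHex(bytes)); // prints "DF A3 20 D0 88 20 E2 97 99"

        File.Delete(path);
    }
}
```

Note that this shows the raw UTF-8 bytes of the file, not the Unicode code points of the characters, so a single character may appear as two or three hex values.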
Himanshu Kimni 14-Apr-15 5:45am    
string utfString = "déj\xa0(’)";

// Create two different encodings.
Encoding ascii1 = Encoding.ASCII;
Encoding utf = Encoding.UTF8;

// Convert the string into a byte array.
byte[] utfBytes = utf.GetBytes(utfString);

// Perform the conversion from one encoding to the other.
byte[] ascii1Bytes = Encoding.Convert(utf, ascii1, utfBytes);

// Decode the converted bytes back into a string.
string ascii1String = ascii1.GetString(ascii1Bytes);

// Display the strings created before and after the conversion.
Console.WriteLine("Original string: {0}", utfString);
Console.WriteLine("Ascii converted string: {0}", ascii1String);
Console.ReadLine();

But instead of the expected output "déj\xa0(')", it gives the wrong output "dAcjA (â?T)".
Richard MacCutchan 14-Apr-15 5:51am    
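I am not sure what you are trying to do here, but converting UTF-8 text to ASCII will often leave you with different characters. The problem is that Unicode has a far larger character set than ASCII, so characters outside the 7-bit ASCII range have no ASCII equivalent, and many different characters end up mapped to the same ASCII one ("?" in the case of Encoding.ASCII).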
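If the aim is the hexadecimal entities mentioned in the question rather than an ASCII conversion, one possible approach is to leave ASCII characters alone and write every other character as a numeric reference. The sketch below assumes "hexadecimal entities" means XML/HTML-style references of the form `&#xHHHH;`; the class `EntityEncoder` and method `ToHexEntities` are hypothetical names, not part of .NET.

```csharp
using System;
using System.Text;

class EntityEncoder
{
    // Hypothetical helper: keeps ASCII characters as they are and writes every
    // other code point as a hexadecimal reference such as &#x25D9;.
    public static string ToHexEntities(string text)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint;
            // Characters beyond U+FFFF are stored as a surrogate pair;
            // combine the pair into a single code point.
            if (char.IsHighSurrogate(text[i]) && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                codePoint = char.ConvertToUtf32(text[i], text[i + 1]);
                i++;
            }
            else
            {
                codePoint = text[i];
            }

            if (codePoint < 128)
                sb.Append((char)codePoint);
            else
                sb.Append("&#x").Append(codePoint.ToString("X")).Append(';');
        }
        return sb.ToString();
    }

    static void Main()
    {
        // The three sample characters from the question, written as escapes.
        Console.WriteLine(ToHexEntities("\u07E3 \u0408 \u25D9"));
        // prints "&#x7E3; &#x408; &#x25D9;"

        // For comparison: Encoding.ASCII substitutes '?' for anything non-ASCII.
        byte[] asciiBytes = Encoding.ASCII.GetBytes("d\u00E9j\u00A0");
        Console.WriteLine(Encoding.ASCII.GetString(asciiBytes)); // prints "d?j?"
    }
}
```

Because the text is handled as a .NET string (as StreamReader returns it), the values written are Unicode code points, not the UTF-8 bytes stored in the file.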

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


