See more: C#, VB
Hi guys,
I'm converting an old VB application into C#, and part of the system requires me to convert certain special characters to their ASCII equivalent.
In VB, the code is:
sValue = Asc("œ")  'which gives 156
sValue = Asc("°")  'which gives 176
sValue = Asc("£")  'which gives 163
These are the correct values according to

But when doing the same conversion in C#, the first of these values gives a strange answer.
Here is the code:
As ints:
int i1 = (int)Convert.ToChar("œ");    // which gives 339

int i2 = (int)Convert.ToChar("°");    // which gives 176

int i3 = (int)Convert.ToChar("£");    // which gives 163

As bytes:
byte i1 = (byte)Convert.ToChar("œ");    // which gives 83

byte i2 = (byte)Convert.ToChar("°");    // which gives 176

byte i3 = (byte)Convert.ToChar("£");    // which gives 163

What gives?! :( I suspect it has something to do with the sign bit, but I can't see what.
Many thanks
Posted 27-Dec-12 5:35am
Sergey Alexandrovich Kryukov at 27-Dec-12 22:23pm
Who told you it should be ASCII? ASCII won't work for you... —SA

Solution 3

Richard is right. To get the same bytes in C# as the bytes in VB, use this:
byte i1 = Encoding.Default.GetBytes("œ")[0];
The GetBytes method returns a byte array; indexing it with [0], as in Encoding.Default.GetBytes("œ")[0], gives you the first byte of that array.
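Encoding.Default depends on the machine and the runtime, so here is a minimal sketch that names code page 1252 explicitly instead. The class AscCompat and the helper Asc1252 are made-up names for illustration. The RegisterProvider line assumes a modern runtime (.NET Core 3.0 or later); on the classic .NET Framework it should not be needed and can be dropped.

```csharp
using System;
using System.Text;

static class AscCompat
{
    // Assumption: on .NET Core 3.0+ / .NET 5+ the legacy Windows code pages
    // must be registered before GetEncoding(1252) will succeed.
    static readonly Encoding Cp1252 = CreateCp1252();

    static Encoding CreateCp1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    // Hypothetical helper: the Windows-1252 byte value of a single character.
    public static int Asc1252(char c)
    {
        return Cp1252.GetBytes(new[] { c })[0];
    }

    static void Main()
    {
        // Unicode escapes are used so the result does not depend on how
        // the source file itself is encoded.
        Console.WriteLine(Asc1252('\u0153')); // œ
        Console.WriteLine(Asc1252('\u00B0')); // °
        Console.WriteLine(Asc1252('\u00A3')); // £
    }
}
```

This should print 156, 176 and 163, the same values the VB code in the question reports.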
Hope this helps.
Nick Fisher (Consultant) at 28-Dec-12 5:45am
Yes, this works now. Many thanks. Nick
ProgramFOX at 28-Dec-12 7:45am
You're welcome!

Solution 4

Hello Nick,
What you refer to as being ASCII is *not* ASCII (see[^]).
Only the 7-bit ASCII character encoding is unambiguously given.
There exist several 8-bit extensions to the original 7-bit encoding.
Your page appears to list œ as part of Latin-1. But reading carefully, the page says:
[...] The extended ASCII codes (character code 128-255)
There are several different variations of the 8-bit ASCII table. The table below is according to ISO 8859-1, also called ISO Latin-1. Codes 129-159 contain the Microsoft® Windows Latin-1 extended characters. [...]

Microsoft decided some years ago to "modify" the standard to fit their needs. See [^] or, more specifically, [^].
Standard Latin-1 does *not* contain œ. It is included in Latin-9 (also known as ISO/IEC 8859-15); see also ISO Latin 9 as compared with ISO Latin 1 [^] and [^].
Now, how to solve your issue?
Neither Latin-1 nor Latin-9 gives the values that VB's Asc returns on Windows.
You need Encoding.GetEncoding(1252), which gives the same result as Encoding.Default when the system's ANSI code page is 1252 (as ProgramFOX[^] described in Solution #3).
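To make the difference between the three code pages concrete, here is a small sketch that asks each one for its byte value of œ. The names CodePageComparison, TryEncode and Describe are invented for this example. It uses an exception fallback so that a missing character shows up as "not encodable" instead of being silently substituted. As an assumption, the RegisterProvider call targets .NET Core 3.0 or later; on the classic .NET Framework it should not be required.

```csharp
using System;
using System.Text;

static class CodePageComparison
{
    static CodePageComparison()
    {
        // Assumption: needed on .NET Core 3.0+ / .NET 5+ so that
        // code pages 1252 and 28605 can be resolved.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Returns the single-byte value of c in the given code page,
    // or null if that code page has no slot for the character.
    public static int? TryEncode(int codePage, char c)
    {
        Encoding enc = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            return enc.GetBytes(new[] { c })[0];
        }
        catch (EncoderFallbackException)
        {
            return null;
        }
    }

    static string Describe(int? value)
    {
        return value.HasValue ? value.Value.ToString() : "not encodable";
    }

    static void Main()
    {
        const char oe = '\u0153'; // œ
        Console.WriteLine("ISO 8859-1  (28591): " + Describe(TryEncode(28591, oe)));
        Console.WriteLine("ISO 8859-15 (28605): " + Describe(TryEncode(28605, oe)));
        Console.WriteLine("Windows-1252 (1252): " + Describe(TryEncode(1252, oe)));
    }
}
```

Latin-1 has no slot for œ, Latin-9 puts it at 189 (0xBD), and only Windows-1252 puts it at 156 (0x9C), which is the value VB returned.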
Sergey Alexandrovich Kryukov at 27-Dec-12 22:19pm
Exactly. This is some legacy trash called "extended ASCII". Practically, none of the modern systems support it, for a good reason. Unicode representation of these characters should be used, that's it. My 5. —SA
Andreas Gieriet at 27-Dec-12 22:43pm
Hello Sergey, thanks for your 5! Cheers Andi
Nick Fisher (Consultant) at 28-Dec-12 5:45am
Excellent answer, thanks. Nick
Andreas Gieriet at 28-Dec-12 8:01am
You are welcome! Andi
Espen Harlinn at 28-Dec-12 7:40am
Good guess, a 5 :-D
Andreas Gieriet at 28-Dec-12 8:01am
Hello Espen, thanks for your 5! Cheers Andi

Solution 1

C# uses Unicode rather than ASCII to represent characters and strings.
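A short sketch of what that means for the numbers in the question: a C# char is a UTF-16 code unit, œ is U+0153 (decimal 339), and casting it to byte keeps only the low 8 bits, 0x53, which is 83. So the sign bit is not involved. The names CodePointDemo, CodeUnit and LowByte are made up for illustration.

```csharp
using System;

static class CodePointDemo
{
    // The UTF-16 code unit of a character as an int (what (int)c gives).
    public static int CodeUnit(char c)
    {
        return c;
    }

    // Mimics the (byte) cast from the question: only the low 8 bits survive.
    public static byte LowByte(char c)
    {
        return unchecked((byte)c);
    }

    static void Main()
    {
        // œ, ° and £ written as escapes so the source file encoding does not matter.
        foreach (char c in new[] { '\u0153', '\u00B0', '\u00A3' })
        {
            Console.WriteLine(
                "U+" + CodeUnit(c).ToString("X4") +
                " = " + CodeUnit(c) +
                ", low byte = " + LowByte(c));
        }
    }
}
```

° (U+00B0) and £ (U+00A3) are below 256 and sit at the same positions in Unicode and in code page 1252, which is why only œ looked wrong.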

Solution 2

Use the GetBytes[^] method of the Encoding.ASCII[^] encoding to convert the characters to ASCII.
Best regards
Espen Harlinn
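If you want to check what Encoding.ASCII actually produces for these characters on your own runtime, a minimal sketch follows. AsciiDemo and ToAscii are hypothetical names. No expected output is given for the non-ASCII characters, since the point is to inspect the substitution yourself.

```csharp
using System;
using System.Text;

static class AsciiDemo
{
    // Encodes a single character with the 7-bit ASCII encoding.
    public static byte ToAscii(char c)
    {
        return Encoding.ASCII.GetBytes(new[] { c })[0];
    }

    static void Main()
    {
        // 'A' is plain ASCII; the other three (œ, °, £) are not.
        foreach (char c in new[] { 'A', '\u0153', '\u00B0', '\u00A3' })
        {
            Console.WriteLine("U+" + ((int)c).ToString("X4") + " -> " + ToAscii(c));
        }
    }
}
```

Whatever gets substituted, the result is always a 7-bit value below 128, so this route cannot give back 156, 176 or 163.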
Andreas Gieriet at 27-Dec-12 21:24pm
Hello Espen, this would remove diacritics by mapping the Windows code page 1252 characters to 7-bit ASCII instead of converting to a Unicode encoding. See also Solution #3 and #4. Cheers Andi
Espen Harlinn at 28-Dec-12 7:39am
OP asked for ASCII, repeatedly ... And as you wrote in your answer - you're doing a conversion to code page 1252, which is what OP actually needed, but it wasn't what he asked for.
Andreas Gieriet at 28-Dec-12 8:06am
Hello Espen, I focussed more on his example code and felt that asking for ASCII is wrong... It's interesting, though, that converting to ASCII results in removing diacritics (œ --> o) - that was new to me. Cheers Andi
Sergey Alexandrovich Kryukov at 27-Dec-12 22:21pm
Sorry, but this won't work in this case. You probably answered formally, but did not look at the characters themselves. Please see the correct solution #4 and my comments. (I did not vote this time.) —SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 27 Dec 2012
Copyright © CodeProject, 1999-2014