Hi guys,

I'm converting an old VB application into C#, and part of the system requires me to convert certain special characters to their ASCII equivalent.

In VB, the code is:

sValue = Asc("œ")  'which gives 156

sValue = Asc("°")  'which gives 176

sValue = Asc("£")  'which gives 163

These are the correct values according to

But when doing the same conversion in C#, the first of these values gives a strange answer.

Here is the code:

As ints:

int i1 = (int)Convert.ToChar("œ");    // which gives 339

int i2 = (int)Convert.ToChar("°");    // which gives 176

int i3 = (int)Convert.ToChar("£");    // which gives 163

As bytes:

byte i1 = (byte)Convert.ToChar("œ");    // which gives 83

byte i2 = (byte)Convert.ToChar("°");    // which gives 176

byte i3 = (byte)Convert.ToChar("£");    // which gives 163

What gives?! :( I'm suspecting it's something to do with the sign bit, but I can't see what.

Many thanks
Updated 24-Sep-21 5:08am
Sergey Alexandrovich Kryukov 27-Dec-12 22:23pm
Who told you it should be ASCII? ASCII won't work for you...

Richard is right. To get the same bytes in C# as the bytes in VB, use this:
byte i1 = Encoding.Default.GetBytes("œ")[0];

The GetBytes method returns a byte array, so Encoding.Default.GetBytes("œ")[0] gives you its first byte. Encoding.Default is the system's ANSI code page on the .NET Framework, which is the same code page VB's Asc uses.
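
If you would rather not depend on the machine's default code page, here is a minimal sketch that names code page 1252 explicitly. The class and method names (AscCompat, Asc) are hypothetical, and it assumes the old VB program ran on a Western European Windows installation. The RegisterProvider call is an assumption for .NET Core / .NET 5+, where code page 1252 is not built in; on the .NET Framework it should not be needed.

```csharp
using System;
using System.Text;

static class AscCompat
{
    // Hypothetical helper: mimics VB's Asc() for a system whose ANSI code page is 1252.
    public static int Asc(string s)
    {
        // Assumption: only required on .NET Core / .NET 5+, where 1252 is not built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding cp1252 = Encoding.GetEncoding(1252);

        // Like Asc(), only the first character of the string is considered.
        return cp1252.GetBytes(s)[0];
    }

    static void Main()
    {
        Console.WriteLine(Asc("œ")); // 156
        Console.WriteLine(Asc("°")); // 176
        Console.WriteLine(Asc("£")); // 163
    }
}
```

Naming the code page means the result stays the same even when Encoding.Default is something else on the machine running the code.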

Hope this helps.
Nick Fisher (Consultant) 28-Dec-12 5:45am
Yes, this works now. Many thanks. Nick
Thomas Daniels 28-Dec-12 7:45am
You're welcome!
Deki syahputra 19-Apr-15 21:38pm
Many Thanks Bro.
Hello Nick,

What you refer to as being ASCII is *not* ASCII (see[^]).
Only the 7-bit ASCII character encoding is unambiguously given.

There exist several 8-bit extensions to the original 7-bit encoding.

Your page claims to list œ as being part of Latin-1. But reading carefully, the page says

[...] The extended ASCII codes (character code 128-255)
There are several different variations of the 8-bit ASCII table. The table below is according to ISO 8859-1, also called ISO Latin-1. Codes 129-159 contain the Microsoft® Windows Latin-1 extended characters. [...]

Microsoft decided some years ago to "modify" the standard to fit their needs. See[^] or more specific on[^].

Standard Latin-1 does *not* contain œ. That is included in Latin-9 (also known as ISO/IEC-8859-15), see also ISO Latin 9 as compared with ISO Latin 1[^] and[^].

Now, how to solve your issue?
Neither Latin-1 nor Latin-9 gives the values that VB's Asc returns on Windows.
You need to take Encoding.GetEncoding(1252), which on a Western European Windows installation gives the same result as calling Encoding.Default (as ProgramFOX[^] described in Solution #3).
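
To see how these encodings differ for œ, here is a minimal sketch that asks each one for the byte and reports -1 when the character is not in the encoding at all. The names CodePageComparison and ByteIn are hypothetical. The RegisterProvider call is an assumption for .NET Core / .NET 5+; on the .NET Framework it should not be needed.

```csharp
using System;
using System.Text;

static class CodePageComparison
{
    // Hypothetical helper: returns the single byte for 'c' in the named encoding,
    // or -1 if that encoding has no such character.
    public static int ByteIn(string encodingName, char c)
    {
        // Assumption: only required on .NET Core / .NET 5+ for the non-built-in code pages.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // The exception fallback makes an unmappable character throw
        // instead of being silently replaced by a substitute.
        Encoding enc = Encoding.GetEncoding(
            encodingName,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            return enc.GetBytes(new[] { c })[0];
        }
        catch (EncoderFallbackException)
        {
            return -1;
        }
    }

    static void Main()
    {
        string[] names = { "us-ascii", "iso-8859-1", "iso-8859-15", "windows-1252" };
        foreach (string name in names)
        {
            Console.WriteLine(name + ": " + ByteIn(name, 'œ'));
        }
        // us-ascii: -1
        // iso-8859-1: -1
        // iso-8859-15: 189
        // windows-1252: 156
    }
}
```

Only windows-1252 gives the 156 that VB's Asc reports. Latin-9 does contain œ, but at a different position.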

Sergey Alexandrovich Kryukov 27-Dec-12 22:19pm
Exactly. This is some legacy trash called "extended ASCII". Practically, none of the modern systems support it, for a good reason.
Unicode representation of these characters should be used, that's it.
My 5.
Andreas Gieriet 27-Dec-12 22:43pm
Hello Sergey,
thanks for your 5!
Nick Fisher (Consultant) 28-Dec-12 5:45am
Excellent answer, thanks. Nick
Andreas Gieriet 28-Dec-12 8:01am
You are welcome!
Espen Harlinn 28-Dec-12 7:40am
Good guess, a 5 :-D
Andreas Gieriet 28-Dec-12 8:01am
Hello Espen,
thanks for your 5!
Use the GetBytes[^] method of the Encoding.ASCII[^] encoding to convert the characters to ASCII.

Best regards
Espen Harlinn
Andreas Gieriet 27-Dec-12 21:24pm
Hello Espen,
this would remove diacritics by mapping the windows code page 1252 characters to 7-bit ASCII instead of converting to unicode encoding. See also Solution #3 and #4.
Espen Harlinn 28-Dec-12 7:39am
OP asked for ASCII, repeatedly ...

And as you wrote in your answer - you're doing a conversion to code page 1252, which is what OP actually needed, but it wasn't what he asked for.
Andreas Gieriet 28-Dec-12 8:06am
Hello Espen,
I focussed more on his example code and felt that asking for ASCII is wrong...
It's interesting though, that converting to ASCII results in removing diacritics (œ --> o) - that was new to me.
Sergey Alexandrovich Kryukov 27-Dec-12 22:21pm
Sorry, but won't work in this case. You probably answered formally, but did not look at the characters themselves. Please see the correct solution #4 and my comments.
(I did not vote this time.)
C# represents characters and strings as Unicode (UTF-16) rather than ASCII.
byte i1 = (byte)Convert.ToChar("œ");

The Unicode value of 'œ' is 339 in both cases, whether you then cast it to int or to byte.

A byte in C# is unsigned and can only hold 0-255, so 339 does not fit. By default the cast is unchecked, so the overflow is silently discarded and only the low 8 bits are kept:
Range of byte = 2^8 = 256
In case of overflow: 339 - 256 = 83
So 83 is what ends up stored in the byte.

You can have overflow and underflow detected instead:
byte i1 = checked((byte)Convert.ToChar("œ"));
With checked, the cast throws a System.OverflowException at run time, which you can handle with ordinary exception handling.
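
The two behaviours side by side, as a minimal sketch. The names NarrowingDemo, Truncate and TryNarrow are hypothetical, chosen only for this illustration.

```csharp
using System;

static class NarrowingDemo
{
    // Unchecked cast: keeps only the low 8 bits of the UTF-16 code unit.
    public static byte Truncate(char c)
    {
        return unchecked((byte)c);
    }

    // Checked cast: reports failure instead of silently truncating.
    public static bool TryNarrow(char c, out byte result)
    {
        try
        {
            result = checked((byte)c);
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    static void Main()
    {
        char oe = 'œ';                     // U+0153, decimal 339

        Console.WriteLine((int)oe);        // 339
        Console.WriteLine(Truncate(oe));   // 83, i.e. 339 - 256

        byte b;
        Console.WriteLine(TryNarrow(oe, out b));   // False: 339 does not fit in a byte
        Console.WriteLine(TryNarrow('°', out b));  // True: 176 fits
    }
}
```

Neither variant produces 156: the cast works on the Unicode value and knows nothing about code page 1252.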

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
