See more: C# VB
Hi guys,

I'm converting an old VB application into C#, and part of the system requires me to convert certain special characters to their ASCII equivalent.

In VB, the code is:

 
sValue = Asc("œ")  'which gives 156
 
sValue = Asc("°")  'which gives 176
 
sValue = Asc("£")  'which gives 163
 

These are the correct values according to http://www.ascii-code.com/.


But when doing the same conversion in C#, the first of these values gives a strange answer.

Here is the code:

 
As ints:
 
int i1 = (int)Convert.ToChar("œ");    // which gives 339

int i2 = (int)Convert.ToChar("°");    // which gives 176

int i3 = (int)Convert.ToChar("£");    // which gives 163

 
As bytes:
 
byte i1 = (byte)Convert.ToChar("œ");    // which gives 83

byte i2 = (byte)Convert.ToChar("°");    // which gives 176

byte i3 = (byte)Convert.ToChar("£");    // which gives 163

 

What gives?! :( I suspect it has something to do with the sign bit, but I can't see what.

Many thanks
Posted 27-Dec-12 6:35am
Comments
Sergey Alexandrovich Kryukov at 27-Dec-12 22:23pm
   
Who told you it should be ASCII? ASCII won't work for you...
—SA

Solution 3

Richard is right. To get the same bytes in C# as the bytes in VB, use this:
byte i1 = Encoding.Default.GetBytes("œ")[0];
The GetBytes method returns a byte array; Encoding.Default.GetBytes("œ")[0] gives you the first byte of that array.
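Encoding.Default follows the ANSI code page of the machine the code runs on (and, as far as I know, it is UTF-8 on .NET Core and .NET 5+), so pinning the code page explicitly may be more portable. Below is a minimal sketch of that idea. The names AscCompat and Asc1252 are made up for illustration, and the RegisterProvider call is an assumption that only matters on newer runtimes.

```csharp
using System;
using System.Text;

public static class AscCompat
{
    static AscCompat()
    {
        // Assumption: needed on .NET Core / .NET 5+, where code page 1252 is
        // not available by default. On older .NET Framework versions this
        // type may not exist and the line can simply be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: mimics VB's Asc() on a system whose ANSI
    // code page is Windows-1252.
    public static byte Asc1252(char c)
    {
        Encoding cp1252 = Encoding.GetEncoding(1252);
        return cp1252.GetBytes(new[] { c })[0];
    }

    public static void Main()
    {
        Console.WriteLine(Asc1252('\u0153')); // œ -> 156
        Console.WriteLine(Asc1252('\u00B0')); // ° -> 176
        Console.WriteLine(Asc1252('\u00A3')); // £ -> 163
    }
}
```

The characters are written as \u escapes so the result does not depend on how the source file itself is saved.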

Hope this helps.
Comments
Nick Fisher (Consultant) at 28-Dec-12 5:45am
   
Yes, this works now. Many thanks. Nick
ProgramFOX at 28-Dec-12 7:45am
   
You're welcome!
Deki syahputra at 19-Apr-15 21:38pm
   
Many Thanks Bro.

Solution 4

Hello Nick,

What you refer to as ASCII is *not* ASCII (see http://en.wikipedia.org/wiki/ASCII).
Only the 7-bit ASCII character encoding is unambiguously defined.

There exist several 8-bit extensions to the original 7-bit encoding.

Your page claims to list œ as being part of Latin-1. But reading carefully, the page says:

[...] The extended ASCII codes (character code 128-255)
There are several different variations of the 8-bit ASCII table. The table below is according to ISO 8859-1, also called ISO Latin-1. Codes 129-159 contain the Microsoft® Windows Latin-1 extended characters. [...]


Microsoft decided some years ago to "modify" the standard to fit their needs. See http://www.cs.tut.fi/~jkorpela/chars.html or, more specifically, http://www.cs.tut.fi/~jkorpela/chars.html#win.

Standard Latin-1 does *not* contain œ. It is included in Latin-9 (also known as ISO/IEC 8859-15); see also "ISO Latin 9 as compared with ISO Latin 1" and http://en.wikipedia.org/wiki/ISO/IEC_8859-15.

Now, how to solve your issue?
Neither Latin-1 nor Latin-9 reproduces the value 156 that VB gives you on Windows.
You need Encoding.GetEncoding(1252), which on a system whose ANSI code page is 1252 gives the same result as Encoding.Default (as ProgramFOX described in Solution #3).
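To make the difference between the three code pages concrete, here is a small sketch that encodes œ and decodes byte 156 under each of them. The class and method names are made up for illustration, and the RegisterProvider call is an assumption that only matters on .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

public static class CodePageComparison
{
    static CodePageComparison()
    {
        // Assumption: required on .NET Core / .NET 5+ for windows-1252 and
        // iso-8859-15; can be removed on older .NET Framework versions.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Encodes one character and returns the first resulting byte.
    public static byte Encode(string encodingName, char c)
    {
        return Encoding.GetEncoding(encodingName).GetBytes(new[] { c })[0];
    }

    // Decodes one byte and returns the resulting UTF-16 code unit.
    public static int Decode(string encodingName, byte b)
    {
        return Encoding.GetEncoding(encodingName).GetString(new[] { b })[0];
    }

    public static void Main()
    {
        char oe = '\u0153'; // œ
        Console.WriteLine(Encode("windows-1252", oe)); // 156
        Console.WriteLine(Encode("iso-8859-15", oe));  // 189
        Console.WriteLine(Decode("windows-1252", 156)); // 339, i.e. œ
        Console.WriteLine(Decode("iso-8859-1", 156));   // 156, a control character
    }
}
```

So byte 156 only means œ in code page 1252; Latin-9 puts œ at 189, and Latin-1 has no œ at all.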

Cheers
Andi
Comments
Sergey Alexandrovich Kryukov at 27-Dec-12 22:19pm
   
Exactly. This is some legacy trash called "extended ASCII". Practically, none of the modern systems support it, for a good reason.
Unicode representation of these characters should be used, that's it.
My 5.
—SA
Andreas Gieriet at 27-Dec-12 22:43pm
   
Hello Sergey,
thanks for your 5!
Cheers
Andi
Nick Fisher (Consultant) at 28-Dec-12 5:45am
   
Excellent answer, thanks. Nick
Andreas Gieriet at 28-Dec-12 8:01am
   
You are welcome!
Andi
Espen Harlinn at 28-Dec-12 7:40am
   
Good guess, a 5 :-D
Andreas Gieriet at 28-Dec-12 8:01am
   
Hello Espen,
thanks for your 5!
Cheers
Andi

Solution 2

Use the GetBytes method of the Encoding.ASCII encoding to convert the characters to ASCII.
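For reference, a minimal sketch of what that call looks like. As far as I know, Encoding.ASCII replaces anything outside the 7-bit range with a question mark (byte 63), so it will not reproduce VB's 156 for œ. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class AsciiDemo
{
    // Encodes one character with 7-bit ASCII and returns the first byte.
    public static byte ToAscii(char c)
    {
        return Encoding.ASCII.GetBytes(new[] { c })[0];
    }

    public static void Main()
    {
        Console.WriteLine(ToAscii('A'));      // 65, plain ASCII is unchanged
        Console.WriteLine(ToAscii('\u0153')); // œ -> 63, i.e. '?'
        Console.WriteLine(ToAscii('\u00A3')); // £ -> 63, i.e. '?'
    }
}
```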

Best regards
Espen Harlinn
Comments
Andreas Gieriet at 27-Dec-12 21:24pm
   
Hello Espen,
this would remove diacritics by mapping the windows code page 1252 characters to 7-bit ASCII instead of converting to unicode encoding. See also Solution #3 and #4.
Cheers
Andi
Espen Harlinn at 28-Dec-12 7:39am
   
OP asked for ASCII, repeatedly ...

And as you wrote in your answer - you're doing a conversion to code page 1252, which is what OP actually needed, but it wasn't what he asked for.
Andreas Gieriet at 28-Dec-12 8:06am
   
Hello Espen,
I focussed more on his example code and felt that asking for ASCII is wrong...
It's interesting though, that converting to ASCII results in removing diacritics (œ --> o) - that was new to me.
Cheers
Andi
Sergey Alexandrovich Kryukov at 27-Dec-12 22:21pm
   
Sorry, but this won't work in this case. You probably answered formally, but did not look at the characters themselves. Please see the correct solution #4 and my comments.
(I did not vote this time.)
—SA

Solution 1

C# uses Unicode rather than ASCII to represent characters and strings.
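A quick sketch of what that means in practice, assuming nothing beyond the standard library: a C# char is a 16-bit UTF-16 code unit, so casting it to int gives the Unicode value rather than a code page byte. The class and method names are made up for illustration.

```csharp
using System;

public static class CharDemo
{
    // Returns the UTF-16 code unit of a char as an int.
    public static int CodeUnit(char c)
    {
        return c;
    }

    public static void Main()
    {
        Console.WriteLine(sizeof(char));       // 2 bytes per char
        Console.WriteLine(CodeUnit('\u0153')); // œ -> 339 (U+0153)
        Console.WriteLine(CodeUnit('\u00B0')); // ° -> 176 (U+00B0)
        Console.WriteLine(CodeUnit('\u00A3')); // £ -> 163 (U+00A3)
    }
}
```

° and £ happen to match the VB values because Unicode and code page 1252 agree in the range 160-255; œ sits in the 128-159 range, where they differ.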

Solution 5

byte i1 = (byte)Convert.ToChar("œ");

C# uses Unicode, and the Unicode value of 'œ' is 339 in both cases (byte and int).

A byte in C# is unsigned and holds values from 0 to 255, so it cannot hold 339: the value overflows the range of a byte. By default the cast is unchecked, so no error is raised and only the low 8 bits are kept:
Range of byte = 2^8 = 256 values
In case of overflow: 339 - 256 = 83
So 83 is what ends up stored in the byte.

There is a way to detect overflow and underflow:
byte i1 = checked((byte)Convert.ToChar("œ"));
With checked, you get a runtime exception, System.OverflowException, which you can handle like any other exception.
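The two behaviours can be put side by side in a short sketch. The class and method names are made up for illustration; the casts themselves are the ones discussed above.

```csharp
using System;

public static class OverflowDemo
{
    // Unchecked cast: silently keeps only the low 8 bits of the char.
    public static byte Truncate(char c)
    {
        return unchecked((byte)c);
    }

    // Checked cast: reports whether the char is too large for a byte.
    public static bool Overflows(char c)
    {
        try
        {
            byte b = checked((byte)c);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        char oe = Convert.ToChar("\u0153");  // œ, Unicode value 339
        Console.WriteLine(Truncate(oe));     // 83 (339 - 256)
        Console.WriteLine(Overflows(oe));    // True
        Console.WriteLine(Overflows('\u00A3')); // False, 163 fits in a byte
    }
}
```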

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 23 Mar 2015
Copyright © CodeProject, 1999-2015
All Rights Reserved. Terms of Service