I am trying to add 3 control characters to a string. The following is the code

String ackMessage = splitted[0];
String[] msh = ackMessage.Split('|');
String controlID = msh[9];
StringBuilder sb = new StringBuilder();
sb.Append((byte) 0x0b);
sb.AppendLine(ackMessage);
sb.Append("MSA|AA|" + controlID);
sb.Append((byte) 0x1c);
sb.Append((byte) 0x0d);
string myMessage = sb.ToString();


I want the sb.Append byte calls to append the actual ASCII control characters with those codes. Can someone tell me how to do this in C#?

Thank you.
Comments
Sergey Alexandrovich Kryukov 28-Feb-12 13:55pm    
Is it a problem or a question?
--SA

Truly, if you want to deal with ASCII you are better off not using a string at all - use a byte array instead. That way there are no awkward conversions either to or from Unicode. If you start including ASCII values in your string, they will be converted to the Unicode equivalent first, and that may cause more problems than it fixes.
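
As a minimal sketch of the byte-array approach, the three control bytes from the question can be written around an ASCII-encoded payload without ever placing them in a string. The class name FramedMessage and the method name Wrap are hypothetical, chosen only for illustration; the byte values come from the question's code.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original question.
static class FramedMessage
{
    // Control byte values taken from the question's code.
    const byte StartByte = 0x0b;
    const byte EndByte = 0x1c;
    const byte CarriageReturn = 0x0d;

    // Encodes the payload as ASCII and surrounds it with the control bytes.
    public static byte[] Wrap(string payload)
    {
        byte[] body = Encoding.ASCII.GetBytes(payload);
        using (var stream = new MemoryStream(body.Length + 3))
        {
            stream.WriteByte(StartByte);
            stream.Write(body, 0, body.Length);
            stream.WriteByte(EndByte);
            stream.WriteByte(CarriageReturn);
            return stream.ToArray();
        }
    }
}
```

The resulting array can be written straight to a socket or file stream, so no string-to-byte conversion happens after the control bytes are added.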
 
 
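The one real problem in your code is the cast: sb.Append((byte) 0x0b) appends the decimal text of the number, "11", not the control character. Cast to char instead, for example sb.Append((char) 0x0b). That said, I doubt that you need control bytes in a string at all. If they are bytes, keep them in an array of byte, not a string. Also, you should understand what is going on, to avoid surprises.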
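
One detail of the question's code is worth checking: StringBuilder.Append has an overload for byte that appends the number's decimal text rather than a character, so a char cast is needed if the control characters really must live in a string. The sketch below contrasts the two; the class and method names are hypothetical, and the message layout simply mirrors the question.

```csharp
using System;
using System.Text;

// Hypothetical demo class, for illustration only.
static class ControlCharDemo
{
    // Mirrors the question: Append(byte) writes the value as decimal text.
    public static string WithByteCast()
    {
        var sb = new StringBuilder();
        sb.Append((byte)0x0b);
        return sb.ToString(); // the two characters '1' and '1'
    }

    // Casting to char appends the control character itself.
    public static string WithCharCast(string ackMessage, string controlId)
    {
        var sb = new StringBuilder();
        sb.Append((char)0x0b);
        sb.AppendLine(ackMessage); // adds Environment.NewLine, as in the question
        sb.Append("MSA|AA|" + controlId);
        sb.Append((char)0x1c);
        sb.Append((char)0x0d);
        return sb.ToString();
    }
}
```

The character literals '\u000b', '\u001c' and '\r' would work equally well in place of the casts.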

Characters and strings are not bytes. Internally, they are Unicode characters encoded as UTF-16. This has nothing to do with ASCII as such, but because ASCII is a subset of Unicode, 0x1C is represented in UTF-16 as 0x001C. This encoding represents all characters in the BMP (Basic Multilingual Plane) as single 16-bit words, and the characters beyond the BMP as pairs of 16-bit words called surrogate pairs; the words of each pair come from special ranges of Unicode code points reserved for this purpose. The other UTFs come into play only when you serialize text into arrays of byte using the classes derived from System.Text.Encoding; please see:
http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx.
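
To make the UTF-16 representation concrete, the sketch below lists the 16-bit code units of a string in hex. The class and method names are hypothetical; U+1F600 is just an arbitrary example of a character outside the BMP.

```csharp
using System;
using System.Linq;

// Hypothetical demo class, for illustration only.
static class Utf16Demo
{
    // Returns the UTF-16 code units of a string as space-separated hex words.
    public static string CodeUnits(string s)
    {
        return string.Join(" ", s.Select(c => ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        // A control character from the ASCII range is a single 16-bit word.
        Console.WriteLine(CodeUnits("\u001c")); // 001C

        // A code point beyond the BMP becomes a surrogate pair.
        string beyondBmp = char.ConvertFromUtf32(0x1F600);
        Console.WriteLine(CodeUnits(beyondBmp)); // D83D DE00
        Console.WriteLine(char.IsHighSurrogate(beyondBmp[0])); // True
        Console.WriteLine(char.IsLowSurrogate(beyondBmp[1]));  // True
    }
}
```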

In particular, UTF-8 is a byte-oriented encoding which takes a variable number of bytes to represent a single character. However, if all of your text is composed of characters from the ASCII subset of Unicode, UTF-8 encoding will produce an array of bytes identical to the ASCII encoding, strictly one byte per character. You can use it in your code.
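
The equivalence can be checked directly with System.Text.Encoding. This is a small sketch with hypothetical class and method names; the sample strings are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical demo class, for illustration only.
static class Utf8AsciiDemo
{
    // True when UTF-8 and ASCII encoding yield the same bytes for the text.
    public static bool SameBytesInBothEncodings(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        return utf8.SequenceEqual(ascii);
    }

    public static int Utf8ByteCount(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    public static void Main()
    {
        string asciiOnly = "\u000bMSA|AA|1\u001c\r";
        Console.WriteLine(SameBytesInBothEncodings(asciiOnly));       // True
        Console.WriteLine(Utf8ByteCount(asciiOnly) == asciiOnly.Length); // True

        // U+00E9 is outside ASCII: UTF-8 needs two bytes, ASCII substitutes '?'.
        Console.WriteLine(Utf8ByteCount("\u00e9"));            // 2
        Console.WriteLine(SameBytesInBothEncodings("\u00e9")); // False
    }
}
```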

You should clearly understand that Unicode is not a 16-bit or 32-bit code. It standardizes a one-to-one correspondence between characters, understood as cultural entities regardless of their graphical representation, fonts, and so on, and integers, understood in their abstract mathematical meaning, regardless of their computer representation, bit size, endianness, or anything like that. All the details of computer representation are defined by the UTFs.

Please see:
http://unicode.org/,
http://unicode.org/faq/utf_bom.html.

—SA
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


