Using Character Encoding in ASP.NET

15 Apr 2004
Store your data in an OS or database server that does not support its character set.

Introduction

This sample shows how to store your text intact in an operating system or database that does not support the text's character set.

Background

In web application development, I frequently have to connect to all kinds of old customer databases and operating systems, such as SCO5.05, which does not support UTF-8, GB2312 and other character sets. So storing and retrieving my text data intact has become an important job.

Once, on a project, I needed to put some Chinese words in UTF-8 into a SCO5.05 + Informix7.3 system. But when I checked the database, I found that all the characters had been changed into "->" (in fact \0x7F). I tried many approaches, but none of them solved the problem.

Why?

I found the answer later: the trouble lies in character encoding.

Open the file named web.config in the ASP.NET project. The value of the requestEncoding attribute in the globalization element is "utf-8". It means that the request text is encoded in the UTF-8 character set. Because SCO5.05 does not support UTF-8, the request text was changed.

I got it. Before saving, the text should be encoded into the Western character set (iso8859-1), which SCO5.05 can handle, unlike UTF-8, and then converted back after loading.
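
To see why the detour through ISO8859-1 helps, here is a minimal sketch that contrasts two paths: converting Chinese text directly to ISO8859-1, which cannot represent it, and keeping the UTF-8 bytes while exposing each byte as one ISO8859-1 character. The class and method names (EncodingLossDemo, DirectRoundTrip, WrapUtf8, IsSingleByteOnly) are invented for this illustration and are not part of the article's code.

```csharp
using System;
using System.Text;

public static class EncodingLossDemo
{
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Direct conversion: characters that do not exist in ISO-8859-1 are
    // substituted by the encoder, so the original text cannot be recovered.
    public static string DirectRoundTrip(string text)
    {
        byte[] latin1Bytes = Latin1.GetBytes(text);
        return Latin1.GetString(latin1Bytes);
    }

    // The approach described above: keep the UTF-8 bytes and expose
    // each byte as exactly one ISO-8859-1 character.
    public static string WrapUtf8(string text)
    {
        return Latin1.GetString(Encoding.UTF8.GetBytes(text));
    }

    // True when every character fits into a single byte (0x00-0xFF),
    // which is all that a single-byte system has to store.
    public static bool IsSingleByteOnly(string text)
    {
        foreach (char c in text)
        {
            if (c > 0xFF) return false;
        }
        return true;
    }
}
```

For the two-character string "你好", WrapUtf8 produces six single-byte characters (three UTF-8 bytes per Chinese character), while DirectRoundTrip no longer returns the original text.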

Solution code

For example, to put the message "ÄãºÃPi(\u03a0)", means "Hello Pi(¦°)", into "memo" field of database, use the following code:

C#
// message "Hello pi(¦°)" in Chinese
string unicodeStr = "ÄãºÃPi(\u03a0)";

OdbcConnection conn = new OdbcConnection();
System.Data.IDbCommand cmd = conn.CreateCommand();

conn.ConnectionString = "your connection string";
cmd.Connection = conn;

// Encoding here
cmd.CommandText = "INSERT INTO encoding VALUES ('" 
  + CEncoding.unicode_iso8859(unicodeStr) + "')";
cmd.Connection = conn;
conn.Open();
cmd.ExecuteNonQuery();
conn.Close();

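I used the function unicode_iso8859() above. It takes the UTF-8 bytes of the text and reads them back as ISO8859-1 characters, one character per byte. Its counterpart, iso8859_unicode(), does the reverse.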

C#
public static string unicode_iso8859(string src) {
  Encoding iso = Encoding.GetEncoding("iso8859-1");
  Encoding unicode = Encoding.UTF8;
  byte[] unicodeBytes = unicode.GetBytes(src);
  return iso.GetString(unicodeBytes);
}
C#
public static string iso8859_unicode(string src) {
  Encoding iso = Encoding.GetEncoding("iso8859-1");
  Encoding unicode = Encoding.UTF8;
  byte[] isoBytes = iso.GetBytes(src);
  return unicode.GetString(isoBytes);
}

Query your database and take a look. Has all the text been converted into ISO symbols that you do not recognize?

Then you can convert it back by using the iso8859_unicode() function. Of course, you can convert back with other encodings if you want.
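
The round trip, including the option of using a different encoding, could be sketched as follows. This is only an illustration under stated assumptions: CarrierEncoding, Wrap, Unwrap and RoundTrips are hypothetical names, and the sketch assumes that the database returns every stored byte unchanged.

```csharp
using System;
using System.Text;

public static class CarrierEncoding
{
    // ISO-8859-1 maps every byte value to exactly one character,
    // so it can carry the bytes of any other encoding.
    private static readonly Encoding Carrier = Encoding.GetEncoding("iso-8859-1");

    // Before saving: encode with the inner encoding, expose bytes as ISO-8859-1 text.
    public static string Wrap(string text, Encoding inner)
    {
        return Carrier.GetString(inner.GetBytes(text));
    }

    // After loading: turn the ISO-8859-1 text back into bytes and decode them.
    public static string Unwrap(string stored, Encoding inner)
    {
        return inner.GetString(Carrier.GetBytes(stored));
    }

    // Convenience check: does the text survive a save/load cycle?
    public static bool RoundTrips(string text, Encoding inner)
    {
        return Unwrap(Wrap(text, inner), inner) == text;
    }
}
```

The same inner encoding has to be used in both directions. Text wrapped as UTF-8 and unwrapped with another encoding, for example Encoding.Unicode, does not come back as the original.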

If you are using an adapter and binding a DataSet to a DataGrid, it is easy to convert the data with these two methods, too. But you will pay for it with extra processing time. Whether to use it or not is up to your own judgment.

C#
// requires: using System.Data.Odbc;
OdbcConnection conn = new OdbcConnection("your connection string");
// "your select statement" is a placeholder for the query that loads the data
OdbcDataAdapter adapter = new OdbcDataAdapter("your select statement", conn);
DataSet1 ds = new DataSet1();
DataGrid grid = new DataGrid();

// Fill opens and closes the connection by itself
adapter.Fill(ds);

string xml = ds.GetXml();
ds.Clear();

// decoding here: convert the whole XML text back from ISO8859-1
ds.ReadXml(new System.IO.StringReader(CEncoding.iso8859_unicode(xml)));
grid.DataSource = ds;
grid.DataBind();
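
As an alternative to converting the whole XML text, the string cells of a filled table can be decoded one by one. This is a different technique from the one shown above, given only as a rough sketch: DataTableDecoder and DecodeStringColumns are hypothetical names, and the sketch assumes that every string column was stored as UTF-8 bytes wrapped in ISO8859-1 characters.

```csharp
using System;
using System.Data;
using System.Text;

public static class DataTableDecoder
{
    // Decodes every non-null string cell from "UTF-8 bytes carried as
    // ISO-8859-1 characters" back into ordinary Unicode text.
    public static void DecodeStringColumns(DataTable table)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        foreach (DataRow row in table.Rows)
        {
            foreach (DataColumn column in table.Columns)
            {
                // Leave numbers, dates and NULL values untouched.
                if (column.DataType != typeof(string) || row.IsNull(column))
                {
                    continue;
                }

                string stored = (string)row[column];
                row[column] = Encoding.UTF8.GetString(latin1.GetBytes(stored));
            }
        }

        // Mark the rows as unchanged so that a later adapter update
        // does not treat the decoded text as pending edits.
        table.AcceptChanges();
    }
}
```

Calling DataTableDecoder.DecodeStringColumns(ds.Tables[0]) after adapter.Fill(ds) would replace the GetXml/ReadXml step. Whether it is faster depends on the size and shape of the data.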

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Written By
Team Leader
China

Comments and Discussions

 
General: My vote of 2
shankar.koppella, 7-Apr-10 0:08
General: unicode_iso8859 method returns nulls for some Unicode chars, this works better
maestroboomer, 31-Jan-08 14:29
Answer: Re: unicode_iso8859 method returns nulls for some Unicode chars, this works better
Gokhan Mamaci, 11-May-09 4:54
Question: Problem in displaying Devanagari characters stored in postgres database
Simmi Kapoor, 3-Apr-07 19:13
General: I need help
realnaimi, 26-Aug-06 23:03
Question: Is there other workaround other than using your approach?
kpchan221-Jun-06 1:01
I have a database which is configured with locale en_US.850.
I understand that the characters are not properly stored in the database in the correct encoding format. But with a classic ASP page, Chinese (Big5) input is inserted directly into the database without any conversion (i.e. garbage in, garbage out). So the data can be viewed properly from another classic ASP page, as long as the IE browser is using the BIG5 encoding.

However, the problem came when I changed my application to use ASP.NET. In .NET, it is not garbage in, garbage out. As the data is not properly encoded and stored in the DB, when the front end shows the data, ASP.NET encodes it to UTF-8 by default. That makes the data wrongly encoded. Even if I change the responseEncoding in the web.config file to BIG5, it doesn't work either; again, the data is not properly encoded.

I have considered your solution. But the point is that a huge data set is already stored there which is not iso-8859-1 encoded, and it is not possible to convert it at this moment.

Is there a workaround for this situation, provided that the database locale remains unchanged?


cupsnoodles
Answer: Re: Is there other workaround other than using your approach?
Steven M Hunt, 27-Mar-07 6:26
Question: Chinese characters
mrajanikrishna, 19-May-06 0:07
Question: Re: Chinese characters
Samuel Chen, 19-May-06 21:12
Answer: Re: Chinese characters
mrajanikrishna, 19-May-06 22:12
Answer: Re: Chinese characters
Samuel Chen, 20-May-06 0:24
General: Re: Chinese characters
mrajanikrishna, 21-May-06 15:05
General: Thank you!
quitchat, 5-Apr-06 6:57
General: help
viketo, 21-Mar-05 6:50
General: Re: help
Samuel Chen, 22-Mar-05 16:08
General: Re: help
Anonymous, 23-Mar-05 17:24
General: Re: help
Samuel Chen, 13-Apr-05 17:23
Question: ISO 8859-6
Nisha G, 20-Mar-06 19:23
Answer: Re: ISO 8859-6
Samuel Chen, 22-Mar-06 19:11
General: Great stuff !!!
Yovav, 23-Apr-04 10:09
General: Re: Great stuff !!!
Samuel Chen, 24-Apr-04 6:45
General: Re: Great stuff !!!
Yovav, 24-Apr-04 7:08
General: Re: Great stuff !!!
Samuel Chen, 24-Apr-04 8:26
General: Re: Great stuff !!!
Yovav, 24-Apr-04 9:32
General: Re: Great stuff !!!
Samuel Chen, 24-Apr-04 17:29
