Encoding Accented Characters

22 May 2007, CPOL
Exporting accented characters to plain text files can cause problems. You need to choose an encoding, but which one?

Introduction

Exporting accented characters to plain text files can be a problem: some programs cannot import or correctly display them. You therefore need to choose an encoding when you write the file. However, there are a LOT of encodings, so which one should you use?

Here's How

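The answer is: iso-8859-8.

That is the Hebrew (ISO-Visual) encoding, which the .NET Framework supports natively. Its character set contains no accented Latin letters, so when .NET encodes a character such as é it substitutes the closest unaccented letter (a "best fit" mapping) instead of a question mark. The "Visual" in the name refers to the ordering of Hebrew text, not to this substitution. The result is lossy: the accents are removed, not preserved. The other standard encodings either keep the accented characters or replace them with ?, as you will see below.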
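The conversion described above happens implicitly inside the encoder. For comparison only, here is a minimal sketch of a different technique (not the one this article uses) that removes accents explicitly through Unicode normalization. The class and method names (`AccentStripper`, `RemoveAccents`) are hypothetical and chosen for illustration.

```csharp
using System.Globalization;
using System.Text;

public static class AccentStripper
{
    // Hypothetical helper: decomposes each character into base letter plus
    // combining marks (Form D), drops the combining marks, then recomposes.
    public static string RemoveAccents(string text)
    {
        string decomposed = text.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // NonSpacingMark covers combining accents such as the acute and the cedilla.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Letters that have no canonical decomposition, such as ø or ß, pass through this sketch unchanged, so it is not a drop-in replacement for the encoder-based approach.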

Example

Converting the following: Frédéric François.

Encoding      Description                 Output
ASCII                                     Fr?d?ric Fran?ois
Default                                   Frédéric François
UTF7          Unicode (UTF-7)             Fr+AOk-d+AOk-ric Fran+AOc-ois
UTF8          Unicode (UTF-8)             Frédéric François
iso-8859-1    Western European (ISO)      Frédéric François
iso-8859-8    Hebrew (ISO-Visual)         Frederic Francois
us-ascii      US-ASCII                    Fr?d?ric Fran?ois
Windows-1252  Western European (Windows)  Frédéric François
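Rows like the ones in the table can be checked with a small round-trip helper: encode the string to bytes, then decode those bytes with the same encoding. The sketch below is an assumption about how such a check might look, not the code the author used; `EncodingDemo` and `RoundTrip` are hypothetical names. On .NET Core and later, code-page encodings such as iso-8859-8 are only available after calling `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)`, and the substitution for iso-8859-8 depends on the runtime's mapping tables, so verify the output on your own platform.

```csharp
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: shows what text looks like after being written
    // with the named encoding and read back with that same encoding.
    public static string RoundTrip(string text, string encodingName)
    {
        Encoding encoding = Encoding.GetEncoding(encodingName);
        byte[] bytes = encoding.GetBytes(text);   // unencodable characters are substituted here
        return encoding.GetString(bytes);
    }
}
```

For example, `EncodingDemo.RoundTrip("Frédéric François", "us-ascii")` yields `Fr?d?ric Fran?ois`, matching the us-ascii row, while passing `"utf-8"` returns the string unchanged.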

Example of Code Using Encoding

C#
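// Requires: using System.IO;
// Dispose the writer when done (for example with a using block) so the file is flushed and closed.
StreamWriter sw = new StreamWriter
    ("somefile.txt", false, System.Text.Encoding.GetEncoding("iso-8859-8"));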

A Full Example for the Beginner

C#
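// Requires: using System.IO;
// The using block flushes and closes the file even if an exception is thrown.
using (StreamWriter sw = new StreamWriter
    ("somefile.txt", false, System.Text.Encoding.GetEncoding("iso-8859-8")))
{
    // DataSet1, binsTA and binsRow are typed DataSet classes from the author's project.
    DataSet1TableAdapters.binsTA ta = new DataSet1TableAdapters.binsTA();
    DataSet1.binsDataTable dt = ta.GetData();
    foreach (DataSet1.binsRow row in dt.Rows)
    {
        // Write one pipe-delimited line per row: ID|description
        sw.Write(row.ID.ToString());
        sw.Write("|");
        sw.WriteLine(row.description);
    }
}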
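The example above depends on typed DataSet classes that are not shown. As a self-contained variant, the following sketch writes the same pipe-delimited format from an in-memory list. `PipeExporter`, `Export` and the choice of key/value pairs to stand in for table rows are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;

public static class PipeExporter
{
    // Hypothetical helper: writes one "id|description" line per row
    // using the encoding with the given name.
    public static void Export(string path,
        IEnumerable<KeyValuePair<int, string>> rows, string encodingName)
    {
        Encoding encoding = Encoding.GetEncoding(encodingName);
        using (StreamWriter sw = new StreamWriter(path, false, encoding))
        {
            foreach (KeyValuePair<int, string> row in rows)
            {
                sw.Write(row.Key.ToString(CultureInfo.InvariantCulture));
                sw.Write("|");
                sw.WriteLine(row.Value);
            }
        }
    }
}
```

Passing `"iso-8859-8"` as `encodingName` reproduces the article's approach. Passing the name as a parameter also makes it easy to compare the output of several encodings on the same data.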

History

  • 22nd May, 2007: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior) www.ByBox.com
United Kingdom
C++ and C# Developer for 21 years. Microsoft Certified.

UK Senior software developer / team leader.

I've been writing software since 1985. I pride myself on designing and creating software that is first class. That means it has to be fast, scalable, and with good use of design patterns.

I have done everything from risk analysis and explosion modelling, banking systems, to highly scalable multi-threaded arrival and departure screens in many leading airports, to state of the art wireless warehouse systems.

Comments and Discussions

 
Question: Encoding Accented Characters (String)
Member 10563999, 25-Mar-14 22:39
General: Doesn't work for me [modified]
Lord of Scripts, 27-Aug-08 3:15
I have been trying that in a method to simply convert a string parameter into a "visually correct" string where none of these weird characters appear, but instead into something equivalent (as in the plain Frederique XYZ without accents). However, I don't get the right results.

In particular I wanted to weed out all accented characters used in languages such as Spanish, French, German, Czech, etc.

Perhaps a sample method to convert a "dirty" string to a "visual" string would be a good idea.

http://www.PanamaSights.com/
http://www.coralys.com/
http://www.virtual-aviation.info/

modified on Wednesday, August 27, 2008 11:08 AM

General: Re: Doesn't work for me
seanicongroup, 7-May-09 11:18
