
SMTP Internationalization

Andriy Zolotoiy, 13 Jul 2003
How to send non-English e-mail using .NET.

Introduction

You can find many articles on implementing SMTP in C# on this and other sites. I'm not going to dwell on protocol details here, but rather on the problem of sending e-mail in languages other than English (I'll use Russian in this scenario). English-only e-mail systems use the 7-bit System.Text.Encoding.ASCII encoding when text has to be converted to a sequence of bytes for network transmission. Such applications turn every character outside the ASCII range (anything above 0x7F, which includes the 0x80-0xFF half of an 8-bit code page) into '?', meaning that the character has no representation in the target encoding.
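
To make the '?' substitution concrete, here is a minimal sketch that pushes a mixed English/Russian string through Encoding.ASCII and back. The class and method names (AsciiLossDemo, RoundTripThroughAscii) are made up for illustration and are not part of the article's library.

```csharp
using System;
using System.Text;

public static class AsciiLossDemo
{
    // Encodes text as 7-bit ASCII and decodes it again,
    // returning what a receiver would see after transmission.
    public static string RoundTripThroughAscii(string text)
    {
        byte[] wire = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(wire);
    }
}
```

Passing "Hello, " followed by the six Cyrillic letters of the Russian word for "hi" returns "Hello, ??????": the English part survives, and each Cyrillic letter is replaced by a single question mark.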

Solution

A simple solution to this problem is to use the System.Text.Encoding instance that corresponds to the encoding of the source text. The source character set usually corresponds to the one configured in Control Panel/Regional Settings.

I use Russian as my default language, so all Cyrillic characters appear properly inside text areas and on title bars. There is an easy way to find out which default encoding Windows uses:

System.Text.Encoding sourceEncoding = System.Text.Encoding.Default;

A little test:

Console.WriteLine( "Windows charset: " + sourceEncoding.HeaderName );
Console.WriteLine( "Windows code page: " + sourceEncoding.CodePage );

reveals that we are on the right track:

> Windows charset: Windows-1251

> Windows code page: 1251

Now the e-mail can be properly encoded for transmission. We just need to add the character set identifier to the message header:

text.AppendFormat( "Content-Type: text/plain;\r\n\tcharset=\"{0:G}\"\r\n", 
    sourceEncoding.HeaderName );

where text is a StringBuilder variable containing the resulting text. The message body would be transmitted like this:

byte[] data = sourceEncoding.GetBytes( text.ToString() );
smtpStream.Write( data, 0, data.Length );
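
The header and body steps can be combined into one helper, sketched below under a few assumptions. MessageEncoder and BuildMessage are hypothetical names, not part of the article's library. The MIME-Version and Content-Transfer-Encoding lines are additions that the article does not show; MIME expects them when a body carries 8-bit text. The header is encoded as ASCII and only the body uses the chosen encoding.

```csharp
using System;
using System.IO;
using System.Text;

public static class MessageEncoder
{
    // Builds the bytes for a plain-text message: ASCII headers,
    // a blank separator line, then the body in bodyEncoding.
    public static byte[] BuildMessage(string body, Encoding bodyEncoding)
    {
        StringBuilder header = new StringBuilder();
        // Assumption: these two MIME lines are not in the original article.
        header.Append("MIME-Version: 1.0\r\n");
        header.Append("Content-Transfer-Encoding: 8bit\r\n");
        header.AppendFormat("Content-Type: text/plain;\r\n\tcharset=\"{0}\"\r\n",
            bodyEncoding.HeaderName);
        header.Append("\r\n"); // blank line ends the header section

        byte[] headerBytes = Encoding.ASCII.GetBytes(header.ToString());
        byte[] bodyBytes = bodyEncoding.GetBytes(body);

        using (MemoryStream buffer = new MemoryStream())
        {
            buffer.Write(headerBytes, 0, headerBytes.Length);
            buffer.Write(bodyBytes, 0, bodyBytes.Length);
            return buffer.ToArray();
        }
    }
}
```

The returned array can be written to the SMTP stream in one call, for example smtpStream.Write( bytes, 0, bytes.Length ), where bytes is the result of BuildMessage.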

That would be all, but in the real world not everything is that simple. For historical reasons, Russian-speaking countries use the KOI8-R encoding as the de facto e-mail standard (not everyone uses Windows, so code page 1251 might not be supported on some DOS or UNIX systems). That's why I set my default e-mail encoding in Outlook Express to KOI-8 (Options/Send), so that I can chat with my 'non-Windows' buddies.

Some investigation reveals that this value is also available from the default encoding object:

Console.WriteLine( "E-Mail charset: " + sourceEncoding.BodyName );

> E-Mail charset: koi8-r
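
To look at these names without depending on the machine's regional settings, a small helper can print them for any code page. This is a sketch with hypothetical names (EncodingNames, Describe, GetLegacyEncoding). It assumes .NET 5 or later, where legacy code pages such as 1251 are only available after registering CodePagesEncodingProvider; on the .NET Framework used in the article, the RegisterProvider line is unnecessary and should be dropped.

```csharp
using System;
using System.Text;

public static class EncodingNames
{
    // Formats the names an Encoding reports for mail headers and bodies.
    public static string Describe(Encoding encoding)
    {
        return string.Format(
            "code page {0}: HeaderName={1}, BodyName={2}, WebName={3}",
            encoding.CodePage, encoding.HeaderName,
            encoding.BodyName, encoding.WebName);
    }

    // Looks up a legacy code page by number.
    // Assumption: running on .NET 5+, which needs the provider registered first.
    public static Encoding GetLegacyEncoding(int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }
}
```

For example, Console.WriteLine( EncodingNames.Describe( EncodingNames.GetLegacyEncoding( 1251 ) ) ) prints the header and body names side by side, which makes any difference between them easy to spot.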

Luckily, there is a static method, System.Text.Encoding.Convert(), that converts bytes from one encoding to another. The following snippet must run before the message is sent. Don't forget that the resulting code page is now different, so the 'Content-Type' charset header must refer to sourceEncoding.BodyName.

using System.Text;
/* ............ */
Encoding srcEnc = Encoding.Default; 
Encoding dstEnc;

/* src & dst refer to same object if no intermediate conversion is required */
if( srcEnc.HeaderName.Equals( srcEnc.BodyName ) )
  dstEnc = srcEnc;
else
  dstEnc = Encoding.GetEncoding( srcEnc.BodyName );

/* ............ */
byte[] srcData = srcEnc.GetBytes( messageString );
byte[] dstData;

/* see if we need to convert data */
if( dstEnc != srcEnc )
  dstData = Encoding.Convert( srcEnc, dstEnc, srcData );
else
  dstData = srcData;

/* write encoded data */
smtpStream.Write( dstData, 0, dstData.Length );
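
The same logic can be packaged as two small reusable methods. The sketch below uses hypothetical names (MailTranscoder, ResolveBodyEncoding, Transcode) and differs from the snippet above in two small ways: it compares names case-insensitively and compares code pages instead of object references. On .NET 5 or later, resolving a legacy BodyName such as koi8-r is assumed to require registering CodePagesEncodingProvider beforehand.

```csharp
using System;
using System.Text;

public static class MailTranscoder
{
    // Returns the encoding named by BodyName, or the source itself
    // when the header and body names already agree.
    public static Encoding ResolveBodyEncoding(Encoding source)
    {
        if (string.Equals(source.HeaderName, source.BodyName,
                StringComparison.OrdinalIgnoreCase))
            return source;
        return Encoding.GetEncoding(source.BodyName);
    }

    // Re-encodes bytes from source to destination; returns the input
    // unchanged when both encodings share the same code page.
    public static byte[] Transcode(byte[] data, Encoding source, Encoding destination)
    {
        if (source.CodePage == destination.CodePage)
            return data;
        return Encoding.Convert(source, destination, data);
    }
}
```

A caller would resolve the destination once, transcode the message bytes, and use the destination's name for the charset value in the 'Content-Type' header.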

That's all, folks. The latest version of the SMTP library source code and help file can be found here.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Andriy Zolotoiy
Web Developer
Canada
MSc in Applied Mathematics. Been working in IT for 12 years mainly in C++ on both Windows and UNIX. During last year focused on C# and .NET.

Last Updated 14 Jul 2003
Article Copyright 2003 by Andriy Zolotoiy