
Translation Web Service in C#

18 Apr 2004, 4 min read
C# Web Service to translate text using Babelfish.

Introduction

Most people are now aware that most of the world does not speak English, even though this is easy to forget when surfing the web: Google gives English results to people searching in English, so you conveniently miss all the pages in German, French, Italian and so on.

The popularity of Altavista's famous Babelfish service is therefore hardly surprising - converting text or web pages into other languages is a useful thing to do.

For a while, anyone looking to integrate translation into their app would simply have had to plug in the Babelfish WSDL. Posters to newsgroups were directed to the free service from xmethods, a good source for a variety of web services (SMS, etc.). In fact, the Babelfish WSDL is the 9th hit on Google for WSDL.

So I plugged it into my apps, intranet, extranet and anything else that vaguely looked like it would benefit from a translation service. And life was good.

But one day the service stopped working, apparently for good. So I had to write a replacement. And here it is.

Code

This is a pretty simple job, and can be broken down into the following subtasks:

  1. Get text for translation and encode it into an HTTP POST request
  2. Send the data to the web server, acting in effect as a .NET web browser
  3. Read the response back into a big string
  4. Remove all the HTML and formatting and send the raw translated string back to the client.

So fire up Visual Studio .NET, create an ASP.NET Web Service named Translation, and add a Translate.asmx file. There are two inputs: the translation mode (e.g., French to English), and the data to be translated (e.g., 'the quick brown fox jumps over the lazy dog'). To make it a plug-in replacement for the old service, I gave my method the same name and parameters as the old one:

C#
[WebMethod]
public string BabelFish(string translationmode, string sourcedata) 
{
}

The translation modes can be found in the source of the page at Babelfish:

C#
readonly string[] VALIDTRANSLATIONMODES = new string[] 
 {"en_zh", "en_fr", "en_de", "en_it", "en_ja", "en_ko", "en_pt", "en_es", 
 "zh_en", "fr_en", "fr_de", "de_en", "de_fr", "it_en", "ja_en", "ko_en", 
 "pt_en", "ru_en", "es_en"};
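
A check against this list might look something like the sketch below. The class and method names (TranslationModes, IsValid) are made up for illustration and are not taken from the download; the sketch also assumes that modes must match exactly after trimming.

```csharp
using System;

public static class TranslationModes
{
    // Same mode list as shown above.
    private static readonly string[] ValidModes =
    {
        "en_zh", "en_fr", "en_de", "en_it", "en_ja", "en_ko", "en_pt", "en_es",
        "zh_en", "fr_en", "fr_de", "de_en", "de_fr", "it_en", "ja_en", "ko_en",
        "pt_en", "ru_en", "es_en"
    };

    // Hypothetical helper: true only for an exact, case-sensitive match
    // against the list (after trimming surrounding whitespace).
    public static bool IsValid(string mode)
    {
        if (string.IsNullOrEmpty(mode))
        {
            return false;
        }
        return Array.IndexOf(ValidModes, mode.Trim()) >= 0;
    }
}
```

Calling TranslationModes.IsValid(translationmode) at the top of the web method would let it return an error string before any HTTP traffic is generated.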

The code performs validation to check for a valid mode before passing it on to Babelfish. After that, we create a POST request. The syntax for an HTTP POST request looks something like this:

POST /babelfish/tr/ HTTP/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 51

lp=en_fr&tt=urltext&intl=1&doit=done&urltext=cheese
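
To see where the Content-Length of 51 comes from, here is a small sketch that builds the same form body and counts its bytes. The PostBody class and its method names are hypothetical. It uses Uri.EscapeDataString rather than HttpUtility.UrlEncode so that it compiles without a System.Web reference; the two differ in places (for example, spaces become %20 rather than +), so treat this as an illustration only.

```csharp
using System;
using System.Text;

public static class PostBody
{
    // Hypothetical helper: builds the form body shown in the request above.
    public static string Build(string translationMode, string sourceData)
    {
        return "lp=" + translationMode +
               "&tt=urltext&intl=1&doit=done&urltext=" +
               Uri.EscapeDataString(sourceData);
    }

    // Content-Length counts the bytes sent on the wire,
    // not the number of characters in the string.
    public static int ContentLength(string body)
    {
        return Encoding.UTF8.GetByteCount(body);
    }
}
```

For the mode en_fr and the text cheese, Build returns exactly the body line shown above, and ContentLength reports 51 for it.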

It's pretty simple, and if you wanted to, you could use low-level sockets to write the data to the server. The .NET Framework provides a better way, however: the HttpWebRequest class, which has lots of built-in features that make it easy to work with HTTP connections.

C#
Uri uri = new Uri(BABELFISHURL);
HttpWebRequest request = (HttpWebRequest) WebRequest.Create(uri);
request.Referer = BABELFISHREFERER;
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)";
// URL-encode the source data and build the form body
string postsourcedata = "lp=" + translationmode + 
    "&tt=urltext&intl=1&doit=done&urltext=" + 
    HttpUtility.UrlEncode(sourcedata);
// Content-Length is a byte count, so take it from the encoded bytes
byte[] bytes = Encoding.UTF8.GetBytes(postsourcedata);
request.ContentLength = bytes.Length;
// Write the body, closing the request stream when done
using (Stream writeStream = request.GetRequestStream())
{
    writeStream.Write(bytes, 0, bytes.Length);
}
// Read the whole response page into one string
string page;
using (HttpWebResponse response = (HttpWebResponse) request.GetResponse())
using (StreamReader readStream = 
    new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    page = readStream.ReadToEnd();
}

We end up with a string containing the entire Babelfish page. As it stands, this is about 99% noise (HTML tags, Altavista information, etc.), and 1% the translation we were looking for. So we need a regular expression to find the translated text. By looking at the HTML page, you will find the translation is contained between:

HTML
<Div style=padding:10px; lang=fr>translation here</div>

So the required regular expression looks like this (note: while testing my regular expressions, I got lots of help from Regulator):

<Div style=padding:10px; lang=..>((?:.|\n)*?)</div>

This will match the whole <div>...</div> string. The expression looks complex, but the key point is that the . character matches any character except a newline. Hence the (.|\n) pattern, which matches any character, newlines included.

The brackets create a matching group, meaning that the text within the brackets (namely the translation) will be put in its own group at index 1 (index 0 contains the whole match).

The ?: prefix suppresses grouping. () normally creates a matching group, but the inner brackets are only there to allow for line breaks in long translations, so we do not want them to capture anything.

Finally, *? is a lazy quantifier, matching as few characters as possible and so stopping at the first instance of </div>. (If I had used plain *, the expression would be greedy, and would chomp right up to the LAST </div>.)
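
The difference between the lazy and greedy forms is easy to try out. The sketch below is not part of the download; the DivPatternDemo class and its method names are invented for illustration, and it runs both variants against whatever HTML string you pass in.

```csharp
using System.Text.RegularExpressions;

public static class DivPatternDemo
{
    private const string Prefix = "<Div style=padding:10px; lang=..>";

    // Lazy (*?): group 1 stops at the first </div> after the opening tag.
    public static string Lazy(string html)
    {
        Match m = Regex.Match(html, Prefix + @"((?:.|\n)*?)</div>");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Greedy (*): group 1 runs on to the last </div> in the input.
    public static string Greedy(string html)
    {
        Match m = Regex.Match(html, Prefix + @"((?:.|\n)*)</div>");
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Given a page fragment such as "<Div style=padding:10px; lang=fr>le renard\nbrun</div><div>footer</div>", Lazy returns only the two-line translation, while Greedy also swallows the first </div> and the footer markup that follows it.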

Here's the code:

C#
Regex reg = new Regex(@"<Div style=padding:10px; lang=..>((?:.|\n)*?)</div>");
MatchCollection matches = reg.Matches(page);
if (matches.Count != 1 || matches[0].Groups.Count != 2) 
{
    return ERRORSTRINGSTART + "The HTML returned from Babelfish " + 
        "appears to have changed. Please check for" + 
        " an updated regular expression" + 
        ERRORSTRINGEND;
}
return matches[0].Groups[1].Value;

And subject to error checking, that's it!

Using it

Download the code, and unzip it somewhere. Add a virtual directory called Translation in IIS. Go to /translate.asmx and click Test, and enter some test data (say 'en_fr', and 'cheese'). If it works, you are ready to use it in your web and Windows Forms applications.

To use it in your app, add a Web Reference pointing at the asmx to the project you want to use it in. Visual Studio will create a proxy class for you, which you can then use to perform translation.

Here's some sample code-behind:

C#
namespace test
{
    using System;
    using System.Data;
    using System.Drawing;
    using System.Web;
    using System.Web.UI.WebControls;
    using localhost1; // assuming that's the reference generated
    using System.Web.UI.HtmlControls;

    /// <summary>
    ///     Summary description for WebUserControl1.
    /// </summary>
    public class WebUserControl1 : System.Web.UI.UserControl
    {
        protected System.Web.UI.WebControls.DropDownList ddTranslationMode;
        protected System.Web.UI.WebControls.TextBox txtText;
        protected System.Web.UI.WebControls.Label lblTranslation;
        protected System.Web.UI.WebControls.Button submitButton;

        private void Page_Load(object sender, System.EventArgs e)
        {
            // Put user code to initialize the page here
        }

        protected void submitButton_Click(object sender, System.EventArgs e) 
        {
            string translationMode = 
                this.ddTranslationMode.SelectedItem.Value;
            string translationText = this.txtText.Text.Trim();
            string translation = "";
            try 
            {
                Translate tr = new Translate();
                translation = tr.BabelFish(translationMode,translationText);
            }
            catch (Exception exp) 
            {
                translation = "There was an error accessing the server: " 
                                                             + exp.Message;
            }
            this.lblTranslation.Text = translation;
        }
    }
}

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



Written By
United Kingdom

Comments and Discussions

 
VB version for both Babelfish and Google
TonyDebenport, 10-Jan-06 5:56

Imports System.Web.Services
Imports System
Imports System.Collections
Imports System.ComponentModel
Imports System.Data
Imports System.Diagnostics
Imports System.Web
Imports System.Net
Imports System.Text
Imports System.Text.RegularExpressions
Imports System.IO

<WebService(Namespace:="http://www.redgreenyellowbluegreenpinkpurplewhite.com/Translation/")> _
Public Class Translate
Inherits System.Web.Services.WebService

Public Sub New()
MyBase.New()

'This call is required by the Web Services Designer.
InitializeComponent()

'Add your own initialization code after the InitializeComponent() call

End Sub

'Required by the Web Services Designer
Private components As System.ComponentModel.IContainer

'NOTE: The following procedure is required by the Web Services Designer
'It can be modified using the Web Services Designer.
'Do not modify it using the code editor.
<System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()
components = New System.ComponentModel.Container
End Sub

Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)
'CODEGEN: This procedure is required by the Web Services Designer
'Do not modify it using the code editor.
If disposing Then
If Not (components Is Nothing) Then
components.Dispose()
End If
End If
MyBase.Dispose(disposing)
End Sub


Private Const ERRORSTRINGSTART As String = "<font color=red>"

Private Const ERRORSTRINGEND As String = "</font>"

<WebMethod(Description:="zh_en=Chinese-simp to English, zt_en=Chinese-trad to English, en_zh=English to Chinese-simp, " & _
" en_zt=English to Chinese-trad, en_nl=English to Dutch, en_fr=English to French, en_de=English to German, en_el=English to Greek," & _
" en_it=English to Italian, en_ja=English to Japanese, en_ko=English to Korean, en_pt=English to Portuguese, en_ru=English to Russian," & _
" en_es=English to Spanish, nl_en=Dutch to English, nl_fr=Dutch to French,fr_nl=French to Dutch, fr_en=French to English, fr_de=French to German, fr_el=French to Greek," & _
" fr_it=French to Italian, fr_pt=French to Portuguese, fr_es=French to Spanish, de_en=German to English, de_fr=German to French, el_en=Greek to English, el_fr=Greek to French, it_en=Italian to English," & _
" it_fr=Italian to French, ja_en=Japanese to English, ko_en=Korean to English, pt_en=Portuguese to English, pt_fr=Portuguese to French, ru_en=Russian to English, es_en=Spanish to English, es_fr=Spanish to French")> _
Public Function BabelFish(ByVal translationmode As String, ByVal sourcedata As String) As String
Dim VALIDTRANSLATIONMODES() As String = New String() {"zh_en", "zt_en", "en_zh", "en_zt", "en_nl", "en_fr", "en_de", "en_el", "en_it", "en_ja", "en_ko", "en_pt", "en_ru", "en_es", "nl_en", "nl_fr", "fr_en", "fr_de", "fr_el", "fr_it", "fr_pt", "fr_nl", "fr_es", "de_en", "de_fr", "el_en", "el_fr", "it_en", "it_fr", "ja_en", "ko_en", "pt_en", "pt_fr", "ru_en", "es_en", "es_fr"}
Dim SITEURL As String = "http://babelfish.altavista.com/babelfish/tr"
Dim SITEREFERER As String = "http://babelfish.altavista.com/"

Try
' validate and remove trailing spaces
If ((translationmode = Nothing) _
OrElse (translationmode.Length = 0)) Then
Throw New ArgumentNullException("translationmode")
End If
If ((sourcedata = Nothing) _
OrElse (sourcedata.Length = 0)) Then
Throw New ArgumentNullException("sourcedata")
End If
translationmode = translationmode.Trim
sourcedata = sourcedata.Trim
' check for valid translationmodes
Dim validtranslationmode As Boolean = False
Dim i As Integer = 0
Do While (i < VALIDTRANSLATIONMODES.Length)
If (VALIDTRANSLATIONMODES(i) = translationmode) Then
validtranslationmode = True
End If
i = (i + 1)
Loop
If Not validtranslationmode Then
Return (ERRORSTRINGSTART + ("ERROR1:The translation mode specified was not a valid translation mode" + ERRORSTRINGEND))
End If
Dim uri As Uri = New Uri(SITEURL)
Dim request As HttpWebRequest = CType(WebRequest.Create(uri), HttpWebRequest)
request.Referer = SITEREFERER
' Encode all the sourcedata
Dim postsourcedata As String
postsourcedata = ("lp=" _
+ (translationmode + ("&tt=urltext&intl=1&doit=done&trtext=" + HttpUtility.UrlEncode(sourcedata))))
request.Method = "POST"
request.ContentType = "application/x-www-form-urlencoded"
request.ContentLength = postsourcedata.Length
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

Dim writeStream As Stream = request.GetRequestStream
Dim encoding As UTF8Encoding = New UTF8Encoding
Dim bytes() As Byte = encoding.GetBytes(postsourcedata)
writeStream.Write(bytes, 0, bytes.Length)
writeStream.Close()
Dim response As HttpWebResponse = CType(request.GetResponse, HttpWebResponse)
Dim responseStream As Stream = response.GetResponseStream
Dim readStream As StreamReader = New StreamReader(responseStream, encoding.UTF8)
Dim page As String = readStream.ReadToEnd
Dim reg As Regex = New Regex("<div style=padding:10px;>((?:.|\n)*?)</div>", RegexOptions.IgnoreCase)
Dim matches As MatchCollection = reg.Matches(page)
If ((matches.Count <> 1) _
OrElse (matches(0).Groups.Count <> 2)) Then
Return (ERRORSTRINGSTART + ("ERROR2:The HTML returned from Babelfish appears to have changed. " & _
"Please check for an updated regular expression" + ERRORSTRINGEND))
End If
Return matches(0).Groups(1).Value
Catch ex As ArgumentNullException
Return (ERRORSTRINGSTART _
+ (ex.Message + ERRORSTRINGEND))
Catch ex As ArgumentException
Return (ERRORSTRINGSTART _
+ (ex.Message + ERRORSTRINGEND))
Catch ex As WebException
Return (ERRORSTRINGSTART + ("ERROR3:There was a problem connecting to the Babelfish server: " + ex.Message + ERRORSTRINGEND))
Catch ex As System.Security.SecurityException
Return (ERRORSTRINGSTART + ("ERROR4:You do not have permission to make HTTP connections. " & _
"Please check your assembly's permission settings" + ERRORSTRINGEND))
Catch ex As Exception
Return (ERRORSTRINGSTART + ("ERROR5:An unspecified error occurred: " _
+ (ex.Message + ERRORSTRINGEND)))
End Try
End Function

<WebMethod(Description:="en|de=English to German, en|es=English to Spanish, en|fr=English to French, en|it=English to Italian, en|pt=English to Portuguese, en|ja=English to Japanese BETA, en|ko=English to Korean BETA, en|zh-CN=English to Chinese(Simplified) BETA, de|en=German to English, de|fr=German to French, es|en=Spanish to English, fr|en=French to English, fr|de=French to German, it|en=Italian to English, pt|en=Portuguese to English, ja|en=Japanese to English BETA, ko|en=Korean to English BETA, zh-CN|en=Chinese (Simplified) to English BETA")> _
Public Function GoogleTranslate(ByVal translationmode As String, ByVal sourcedata As String) As String
Dim VALIDTRANSLATIONMODES() As String = New String() {"en|de", "en|es", "en|fr", "en|it", "en|pt", "en|ja", "en|ko", "en|zh-CN", "de|en", "de|fr", "es|en", "fr|en", "fr|de", "it|en", "pt|en", "ja|en", "ko|en", "zh-CN|en"}
Dim SITEURL As String = "http://translate.google.com/translate_t"
Dim SITEREFERER As String = "http://translate.google.com/"

Try
' validate and remove trailing spaces
If ((translationmode = Nothing) _
OrElse (translationmode.Length = 0)) Then
Throw New ArgumentNullException("translationmode")
End If
If ((sourcedata = Nothing) _
OrElse (sourcedata.Length = 0)) Then
Throw New ArgumentNullException("sourcedata")
End If
translationmode = translationmode.Trim
sourcedata = sourcedata.Trim
' check for valid translationmodes
Dim validtranslationmode As Boolean = False
Dim i As Integer = 0
Do While (i < VALIDTRANSLATIONMODES.Length)
If (VALIDTRANSLATIONMODES(i) = translationmode) Then
validtranslationmode = True
End If
i = (i + 1)
Loop
If Not validtranslationmode Then
Return (ERRORSTRINGSTART + ("ERROR1:The translation mode specified was not a valid translation mode" + ERRORSTRINGEND))
End If
Dim uri As Uri = New Uri(SITEURL)
Dim request As HttpWebRequest = CType(WebRequest.Create(uri), HttpWebRequest)
request.Referer = SITEREFERER
' Encode all the sourcedata
Dim postsourcedata As String
postsourcedata = ("langpair=" _
+ (translationmode + ("&text=" + HttpUtility.UrlEncode(sourcedata))))
request.Method = "POST"
request.ContentType = "application/x-www-form-urlencoded"
request.ContentLength = postsourcedata.Length
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

Dim writeStream As Stream = request.GetRequestStream
Dim encoding As UTF8Encoding = New UTF8Encoding
Dim bytes() As Byte = encoding.GetBytes(postsourcedata)
writeStream.Write(bytes, 0, bytes.Length)
writeStream.Close()
Dim response As HttpWebResponse = CType(request.GetResponse, HttpWebResponse)
Dim responseStream As Stream = response.GetResponseStream
Dim readStream As StreamReader = New StreamReader(responseStream, encoding.UTF8)
Dim page As String = readStream.ReadToEnd
'<textarea name=q rows=5 cols=45 wrap=PHYSICAL>maison</textarea>
Dim reg As Regex = New Regex("<textarea name=q rows=5 cols=45 wrap=PHYSICAL>((?:.|\n)*?)</textarea>", RegexOptions.IgnoreCase)
Dim matches As MatchCollection = reg.Matches(page)
If ((matches.Count <> 1) _
OrElse (matches(0).Groups.Count <> 2)) Then
Return (ERRORSTRINGSTART + ("ERROR2:The HTML returned from Google appears to have changed. " & _
"Please check for an updated regular expression" + ERRORSTRINGEND))
End If
Return matches(0).Groups(1).Value
Catch ex As ArgumentNullException
Return (ERRORSTRINGSTART _
+ (ex.Message + ERRORSTRINGEND))
Catch ex As ArgumentException
Return (ERRORSTRINGSTART _
+ (ex.Message + ERRORSTRINGEND))
Catch ex As WebException
Return (ERRORSTRINGSTART + ("ERROR3:There was a problem connecting to the Google server: " + ex.Message + ERRORSTRINGEND))
Catch ex As System.Security.SecurityException
Return (ERRORSTRINGSTART + ("ERROR4:You do not have permission to make HTTP connections. " & _
"Please check your assembly's permission settings" + ERRORSTRINGEND))
Catch ex As Exception
Return (ERRORSTRINGSTART + ("ERROR5:An unspecified error occurred: " _
+ (ex.Message + ERRORSTRINGEND)))
End Try
End Function

End Class
