Have you created a site and now wish to take it global by adding multilingual capabilities? Are you pondering how to develop a site in several languages? If so, this article is for you. I will discuss several options for writing sites that contain many languages, and give the pros and cons of each option, focusing on ease of use, ease of programming, and performance impact.
This guide is written by me, an experienced old-school ASP (Active Server Pages) programmer. I believe the article will bear fruit even for the ASP.NET programmers out there, as they face the same fundamental problems when developing large-scale projects. You should know your code well in order to understand the implications of adding multilingual capabilities to it; I usually write my web projects, large and small, using Microsoft® Notepad®, so I know my code rather well. You should too!
Using the code
I don't supply a full archive of source code; I am supplying a general notion of how to prepare multi-language support. The code below illustrates the general idea and should not be copied verbatim into your own web site; think it over and implement it in your site, if I have convinced you.
Global expansion and the problem of multilingual support
I had the pleasure of being the lead programmer at Bono Pie LTD, Israel, in 1999. I was only 19 years old at the time; as these were the pioneering years of web services, I was as inexperienced as almost everyone else in the field of large-scale web development. Nevertheless, and probably contrary to what you have imagined so far, Bono Pie LTD enjoyed lasting success with about 20 employees, about four of whom were programmers on my team.
The site was programmed in ASP using VBScript and had plenty of flashy HTML, CSS, Flash® files and other ornaments. Our database was Microsoft® SQL® Server, after migrating from Microsoft® Access®, and it consumed a lot of DBA hours, many of them my own. We worked hard every day to keep up with the pace of the business, creating new features and fixing bugs.
Seeing in advance that support for other languages would soon be needed, we started working on how to migrate our site. It wasn't an easy chore, as the language of our site was Hebrew (the language used in Israel), in which text is written right-to-left (RTL). I will spare you the thorough description of how to implement a site that supports both LTR (left-to-right) and RTL, as I assume most of you will not bump into it. So, our problem is reduced to "How do I create the pages of a web site to support different languages?"
First, let's examine what should be done. A code snippet from my site might look like this:
We've only gone out together three times,
and already you're telling me you want to be friends?
To support multiple languages, I must somehow integrate the same text in several languages. A somewhat peculiar solution might be:
<% if MyLanguage="English" then%>
We've only gone out together three times,
and already you're telling me you want to be friends?
<% elseif MyLanguage="Spanish" then%>
¿Hemos salido solamente juntos tres veces,
y ya me estás diciendo que quieres ser amigos?
<% end if%>
However, embedding the texts like this makes it hard for ordinary translators to translate the site; I would have to sit with them to make sure they don't break my code, so this solution is error-prone. Instead, I want a design that gathers all the text that needs translating in a single place. Moreover, it is easy to see that human judgment must be involved, as some parts, the text, need to be translated, while others, the HTML or ASP, must not be.
I knew I had to add this ability while keeping my freedom to change the code arbitrarily. Therefore, a solution such as copying the site and manually changing each copy for a new language would be disastrous, as I would no longer be able to maintain my code. So, we knew that we needed to have only one source of code. However, if we have only one source of code, how will it render itself in several languages? An easy but somewhat faulty solution is to embed each language-specific string as an ASP directive, most notably as a function call with a unique ID identifying the specific string; this method is straightforward for new-age developers as it resembles a String Table. For example, the page would call a lookup function with the string's ID wherever the literal text used to be.
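As a minimal sketch of this string-table idea, and not the code the site actually used, the lookup could be a small function such as the hypothetical GetText below. The in-memory dictionary, the function name, and the sample strings are all assumptions made for illustration; a real site would more likely read the strings from a database.

```vbscript
' Hypothetical string table held in memory; keys follow an
' "ID::Language" convention. A real site would load these rows
' from a database table instead.
Dim StringTable
Set StringTable = CreateObject("Scripting.Dictionary")
StringTable.Add "FamousQuotationPage_TitleHeader::English", "Famous quotations"
StringTable.Add "FamousQuotationPage_TitleHeader::Spanish", "Citas famosas"

' Returns the text stored for StringID in the given language.
Function GetText(StringID, Language)
    Dim Key
    Key = StringID & "::" & Language
    If StringTable.Exists(Key) Then
        GetText = StringTable(Key)
    Else
        ' Make a missing translation visible instead of failing silently.
        GetText = "[" & Key & "]"
    End If
End Function
```

A page would then write the result of GetText("FamousQuotationPage_TitleHeader", MyLanguage) wherever the literal title used to be.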
Although this mechanism superficially resembles my final solution, it is quite different. The problem with the proposed method is that not all language-specific content can carry such directives; we cannot add them to HTML pages or to images. If we were to add such a command to an HTML file, it would be displayed to the users as-is and would not help us achieve multilingual support. A different approach for supporting HTML files is to physically split the languages into several HTML files; however, this approach is closely related to my first suggestion, and in addition it places new obstacles, as the code now has to take the language into account when picking the right HTML file. Everything just said is also true for image files (PNG, JPG, GIF, etc.). Someone might say a satisfying solution is simply to change the HTML files to ASP files; however, this has implications for performance and for the structure of the site. So, we need a more general answer.
In order to support text in all file types without affecting the site, a different approach had to be taken. Instead of integrating the text into the file, we put the text marker into one file, henceforth "The Code", and the actual text into a different file, henceforth "The Result"; viz. the code contains a marker for text and the result files contain the text for each language. To achieve this, we created a small ASP site, henceforth "The Engine", which generates the result files dynamically. This way, when we want to change anything in our site we edit the code and simply regenerate the result files. Sure, our solution is not as automatic as the aforementioned one; however, it does support all textual file types and was even expanded to support other file types, such as image and Macromedia® Flash® files.
The proposed solution
First, we needed to put a specific mark inside the code pages so that our engine would know where to put the translated text. A pretty good notation would be something like:
<GetLinguisticText ID="FamousQuotationPage_TitleHeader" />
Using such a notation in the code files leaves room for other attributes to carry future options. Our engine will look for the <GetLinguisticText> tag inside all code pages and will replace each occurrence with the proper text when constructing the result files. A straightforward engine code can be:
Sub GenerateResultFileFromCodeFile(CodeFileName, ResultLanguage)
    ' ...
    Set CodeFile = FSO.OpenTextFile(GetCodeDir() + CodeFileName, 1)
    ' ...
    If IndexOfTag > 1 Then
        ' ...
        If Left(CodeText, Len(IdAttributeStart)) <> IdAttributeStart Then
            Err.Description = "Invalid input encountered while parsing code page to translate (" + CodeFileName + ")"
            Err.Source = "Translator"
            ' ...
    ... + CodeFileName, 2, True, -2)
    ' ...
It took me about half an hour to write this code in Notepad® and about a minute and a half to debug it, over three iterations. All you need to do now is to create the three missing functions, among them GetCodeDir() and GetLinguisticText(StringID, ResultLanguage); the latter should probably go to the String Table inside your database.
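To illustrate the replacement step on its own, here is a minimal sketch that works on a string instead of on files. It is not the original routine: it uses the VBScript RegExp object instead of manual string scanning, TranslateCodeText is a hypothetical name, and GetLinguisticText is backed by a small in-memory dictionary standing in for the database String Table.

```vbscript
' Hypothetical stand-in for the database-backed String Table.
Dim Translations
Set Translations = CreateObject("Scripting.Dictionary")
Translations.Add "FamousQuotationPage_TitleHeader::English", "Famous quotations"
Translations.Add "FamousQuotationPage_TitleHeader::Spanish", "Citas famosas"

Function GetLinguisticText(StringID, ResultLanguage)
    Dim Key
    Key = StringID & "::" & ResultLanguage
    If Not Translations.Exists(Key) Then
        Err.Raise vbObjectError + 1, "Translator", "No text for " & Key
    End If
    GetLinguisticText = Translations(Key)
End Function

' Replaces every <GetLinguisticText ID="..." /> marker in CodeText
' with the text for ResultLanguage and returns the result.
Function TranslateCodeText(CodeText, ResultLanguage)
    Dim TagPattern, Matches, Match, Result, NextPos
    Set TagPattern = New RegExp
    TagPattern.Pattern = "<GetLinguisticText\s+ID=""([^""]+)""\s*/>"
    TagPattern.Global = True
    TagPattern.IgnoreCase = True

    Result = ""
    NextPos = 1  ' 1-based position of the first character not yet copied
    Set Matches = TagPattern.Execute(CodeText)
    For Each Match In Matches
        ' Match.FirstIndex is 0-based; copy the text that precedes the tag.
        Result = Result & Mid(CodeText, NextPos, Match.FirstIndex + 1 - NextPos)
        Result = Result & GetLinguisticText(Match.SubMatches(0), ResultLanguage)
        NextPos = Match.FirstIndex + Match.Length + 1
    Next
    ' Copy whatever follows the last tag.
    TranslateCodeText = Result & Mid(CodeText, NextPos)
End Function
```

A file-based routine would read the code page with FSO.OpenTextFile in mode 1, pass its contents through such a function, and write the returned text to the result file opened in mode 2.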
Now you can actually rebuild your site using the new mechanism. This is where it gets tedious: you need to manually insert all your text into its destined String Table and replace it in the pages with the <GetLinguisticText> tag carrying its corresponding ID.
Once you have finished laboring, you can recreate your site by calling GenerateResultFileFromCodeFile for all your pages. You could use a table to list all your pages; another table to hold all the languages will serve as well. I also created a table listing all the TextIDs and, for every language, a table of its translations holding key and value columns. Feel free to express yourself.
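The regeneration pass can be sketched as a pair of nested loops. The page and language lists below are hypothetical and hard-coded, where a real site would read them from tables, and GenerateResultFileFromCodeFile is replaced by a stand-in that only records its calls so that the sketch runs on its own.

```vbscript
' Stand-in for the real routine: it only records which result
' files would be generated, one "Language/Page" entry per line.
Dim GenerationLog
GenerationLog = ""

Sub GenerateResultFileFromCodeFile(CodeFileName, ResultLanguage)
    GenerationLog = GenerationLog & ResultLanguage & "/" & CodeFileName & vbCrLf
End Sub

' Generates every page once for every language.
Sub RegenerateSite(Pages, Languages)
    Dim Page, Language
    For Each Language In Languages
        For Each Page In Pages
            GenerateResultFileFromCodeFile Page, Language
        Next
    Next
End Sub

' Hypothetical page and language lists.
RegenerateSite Array("default.asp", "quotes.asp"), Array("English", "Spanish")
```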
Easing the translation process
Of course you can ask a translator to translate texts directly inside your database, but don't expect too much of him or her. You will probably do better if you display the text as it appears in a different language, or several if possible, next to the one being translated. I remember I created a form with the ID, the text in Hebrew, and the text in the destination language; the translator was only able to change the content of the language being translated to, and Previous and Next buttons were supplied for stepping through the different IDs. So, to translate the site to a new language, Japanese in my case, a translator works through numerous texts using a single-click routine.
Some of the text that is being displayed in web sites is actually fragmented in its code form, for example:
Hello <%= UserName%>, thank you for coming.
With the translation mechanism so far, the programmer will have to do something like this:
<GetLinguisticText ID="MainPage_GreetingBeforeUserName" />
<%= UserName%><GetLinguisticText ID="MainPage_GreetingAfterUserName" />
MainPage_GreetingBeforeUserName::English = "Hello "
MainPage_GreetingBeforeUserName::Spanish = "Hola "
MainPage_GreetingAfterUserName::English = ", thank you for coming."
MainPage_GreetingAfterUserName::Spanish = ", gracias por venir."
While this seems rather good, each fragment is translated out of context, and the sentence tends to break when moved from one language to another. As our engine creates other pages, even ASP pages, we can put the small piece of ASP code inside the translation unit itself, viz.:
<GetLinguisticText ID="MainPage_Greeting" />
MainPage_Greeting::English = "Hello <%= UserName%>, thank you for coming."
MainPage_Greeting::Spanish = "Hola <%= UserName%>, gracias por venir."
However, if the code is rather long and has many ASP directives, it is advisable not to include it. But feel free to add symbols (a smiley, a trademark sign), HTML code (a <br>, an <hr>) and the like if they are contained in the text and are easy to follow. Even so, a good addition I made was to connect each TextID with its page and show a preview of that page to the translator; viz. when translating a given TextID, the translator has a frame taking half the screen that shows the page containing the text being translated; this mitigates the context problem. In addition, grouping all the content of a specific page together is effective, as it shortens the translation time.
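Once translation units may contain ASP directives, one possible safeguard, not part of the original engine as far as this article tells, is to verify that a translation still contains every directive found in the source text. A minimal sketch, assuming hypothetical helper names and simple output directives that contain no line breaks:

```vbscript
' Returns the expressions of all inline ASP output directives
' found in Text, each followed by a line feed.
Function ExtractDirectives(Text)
    Dim Re, M, List
    Set Re = New RegExp
    ' The pattern is assembled from pieces because, inside an ASP page,
    ' a literal percent sign followed by ">" would end the script block.
    Re.Pattern = "<" & "%=\s*([\s\S]*?)\s*%" & ">"
    Re.Global = True
    List = ""
    For Each M In Re.Execute(Text)
        List = List & M.SubMatches(0) & vbLf
    Next
    ExtractDirectives = List
End Function

' True when every directive of SourceText also appears in
' TranslatedText; the order of the directives may differ.
Function TranslationKeepsDirectives(SourceText, TranslatedText)
    Dim Directive, Found
    Found = vbLf & ExtractDirectives(TranslatedText)
    TranslationKeepsDirectives = True
    For Each Directive In Split(ExtractDirectives(SourceText), vbLf)
        If Directive <> "" Then
            If InStr(Found, vbLf & Directive & vbLf) = 0 Then
                TranslationKeepsDirectives = False
            End If
        End If
    Next
End Function
```

The translation form could run such a check before saving and warn the translator when a directive such as the UserName one has been altered or dropped.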
I only vaguely remember what exactly we did regarding other files, mainly image files. I remember our site had a prototype image in our mother tongue, Hebrew, and the translators had to upload the same image in their own language. I even remember we put our source images (Photoshop® PSD files, for instance) on the same back-office site; so adding an images table with the image source file and the names of the pictures in all the different languages will suffice.