Write Bangla Unicode in Bijoy rules without installing Bijoy software

1 Jun 2012, CPOL, 4 min read
Write Bangla Unicode using Bijoy rules without installing the Bijoy software


Introduction 

Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. Unicode input is the insertion of a specific Unicode character on a computer by a user.

Unicode characters can be entered in two ways: by selecting the character from an on-screen applet, or by typing a certain key sequence on the keyboard. Many systems provide support for Unicode input in some form.

A Unicode input system should provide a large repertoire of characters, ideally all valid Unicode code points. This is different from a keyboard layout, which defines keys and their combinations only for a limited number of characters appropriate for a certain locale.



Bangla Vowels 

Bengali vowel letters are called shoroborno (স্বরবর্ণ). They represent six of the seven main vowel sounds of Bengali, along with two vowel diphthongs. All of them are used in both Bengali and Assamese.

 

Bangla Consonants 

Consonant letters are called benjonborno (ব্যঞ্জনবর্ণ) in Bengali. The name of each letter is typically just the consonant sound plus the inherent vowel.

More information about Bangla vowels and consonants is available here.



Bangla Unicode Writing System  

Suppose I want to write: আমি বাংলাদেশকে ভালবাসি, আপনি? Following the Bijoy typing order, where a pre-vowel sign is keyed before its consonant, the keys are pressed in this sequence:

অ া ি ম ব া ং ল া ে দ শ ে ক ভ া ল ব া ি স , অ া প ি ন ? 

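If the characters are stored in the order they were typed, the result shows as: আিম বাংলােদশেক ভালবািস, আপিন ? (By Unicode rules this is correct: the text is rendered in the order in which it is stored.)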



To get the expected result in Unicode, the characters have to be entered in this order:

অ া ম ি ব া ং ল া দ ে শ ক ে ভ া ল ব া স ি , অ া প ন ি ?

That order is not how Bijoy users are used to typing.
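The difference between the two key orders above can be sketched as a small conversion function. This is only an illustration under stated assumptions, not the article's implementation: `bijoyToLogical` and `PRE_VOWELS` are hypothetical names, the pre-vowel set is limited to the three signs named later in the article, and conjuncts are ignored.

```javascript
// Pre-vowel signs that Bijoy users type BEFORE the consonant.
// Assumed set: the three signs named in the article (ি, ে, ৈ).
const PRE_VOWELS = ['\u09BF', '\u09C7', '\u09C8'];

// Hypothetical helper: swap each pre-vowel with the character typed after it,
// so that the stored order becomes consonant first, vowel sign second.
function bijoyToLogical(typed) {
    const chars = Array.from(typed);
    const out = [];
    for (let i = 0; i < chars.length; i++) {
        const next = chars[i + 1];
        if (PRE_VOWELS.includes(chars[i]) && next !== undefined && next !== ' ') {
            out.push(next, chars[i]); // consonant first, then the vowel sign
            i++;                      // the consonant has been consumed
        } else {
            out.push(chars[i]);
        }
    }
    return out.join('');
}

// Typed in Bijoy order: অ া ি ম  (i-kar before ম)
const typed = '\u0985\u09BE\u09BF\u09AE';
console.log(bijoyToLogical(typed) === '\u0985\u09BE\u09AE\u09BF'); // true
```

The sketch converts a finished string in one pass. The code analysed later in the article works differently: it rearranges the text field on every key press.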

Unicode joins Bengali vowels (স্বরবর্ণ shoroborno) and Bengali consonants (ব্যঞ্জনবর্ণ benjonborno) according to the following rule:

Consonant (ব্যঞ্জনবর্ণ) + vowel (স্বরবর্ণ)

অ + া = আ

প + ি = পি

স + ে = সে

Unicode attaches each vowel sign to the consonant stored before it. That is why text typed in Bijoy order, with the vowel sign keyed first, produces a different result.
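The first rule above (অ + া = আ) can be sketched as a plain string replacement. This is a minimal sketch, assuming the input method wants the single code point আ (U+0986) wherever অ (U+0985) is directly followed by া (U+09BE). `composeIndependentAa` is a hypothetical name, not a function from the article's source.

```javascript
// Hypothetical helper: replace the typed pair অ + া with the single letter আ.
function composeIndependentAa(text) {
    return text.split('\u0985\u09BE').join('\u0986');
}

// অ া ম ি  ->  আ ম ি
console.log(composeIndependentAa('\u0985\u09BE\u09AE\u09BF') === '\u0986\u09AE\u09BF'); // true
```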

Unicode chart for Bangla 

Bangla has its own range in Unicode, 0x0980-0x09FF, and every Bangla character is assigned a code point within that range.

For example:

অ = U+0985

আ = U+0986

ক = U+0995

খ = U+0996, and so on. The full details are in the Unicode code chart for Bengali.
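The code points listed above can be checked directly in JavaScript. This is a small sketch; `isBengali` and `toCodePointLabel` are illustrative names and are not part of the article's source code.

```javascript
const BENGALI_START = 0x0980;
const BENGALI_END = 0x09FF;

// True if the first character of ch lies in the Bengali block.
function isBengali(ch) {
    const cp = ch.codePointAt(0);
    return cp >= BENGALI_START && cp <= BENGALI_END;
}

// Format the first character of ch as "U+XXXX".
function toCodePointLabel(ch) {
    return 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
}

console.log(toCodePointLabel('\u0985')); // U+0985 (অ)
console.log(isBengali('\u0995'));        // true  (ক)
console.log(isBengali('k'));             // false
```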




Code Analysis  

Several arrays store the Unicode characters together with the key codes that produce them.

JavaScript
// Keys and the Bangla characters they produce
unijoy['j'] = '\u0995';  // ko (ক)
unijoy['d'] = '\u09BF';  // hrossho i-kar (ি)
unijoy['gd'] = '\u0987'; // hrossho i (ই)
unijoy['D'] = '\u09C0';  // dirgho i-kar (ী)
unijoy['gD'] = '\u0988'; // dirgho i (ঈ)

  

Another array stores the vowel signs that are written before the consonant. These are hrossho i-kar (`ি-কার), e-kar (`ে-কার), and oi-kar (`ৈ-কার).

JavaScript
hr_post_array[0] = 'f'; // e kar
hr_post_array[1] = 'D'; // oi kar
hr_post_array[2] = 'X'; // hrossho i kar
hr_post_array[3] = 'Z'; // hrossho i kar

  



When a key is pressed, it is first looked up in the unijoy[] array, which maps keys to Unicode characters and returns the matching character. If that character is a vowel sign, it is then joined to a consonant according to its type.

JavaScript
// Check for a Bangla pre-vowel.
// Returns 1 if p_char is listed in rashed_swap_array, otherwise 0.
function Check_Pre_Character_Exist(p_char) {
    for (var p_char_index = 0; p_char_index < rashed_swap_array.length; p_char_index++) {
        if (rashed_swap_array[p_char_index].toString() == p_char) {
            return 1;
        }
    }
    return 0;
}
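The lookup-then-classify step described above can be sketched as follows. This is an illustration under assumptions rather than the article's code. The `unijoy` entries are the five shown earlier. `resolveKey` and `PRE_VOWEL_SIGNS` are hypothetical names. Identifying pre-vowels by their Unicode sign is an assumption, since the contents of `rashed_swap_array` are not shown in the article.

```javascript
// Subset of the unijoy table shown earlier; keys are keyboard characters.
var unijoy = {
    'j': '\u0995',  // ক
    'd': '\u09BF',  // ি (hrossho i-kar)
    'gd': '\u0987', // ই
    'D': '\u09C0',  // ী
    'gD': '\u0988'  // ঈ
};

// Assumption: pre-vowels are recognised by their Unicode sign (ি, ে, ৈ).
var PRE_VOWEL_SIGNS = ['\u09BF', '\u09C7', '\u09C8'];

function hasKey(key) {
    return Object.prototype.hasOwnProperty.call(unijoy, key);
}

// Hypothetical helper: resolve a key press, preferring a two-key entry
// such as 'gd' when the previous key makes one available.
function resolveKey(previousKey, key) {
    var combined = previousKey + key;
    var usedCombo = previousKey !== '' && hasKey(combined);
    var lookup = usedCombo ? combined : key;
    if (!hasKey(lookup)) {
        return null; // key is not mapped in this subset
    }
    var text = unijoy[lookup];
    return {
        text: text,
        usedCombo: usedCombo,
        isPreVowel: PRE_VOWEL_SIGNS.indexOf(text) !== -1
    };
}

console.log(resolveKey('', 'd').isPreVowel);         // true
console.log(resolveKey('g', 'd').text === '\u0987'); // true
console.log(resolveKey('', 'q'));                    // null
```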

The problem arises when a pre-vowel (a vowel sign written before its consonant) is typed: by its nature it joins with the previous consonant instead. So I use the '`' character as a placeholder to keep the vowel sign apart from that consonant, like: হ`ি.

  var hr_post_array = new Array(); 

JavaScript
if (Check_Pre_Character_Exist(get_unicode_to_character_rashed(myValue)) == 1) {
    // if the user moves the cursor and types e-kar in front of a character
    if (myField.value.substring(startPos, startPos + 1) == ' '
        || myField.value.substring(startPos, startPos + 1) == '') {
        myField.value = myField.value.substring(0, startPos)
            + '`'
            + myValue
            + myField.value.substring(endPos, myField.value.length);
    }
    else {
        myField.value = myField.value.substring(0, startPos + 1)
            + myValue
            + myField.value.substring(startPos + 1);
    }
}
JavaScript
else if (Check_Pre_Character_Exist(carry_rashed.substring(0, 1))) {
    if (myField.value.substring(startPos - 2, startPos - 1) == '`') {
        // drop the '`' placeholder and put the new character before the pre-vowel
        myField.value = myField.value.substring(0, startPos - 2)
            + myValue
            + myField.value.substring(startPos - 1, startPos)
            + myField.value.substring(endPos, myField.value.length);
    }
    else {
        myField.value = myField.value.substring(0, startPos - 1)
            + myValue
            + myField.value.substring(startPos - 1, startPos)
            + myField.value.substring(endPos, myField.value.length);
    }
}

That means the next consonant has to be joined with the pending pre-vowel, for example hrossho i-kar (`ি-কার). Before a consonant is added, the code checks which character already comes before it. When the consonant finds a pre-vowel (e-kar, hrossho i-kar, oi-kar) there, it swaps positions with it to produce the expected result. Suppose I now add 'ক'; the logical order becomes: হ`ি + ক = হক`ি = হকি

 

The other pre-vowels are joined in the same way.
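The placeholder swap for the হ`ি + ক example can be sketched as a pure string function. This is a simplified illustration, not the article's code: `insertAfterPreVowel` is a hypothetical name, and it assumes the cursor is at the end of the text and that the last character is the pending pre-vowel.

```javascript
var PLACEHOLDER = '`';

// Hypothetical helper: text ends with a pre-vowel (optionally preceded by the
// '`' placeholder); the new consonant is placed before that vowel sign.
function insertAfterPreVowel(text, consonant) {
    var vowel = text.slice(-1);          // the pre-vowel typed last
    var beforeVowel = text.slice(0, -1); // everything in front of it
    if (beforeVowel.slice(-1) === PLACEHOLDER) {
        beforeVowel = beforeVowel.slice(0, -1); // drop the placeholder
    }
    return beforeVowel + consonant + vowel;
}

// হ ` ি + ক  ->  হ ক ি
console.log(insertAfterPreVowel('\u09B9`\u09BF', '\u0995') === '\u09B9\u0995\u09BF'); // true
```

Working on a plain string keeps the swap easy to test. The article's version performs the same rearrangement on `myField.value` around the cursor position.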

Combined Joining 

The Bangla language has conjunct characters that are formed by joining two or more characters, like: ক্ত, গ্গ, জ্ক, স্ত, etc. Conjuncts are joined with the 'g' key: pressing g returns the Bangla hasanta (্) from the unijoy array, and typing it between two characters joins them. When a pre-vowel is used with a conjunct, the sequence has to be maintained. For example, to write পক্তি the user types: প + `ি + ক + ্ + ত

The logical order becomes: প + ক + ্ + ত + `ি

 

JavaScript
if (myField.value.substring(startPos - 1, startPos) != ' ') {
    if (carry_rashed.substring(1, 2) == 'g') {
        myField.value = myField.value.substring(0, startPos)
            + myValue
            + myField.value.substring(endPos, myField.value.length);
    }
    else {
        if (carry_rashed.substring(0, 1) == 'g') {
            if (Check_Pre_Character_Exist(get_unicode_to_character_rashed(myField.value.substring(startPos - 2, startPos - 1))) == 1) {
                myField.value = myField.value.substring(0, startPos - 2)
                    + myField.value.substring(startPos - 1, startPos)
                    + myValue
                    + myField.value.substring(startPos - 2, startPos - 1)
                    + myField.value.substring(endPos, myField.value.length);
            }
            else {
                myField.value = myField.value.substring(0, startPos)
                    + myValue
                    + myField.value.substring(endPos, myField.value.length);
            }
        }
        else {
            myField.value = myField.value.substring(0, startPos)
                + myValue
                + myField.value.substring(endPos, myField.value.length);
        }
    }
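The conjunct case handled by the `startPos - 2` branch above can be sketched on a plain string. This is a sketch under assumptions: `appendToConjunct` is a hypothetical name, the pre-vowel set is the three signs named in the article, and the text before the cursor is assumed to already hold consonant, pre-vowel and hasanta in that order (প ক ি ্), which is the state that branch appears to expect.

```javascript
var HASANTA = '\u09CD'; // ্, produced by the 'g' key
var PRE_VOWEL_SIGNS = ['\u09BF', '\u09C7', '\u09C8']; // ি, ে, ৈ (assumed set)

// Hypothetical helper: `text` is everything before the cursor and
// `consonant` is the character being typed.
function appendToConjunct(text, consonant) {
    var last = text.slice(-1);
    var beforeLast = text.slice(-2, -1);
    if (last === HASANTA && PRE_VOWEL_SIGNS.indexOf(beforeLast) !== -1) {
        // ...C V H + X  becomes  ...C H X V
        return text.slice(0, -2) + HASANTA + consonant + beforeLast;
    }
    return text + consonant; // no pending pre-vowel: plain append
}

// প ক ি ্ + ত  ->  প ক ্ ত ি
var result = appendToConjunct('\u09AA\u0995\u09BF\u09CD', '\u09A4');
console.log(result === '\u09AA\u0995\u09CD\u09A4\u09BF'); // true
```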
 

Some conjuncts are made with "Shift + A", which returns reph (র্). Suppose we want to write "কর্ন"; in that case the key sequence is: ক ন র্. If we want to add ে to this word, we type in this sequence: ক ে ন র্, but to get the expected result the logical sequence should be: ক ন র্ ে

 

JavaScript
else if (char_e == "A") {
        newChar = unijoy['v'] + '\u09CD' + lastInserted;
        var value_field = "";
 
        var startPos = myField.selectionStart;
 
        var last_fifth_one = myField.value.substring(startPos - 5, startPos - 4);
        var last_forth_one = myField.value.substring(startPos - 4, startPos - 3);
        var last_third_one = myField.value.substring(startPos - 3, startPos - 2);
        var last_second_one = myField.value.substring(startPos - 2, startPos - 1);
        var last_one = myField.value.substring(startPos - 1, startPos);
 
 
        //
        // if character[last] is character: gorto, 
        //
        if (Check_Character_Exist_orNot(get_unicode_to_character_rashed(last_one)) == 0) {
 
            // if last previous one is g then "ref" add with (g-1) located character
            if (get_unicode_to_character_rashed(last_second_one) == 'g') {
                newChar = unijoy['v'] + '\u09CD' + last_third_one + last_second_one + last_one;
                myField.value = myField.value.substring(0, startPos - 4) + myField.value.substring(startPos - 4, startPos - 3) + newChar;
            }
            else {
                newChar = unijoy['v'] + '\u09CD' + last_one;
                myField.value = myField.value.substring(0, startPos - 2) + myField.value.substring(startPos - 2, startPos - 1) + newChar;
            }
        }
        else {
            if (Check_Character_Exist_orNot(get_unicode_to_character_rashed(last_second_one)) == 1) {
                if (get_unicode_to_character_rashed(last_forth_one) == 'g') {
                    newChar = unijoy['v'] + '\u09CD'+ last_fifth_one+ last_forth_one + last_third_one + last_second_one + last_one;
                    myField.value = myField.value.substring(0, startPos - 6) + myField.value.substring(startPos - 6, startPos - 5) + newChar;
                  
                }
                else {
                    newChar = unijoy['v'] + '\u09CD' + last_third_one + last_second_one + last_one;
                    myField.value = myField.value.substring(0, startPos - 4) + myField.value.substring(startPos - 4, startPos - 3) + newChar;
                }
            }
            else {
                // alert(last_forth_one + ' ' + last_third_one + ' ' + last_second_one + ' ' + last_one);
                //example: gorde
                if (get_unicode_to_character_rashed(last_third_one) == 'g') {
                    newChar = unijoy['v'] + '\u09CD'+last_forth_one + last_third_one + last_second_one + last_one;
                    myField.value = myField.value.substring(0, startPos - 5) + myField.value.substring(startPos - 5, startPos - 4) + newChar;
                }
                else {
                    newChar = unijoy['v'] + '\u09CD' + last_second_one + last_one;
                    myField.value = myField.value.substring(0, startPos - 3) + myField.value.substring(startPos - 3, startPos - 2) + newChar;
                }
            }
 
        }
       
        return false;
    } 
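The handler above builds the reph as `unijoy['v'] + '\u09CD'` and places it in front of the characters it belongs to, using fixed offsets for each case. The same idea can be sketched by walking backwards from the end of the text. This is a simplified illustration, not the article's algorithm: `insertReph`, `isConsonant` and `isVowelSign` are hypothetical names, `unijoy['v']` is assumed to be র (U+09B0), and the cursor is assumed to be at the end of the text.

```javascript
var RA = '\u09B0';      // র (assumed to be what unijoy['v'] holds)
var HASANTA = '\u09CD'; // ্

// Main consonant range ক..হ; ড়, ঢ়, য় and ৎ are ignored in this sketch.
function isConsonant(ch) {
    var cp = ch.charCodeAt(0);
    return cp >= 0x0995 && cp <= 0x09B9;
}

// Dependent vowel signs in the range া..ৌ.
function isVowelSign(ch) {
    var cp = ch.charCodeAt(0);
    return cp >= 0x09BE && cp <= 0x09CC;
}

// Hypothetical helper: insert র + ্ in front of the last consonant cluster.
function insertReph(text) {
    var i = text.length - 1;
    // Step back over one trailing vowel sign, if any.
    if (i >= 0 && isVowelSign(text.charAt(i))) {
        i--;
    }
    if (i < 0 || !isConsonant(text.charAt(i))) {
        return text; // nothing to attach the reph to
    }
    // Step back over a conjunct: consonant + hasanta pairs.
    while (i >= 2 && text.charAt(i - 1) === HASANTA && isConsonant(text.charAt(i - 2))) {
        i -= 2;
    }
    return text.slice(0, i) + RA + HASANTA + text.slice(i);
}

// ক ন + reph  ->  ক র ্ ন
console.log(insertReph('\u0995\u09A8') === '\u0995\u09B0\u09CD\u09A8'); // true
```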
 

Importance 


Bangla Unicode with Bijoy rules is very important for Bangladeshi users, because most official work and other Bangla writing is typed with Bijoy rules. Bangla Unicode exists, but without Bijoy rules these users cannot take advantage of the Unicode system.


History

02-June-2012


 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


