Localization in ASP.NET Core 1.0: Pluralization Syntax

Hisham Abdullah Bin Ateya

6 Mar 2016 | CPOL | 4 min read

Pluralization in ASP.NET Core 1.0 Localization

Introduction

In the last article we saw the main extensibility points in ASP.NET localization and how ASP.NET makes it easy for developers to extend the underlying APIs.

Today I want to talk about pluralization. The current implementation of ASP.NET localization does not support pluralization yet, but I have already suggested it in the localization repository here. Pluralization is not an easy feature to implement, and it needs a good design.

Background

Pluralization is a complex problem because different languages have widely varying rules for it. English is one of the simplest cases: it has only two forms, one singular and one plural, which makes pluralization easy to implement for English.

Let us have an example:

1 apple, 2 apples and 100 apples

As noted above, "1 apple" is the singular form, while the others are plural forms. Some of us will say that pluralization is easy to implement, but wait a minute and have a look at the plural forms link or this link. You may be surprised that some languages have far more complex pluralization rules, such as Arabic, which is my mother tongue :)

Using the code

It's code time. Let us simplify the entire process of pluralization. There is no single solution to this problem, and if you look at other programming languages and frameworks you will find different flavors, so let us see what I can come up with.

First of all, let us implement simple pluralization for English. As mentioned before, English has two forms, so it's easy to write a function that gives us the proper one.

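// Extension methods must be declared in a static class.
public static string Plural(this IStringLocalizer localizer, bool isPlural, string name, params object[] arguments)
{
    // The resource value holds both forms, e.g. "{0} apple|{0} apples".
    string value = localizer[name, arguments];
    // Index 0 is the singular form, index 1 the plural form.
    int index = (isPlural ? 1 : 0);
    return value.Split('|')[index];
}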

I assume that the value of the key in the resource file uses "|" to separate the singular and plural forms; the same convention applies to the examples below.

In the above example I extend the IStringLocalizer interface with a new extension method named Plural, which gives us the right form. The resource file will look like this:

XML

<data name="apple" xml:space="preserve">
    <value>{0} apple|{0} apples</value>
</data>

After that we can simply call T.Plural(false, "apple", 1) to get the singular form and T.Plural(true, "apple", 2) to get the plural form. The trailing number is the count that fills the {0} placeholder.

Now let us dig into more realistic code, because there are many languages other than English.

In the following sections I will look at two ways to implement pluralization:
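To see what the extension method does to the resource value without wiring up a localizer, the same steps can be reproduced on a plain string. This is only a minimal sketch: PluralDemo and Pick are hypothetical names, and string.Format stands in for the formatting that the localizer indexer is assumed to perform.

```csharp
using System;

public static class PluralDemo
{
    // Formats the whole pipe-separated value first, then picks one form,
    // mirroring the order of operations in the Plural extension method.
    public static string Pick(string resourceValue, bool isPlural, params object[] arguments)
    {
        string formatted = string.Format(resourceValue, arguments);
        return formatted.Split('|')[isPlural ? 1 : 0];
    }
}
```

With the resource value "{0} apple|{0} apples", PluralDemo.Pick(value, false, 1) yields "1 apple" and PluralDemo.Pick(value, true, 2) yields "2 apples".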

1- Implicit

In this approach the pluralization rules are implicit; all the magic happens behind the scenes. A gettext PO file, for example, looks like this:

msgid ""
msgstr ""
"Project-Id-Version: Space9\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2014-10-24 00:49+0200\n"
"PO-Revision-Date: 2014-10-24 00:49+0200\n"
"Last-Translator: Anastis Sourgoutsidis <anastis@cssigniter.com>\n"
"Language-Team: CSSIgniter LLC <info@cssigniter.com>\n"
"Language: el\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Poedit-SourceCharset: UTF-8\n"
"X-Poedit-KeywordsList: __;_e;__ngettext:1,2;_n:1,2;__ngettext_noop:1,2;"
"_n_noop:1,2;_c,_nc:4c,1,2;_x:1,2c;_nx:4c,1,2;_nx_noop:4c,1,2;_ex:1,2c;"
"esc_attr__;esc_attr_e;esc_attr_x:1,2c;esc_html__;esc_html_e;esc_html_x:1,2c\n"
"X-Poedit-Basepath: .\n"
"X-Textdomain-Support: yes\n"
"X-Generator: Poedit 1.6.10\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
"X-Poedit-SearchPath-0: .\n"
"X-Poedit-SearchPath-1: ..\n"

msgid "%s apple"
msgid_plural "%s apples"
msgstr[0] ""
msgstr[1] ""

There are some interesting lines:

"Plural-Forms: nplurals=2; plural=(n != 1);\n" defines the plural rule: there are two forms, and the second one is used whenever n is not 1, which is the rule English follows.

msgid "%s apple"
msgid_plural "%s apples"
msgstr[0] ""
msgstr[1] ""

which define the singular and plural keys and their values, of which there are two in this case.

When I was thinking about how to implement this, I asked myself: should I have n keys per language? The answer is that it depends, but in the general case the resource file would become large, especially for languages with more than two plural forms. Again I am thinking of Arabic, which has six forms :) So I came up with the idea of storing all the values for a key in one entry, separated by the "|" pipe symbol. This reduces the number of key-value pairs in the resource file regardless of the language.
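The way the Plural-Forms expression selects one of the msgstr entries can be sketched in a few lines. This is an illustration under stated assumptions rather than how gettext itself is implemented: PluralFormsDemo, PluralIndex and Translate are hypothetical names, the rule is hard-coded instead of parsed from the header, and "{0}" is used where the PO file uses "%s".

```csharp
using System;

public static class PluralFormsDemo
{
    // Hard-coded equivalent of "nplurals=2; plural=(n != 1);".
    public static int PluralIndex(int n)
    {
        return n != 1 ? 1 : 0;
    }

    // msgstr[0] and msgstr[1] are modelled as an array indexed by the rule's result.
    public static string Translate(string[] msgstr, int n)
    {
        return string.Format(msgstr[PluralIndex(n)], n);
    }
}
```

Given the array { "{0} apple", "{0} apples" }, a count of 1 selects the first entry and every other count, including 0, selects the second.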

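public static string Plural(this IStringLocalizer localizer, string name, params object[] arguments)
{
    string value = localizer[name, arguments];
    // By convention the first argument is the count that drives the plural form.
    int count = Convert.ToInt32(arguments[0]);
    int plural = GetPluralForms(count);
    var forms = value.Split('|');
    // Fall back to the last form if the resource value has fewer forms than the rule expects.
    return forms[Math.Min(plural, forms.Length - 1)];
}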

The code is quite simple: it uses IStringLocalizer to get the value for the given key, then calls the magic function GetPluralForms(), which returns the index of the plural form to use for the current language and count, as follows:

private static int GetPluralForms(int n)
{
    // Resources are resolved by the UI culture, so the plural rule should follow it too.
    string code = Thread.CurrentThread.CurrentUICulture.TwoLetterISOLanguageName;
    int plural = 0;
    switch (code)
    {
        // nplural=1
        case "ay":
        case "bo":
        case "cgg":
        case "dz":
        case "fa":
        case "id":
        case "ja":
        case "jbo":
        case "ka":
        case "kk":
        case "km":
        case "ko":
        case "ky":
        case "lo":
        case "ms":
        case "my":
        case "sah":
        case "su":
        case "th":
        case "tt":
        case "ug":
        case "vi":
        case "wo":
        case "zh": // covers zh-CN, zh-HK and zh-TW; the two-letter code never carries a region
            plural = 0;
            break;
        // nplural=2
        case "ach":
        case "ak":
        case "am":
        case "arn":
        case "br":
        case "fil":
        case "fr":
        case "gun":
        case "ln":
        case "mfe":
        case "mg":
        case "mi":
        case "oc":
        case "pt_BR": // never matches: the two-letter code has no region, so pt-BR uses the "pt" rule below
        case "tg":
        case "ti":
        case "tr":
        case "uz":
        case "wa":
            plural = (n > 1 ? 1 : 0);
            break;
        case "af":
        case "an":
        case "anp":
        case "as":
        case "ast":
        case "az":
        case "bg":
        case "bn":
        case "brx":
        case "ca":
        case "da":
        case "de":
        case "doi":
        case "el":
        case "en":
        case "eo":
        case "es":
        case "es_AR": // never matches a two-letter code; es-AR is already covered by "es"
        case "et":
        case "eu":
        case "ff":
        case "fi":
        case "fo":
        case "fur":
        case "fy":
        case "gl":
        case "gu":
        case "ha":
        case "he":
        case "hi":
        case "hne":
        case "hu":
        case "hy":
        case "ia":
        case "it":
        case "kl":
        case "kn":
        case "ku":
        case "lb":
        case "mai":
        case "ml":
        case "mn":
        case "mni":
        case "mr":
        case "nah":
        case "nap":
        case "nb":
        case "ne":
        case "nl":
        case "nn":
        case "no":
        case "nso":
        case "or":
        case "pa":
        case "pap":
        case "pms":
        case "ps":
        case "pt":
        case "rm":
        case "rw":
        case "sat":
        case "sco":
        case "sd":
        case "se":
        case "si":
        case "so":
        case "son":
        case "sq":
        case "sv":
        case "sw":
        case "ta":
        case "te":
        case "tk":
        case "ur":
        case "yo":
            plural = (n != 1 ? 1 : 0);
            break;
        case "is":
            plural = (n % 10 != 1 || n % 100 == 11 ? 1 : 0);
            break;
        case "jv":
            plural = (n != 0 ? 1 : 0);
            break;
        case "mk":
            plural = (n == 1 || n % 10 == 1 ? 0 : 1);
            break;
        // nplural=3
        case "be":
        case "bs":
        case "hr":
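            plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2);
            break;
        case "lt":
            // Lithuanian has no upper bound of 4 on the last digit for its second form.
            plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n % 10 >= 2 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2);
            break;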
        case "cs":
            plural = ((n == 1) ? 0 : (n >= 2 && n <= 4) ? 1 : 2);
            break;
        case "csb":
        case "pl":
            plural = ((n == 1) ? 0 : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2);
            break;
        case "lv":
            plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n != 0 ? 1 : 2);
            break;
        case "mnk":
            plural = (n == 0 ? 0 : n == 1 ? 1 : 2);
            break;
        case "ro":
            plural = (n == 1 ? 0 : (n == 0 || (n % 100 > 0 && n % 100 < 20)) ? 1 : 2);
            break;
        // nplural=4
        case "cy":
            plural = ((n == 1) ? 0 : (n ==2 ) ? 1 : (n != 8 && n != 11) ? 2 : 3);
            break;
        case "gd":
            plural = ((n == 1 || n == 11) ? 0 : (n == 2 || n == 12) ? 1 : (n > 2 && n < 20) ? 2 : 3);
            break;
        case "kw":
            plural = ((n == 1) ? 0 : (n == 2) ? 1 : (n == 3) ? 2 : 3);
            break;
        case "mt":
            plural = (n == 1 ? 0 : n == 0 || (n % 100 > 1 && n % 100 < 11) ? 1 : (n % 100 > 10 && n % 100 < 20) ? 2 : 3);
            break;
        case "sl":
            plural = (n % 100 == 1 ? 1 : n % 100 == 2 ? 2 : n % 100 == 3 || n % 100 == 4 ? 3 : 0);
            break;
        // nplural=3
        case "ru":
        case "sr":
        case "uk":
            plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2);
            break;
        case "sk":
            plural = ((n == 1) ? 0 : (n >= 2 && n <= 4) ? 1 : 2);
            break;
        // nplural=5
        case "ga":
            plural = (n == 1 ? 0 : n == 2 ? 1 : (n > 2 && n < 7) ? 2 : (n > 6 && n < 11) ? 3 : 4);
            break;
        // nplural=6
        case "ar":
            plural = (n == 0 ? 0 : n == 1 ? 1 : n == 2 ? 2 : n % 100 >= 3 && n % 100 <= 10 ? 3 : n % 100 >= 11 ? 4 : 5);
            break;
    }
    return plural;
}
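The nested conditional expressions are hard to read at a glance, so it can help to isolate one rule and try a few counts by hand. The sketch below copies the Arabic rule from the switch into a standalone method; ArabicPluralDemo and Index are hypothetical names used only for this illustration.

```csharp
public static class ArabicPluralDemo
{
    // Same expression as the "ar" case above, split across lines for readability.
    public static int Index(int n)
    {
        return n == 0 ? 0
            : n == 1 ? 1
            : n == 2 ? 2
            : n % 100 >= 3 && n % 100 <= 10 ? 3
            : n % 100 >= 11 ? 4
            : 5;
    }
}
```

Counts 0, 1 and 2 map to indexes 0, 1 and 2; 3 through 10 map to 3; 11 through 99 map to 4; and 100 maps to 5. The rule then repeats on the last two digits, so 103 maps to 3 and 111 maps to 4. A resource value for Arabic therefore needs six pipe-separated forms in that order.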

2- Explicit

You may also create more explicit pluralization rules easily:

apples => "[0] There are no apples|[1-19] There are some apples|[20-*] There are many apples"

This technique is inspired by the Laravel framework. Notice that the pluralization rules are spelled out explicitly in the values in the resource file.

Here there are three cases:

[0] means that if the count equals zero you will get There are no apples

[1-19] means that if the count is between one and nineteen you will get There are some apples

[20-*] means that if the count is greater than or equal to twenty you will get There are many apples

Explicit rules are more powerful, but you need to write them on your own. The code for this technique may look like this:

public static string Plural(this IStringLocalizer localizer, string name, params object[] arguments)
{
    string value = localizer[name, arguments];
    int n = Convert.ToInt32(arguments[0]);
    foreach (var part in value.Split('|'))
    {
        // Each part looks like "[rule] text"; the rule sits between the brackets.
        int close = part.IndexOf(']');
        var rule = part.Substring(1, close - 1);
        // Take everything after the first ']' and trim the separating space,
        // so the text keeps any ']' of its own and loses the leading blank.
        var text = part.Substring(close + 1).Trim();
        if (rule.Contains("-"))
        {
            // Range rule such as [1-19] or [20-*], where "*" means no upper bound.
            var tokens = rule.Split('-');
            int min = Convert.ToInt32(tokens[0]);
            int max = (tokens[1] == "*" ? int.MaxValue : Convert.ToInt32(tokens[1]));
            if (n >= min && n <= max)
            {
                return text;
            }
        }
        else if (rule.Contains(","))
        {
            // List rule such as [2,3,4]; Any() needs "using System.Linq;".
            if (rule.Split(',').Any(t => Convert.ToInt32(t) == n))
            {
                return text;
            }
        }
        else if (Convert.ToInt32(rule) == n)
        {
            // Single-value rule such as [0].
            return text;
        }
    }
    // No rule matched the count.
    return "";
}

You may need some sort of caching to avoid repeating the string processing for each requested key; for the sake of the demo I didn't implement that.
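One possible shape for such a cache is sketched below, as an assumption about how it could be done rather than part of the original demo. PluralRuleCache, Rule, Select and Parse are hypothetical names. The rules are parsed once per distinct resource value and kept in a ConcurrentDictionary; comma lists are left out to keep the sketch short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class PluralRuleCache
{
    // One parsed rule: an inclusive range and the text returned for it.
    private sealed class Rule
    {
        public int Min;
        public int Max;
        public string Text;
    }

    // Parsed rules keyed by the raw resource value, so each value is split only once.
    private static readonly ConcurrentDictionary<string, Rule[]> Cache =
        new ConcurrentDictionary<string, Rule[]>();

    public static string Select(string resourceValue, int n)
    {
        Rule[] rules = Cache.GetOrAdd(resourceValue, Parse);
        Rule match = rules.FirstOrDefault(r => n >= r.Min && n <= r.Max);
        return match == null ? "" : match.Text;
    }

    // Handles "[n]" and "[min-max]", with "*" as an open upper bound.
    private static Rule[] Parse(string resourceValue)
    {
        return resourceValue.Split('|').Select(part =>
        {
            int close = part.IndexOf(']');
            string[] tokens = part.Substring(1, close - 1).Split('-');
            int min = Convert.ToInt32(tokens[0]);
            int max = tokens.Length == 1 ? min
                : tokens[1] == "*" ? int.MaxValue : Convert.ToInt32(tokens[1]);
            return new Rule { Min = min, Max = max, Text = part.Substring(close + 1).Trim() };
        }).ToArray();
    }
}
```

Because the cache is keyed by the resource value itself, translations for different cultures get separate entries without extra bookkeeping. If the values contain placeholders such as {0}, they would need to be formatted after the form is selected rather than before.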

Points of Interest

Pluralization is a complex problem, but you can roll your own implementation in the flavor that you like.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
