Testing number inputs with special symbols

Mar 12, 2014

CPOL
Tip for testing numeric inputs with special Unicode symbols.

Digits in Regular Expressions

Developers usually validate numeric inputs with regular expressions.

There are two ways to match a digit via regular expression:

- [0-9] matches only an ASCII (Western Arabic) digit, i.e. 0,1,2,3,4,5,6,7,8,9;

- \d matches any Unicode decimal digit in engines with Unicode-aware character classes, such as .NET.
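
The difference between the two patterns can be shown with a minimal C# sketch. The class and method names here (DigitPatterns, IsAsciiNumber, IsUnicodeNumber) are made up for illustration; only the Regex class comes from .NET.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to illustrate the two patterns.
public static class DigitPatterns
{
    // \A and \z anchor the whole string ($ would also accept a trailing newline).
    // Accepts only the ASCII digits 0-9.
    private static readonly Regex AsciiDigits = new Regex(@"\A[0-9]+\z");

    // Accepts any Unicode decimal digit, because \d is Unicode-aware in .NET by default.
    private static readonly Regex UnicodeDigits = new Regex(@"\A\d+\z");

    public static bool IsAsciiNumber(string input) => AsciiDigits.IsMatch(input);

    public static bool IsUnicodeNumber(string input) => UnicodeDigits.IsMatch(input);
}
```

With this sketch, DigitPatterns.IsAsciiNumber("١٢٣") returns false, while DigitPatterns.IsUnicodeNumber("١٢٣") returns true.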

In addition to the ASCII digits, Unicode contains more than 300 decimal digits from different scripts:

0: 0,٠,۰,߀,०,০,੦,૦,୦,௦,౦,೦,൦,๐,໐,༠,၀,႐,០,᠐,᥆,᧐,᭐,᮰,᱀,᱐,꘠,꣐,꤀,꩐,０
1: 1,١,۱,߁,१,১,੧,૧,୧,௧,౧,೧,൧,๑,໑,༡,၁,႑,១,᠑,᥇,᧑,᭑,᮱,᱁,᱑,꘡,꣑,꤁,꩑,１
2: 2,٢,۲,߂,२,২,੨,૨,୨,௨,౨,೨,൨,๒,໒,༢,၂,႒,២,᠒,᥈,᧒,᭒,᮲,᱂,᱒,꘢,꣒,꤂,꩒,２
3: 3,٣,۳,߃,३,৩,੩,૩,୩,௩,౩,೩,൩,๓,໓,༣,၃,႓,៣,᠓,᥉,᧓,᭓,᮳,᱃,᱓,꘣,꣓,꤃,꩓,３
4: 4,٤,۴,߄,४,৪,੪,૪,୪,௪,౪,೪,൪,๔,໔,༤,၄,႔,៤,᠔,᥊,᧔,᭔,᮴,᱄,᱔,꘤,꣔,꤄,꩔,４
5: 5,٥,۵,߅,५,৫,੫,૫,୫,௫,౫,೫,൫,๕,໕,༥,၅,႕,៥,᠕,᥋,᧕,᭕,᮵,᱅,᱕,꘥,꣕,꤅,꩕,５
6: 6,٦,۶,߆,६,৬,੬,૬,୬,௬,౬,೬,൬,๖,໖,༦,၆,႖,៦,᠖,᥌,᧖,᭖,᮶,᱆,᱖,꘦,꣖,꤆,꩖,６
7: 7,٧,۷,߇,७,৭,੭,૭,୭,௭,౭,೭,൭,๗,໗,༧,၇,႗,៧,᠗,᥍,᧗,᭗,᮷,᱇,᱗,꘧,꣗,꤇,꩗,７
8: 8,٨,۸,߈,८,৮,੮,૮,୮,௮,౮,೮,൮,๘,໘,༨,၈,႘,៨,᠘,᥎,᧘,᭘,᮸,᱈,᱘,꘨,꣘,꤈,꩘,８
9: 9,٩,۹,߉,९,৯,੯,૯,୯,௯,౯,೯,൯,๙,໙,༩,၉,႙,៩,᠙,᥏,᧙,᭙,᮹,᱉,᱙,꘩,꣙,꤉,꩙,９

These Unicode digits can be used to test whether numeric inputs are validated correctly.
For example, a phone number is usually expected to contain only the ASCII digits 0-9. This is easy to check by entering digits from other scripts, for example the Arabic-Indic "١٢٣" instead of "123", or digits from one of the Indic scripts listed above.
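
One way to produce such test inputs is to map an ASCII number onto another script's digits. The sketch below assumes a hypothetical helper class DigitTestData; it relies on the fact that each of these scripts stores its digits 0 to 9 in ten consecutive code points.

```csharp
using System.Linq;

// Hypothetical test-data helper; the names are illustrative, not part of any library.
public static class DigitTestData
{
    // Zero digits of a few scripts. The digits 1-9 follow at the next nine code points.
    public const char ArabicIndicZero = '\u0660';
    public const char DevanagariZero = '\u0966';
    public const char FullwidthZero = '\uFF10';

    // Replaces every ASCII digit in 'ascii' with the same digit from the script
    // that starts at 'scriptZero'. All other characters are left unchanged.
    public static string ToScript(string ascii, char scriptZero)
    {
        return new string(ascii
            .Select(ch => ch >= '0' && ch <= '9' ? (char)(scriptZero + (ch - '0')) : ch)
            .ToArray());
    }
}
```

For example, DigitTestData.ToScript("123", DigitTestData.ArabicIndicZero) returns "١٢٣", which can then be submitted to the input under test.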

Below is an example of a correct check by Microsoft for an Azure account recovery phone number:

Code to generate special digit symbols

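.NET treats [0-9] and \d as different expressions. Below is a C# script that finds all Unicode digits matched by \d in the Basic Multilingual Plane:

 using System;
 using System.Linq;
 using System.Text;
 using System.Text.RegularExpressions;

 var stringBuilder = new StringBuilder();
 var digitRegex = new Regex(@"\d");

 // Enumerate every UTF-16 code unit (U+0000..U+FFFF), keep those matched by \d,
 // and group them by numeric value.
 var charDigitGroups = Enumerable.Range(Char.MinValue, Char.MaxValue + 1)
                                 .Select(code => (char)code)
                                 .Where(ch => digitRegex.IsMatch(ch.ToString()))
                                 .GroupBy(ch => Char.GetNumericValue(ch));

 // Build one row per numeric value in the form "value: digit,digit,...".
 foreach (var charGroup in charDigitGroups)
 {
     string joinedValues = String.Join(",", charGroup);
     string rowResult = String.Concat(charGroup.Key.ToString(), ": ", joinedValues);
     stringBuilder.AppendLine(rowResult);
 }

 Console.WriteLine(stringBuilder.ToString());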

Some languages, such as JavaScript, do not make \d Unicode-aware, so there \d is the same as [0-9]. Nevertheless, it is useful to test applications with Unicode digit input regardless of implementation details.
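
For comparison, .NET can reproduce the JavaScript-style behavior with the RegexOptions.ECMAScript option, which restricts \d to [0-9]. The following is a minimal sketch; the class name EcmaScriptDigits and the method IsNumber are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper showing \d under ECMAScript-compatible matching.
public static class EcmaScriptDigits
{
    // With RegexOptions.ECMAScript, \d matches only the ASCII digits 0-9.
    private static readonly Regex EcmaDigits = new Regex(@"^\d+$", RegexOptions.ECMAScript);

    public static bool IsNumber(string input) => EcmaDigits.IsMatch(input);
}
```

Here EcmaScriptDigits.IsNumber("123") returns true and EcmaScriptDigits.IsNumber("١٢٣") returns false, so the same pattern text can behave differently depending on how the regular expression is configured.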
