12,696,493 members (31,669 online)
alternative version

19.4K views
37 bookmarked
Posted

# Converting Text Numbers to Numeric Values

, 23 Feb 2015 CPOL
 Rate this:
Spell it out! No! Wait!

### Introduction

Recently, someone in the Lounge complained that sorting the following text was difficult:

• First
• Second
• Third
• ...
• eleventh

The idea was that he wanted to sort them by their numeric equivalent. Well, it's actually NOT all that difficult once you convert the text to their actual numeric values, and even that conversion isn't really difficult. Here is my solution, which is one of many ways to approach the problem.

### Assumptions

This code utilizes recursion and extension methods. It is assumed that you are already familiar with the concepts behind these language components.

## Background

As with an other programming problem, you usually start with a vague list of requirements, a hazy user story, and (usually) very specific results requirements. This problem follows that paradigm. No programmer that is worth a damn (and left to his own devices) is going to provide a solution that merely meets the stated requirements. In this case, the example data stopped at "eleventh", but that doesn't mean squat. You simply MUST assume that the incoming data is not going to be limited to the provided samples, and besides, you may as well code for as many contingencies as is possible (and reasonable).

## Code

To implement this solution, I created three classes. The first one - NumberDictionary - defines a dictionary of terms and their associated decimal values.

```public static class NumberDictionary
{
public static Dictionary<string, decimal=""> Numbers = new Dictionary<string, decimal="">()
{
{"ZERO",   0}, {"TEN",       10}, {"HUNDRED",     100},
{"FIRST",  1}, {"ELEVEN",    11}, {"THOUSAND",    1000},
{"ONE",    1}, {"TWELF",     12}, {"MILLION",     1000000},
{"SECOND", 2}, {"TWELVE",    12}, {"BILLION",     1000000000}, //*
{"TWO",    2}, {"THIRTEEN",  13}, {"MILLIARD",    1000000000},
{"THIRD",  3}, {"FOURTEEN",  14}, {"TRILLION",    1000000000000}, //*
{"THREE",  3}, {"FIFTEEN",   15}, {"QUADRILLION", 1000000000000000},
{"FOUR",   4}, {"SIXTEEN",   16}, {"BILLIARD",    1000000000000000},
{"FIF",    5}, {"SEVENTEEN", 17}, {"QUINTILLION", 1000000000000000000},
{"FIVE",   5}, {"EIGHTEEN",  18}, // I had to stop here because even a
{"SIX",    6}, {"NINETEEN",  19}, // decimal can't hold a sextillion or
{"SEVEN",  7}, {"TWENTY",    20}, // septillion
{"EIGH",   8}, {"THIRTY",    30},
{"NIN",    9}, {"FORTY",     40},
{"NINE",   9}, {"FIFTY",     50},
{"SIXTY",     60},
{"SEVENTY",   70},
{"EIGHTY",    80},
{"NINETY",    90},
};
// * These values are adjusted if the region is a long-scale region
}
</string,>```

People in the US might have noticed the weird ones - "BILLIARD" and "MILLIARD". These terms are used in long-scale countries, where billion and trillion are interpreted quite differently. If you want a lot of info about these and other differences, go to this Wiki page.

### Text Handling

This functionality is encompassed in a static string extension class.

It made sense to me to make the outward-facing interface a string extension method. This is a simple method which normalizes the text and converts each text component into a numeric value. You may have already noticed that I'm using a decimal type. I'm doing this because of its ungodly max value which FAR exceeds the currently coded maximum possible value of 999 trillion.

There's really nothing special about this code. I merely replace some stuff, remove some stuff, and add some stuff so that the text is as normalized as I can reasonable make it. Once that's done, I convert each component "word" into its numeric equivalent by trying to find it in the numbers Dictionary. If I can't find even one of the component words in the dictionary, an exception is thrown. Once we're past the normalization/parsing part, we do the math to arrive at the numeric value.

UPDATE, 13 Feb 2015 - Apparently, everybody except the US interprets the value of billion and trillion incorrectly (grin). The difference is referred to as short-scale (US) or long-scale (GBR, et al). To appease the guys that mentioned it, I came up with a solution. I moved the numbers dictionary to its own class, and added a method here to adjust for long scale regions. This new method is called before we do anything else. I had neither the time nor desire to include all long-scale regions, so I left it as a very minor exercise to the programmer to add his/her 3-letter ISO region code to the longScaleRegions string. I also added some code to replace "AND" with a space, in case someone tries to do something like "one hundred and five".

```public static class ExtendString
{
private static IntList values;

{
string longScaleRegions = "GBR,";
if (longScaleRegions.Contains(RegionInfo.CurrentRegion.ThreeLetterISORegionName))
{
numbers["BILLION"]  = numbers["MILLION"] * numbers["MILLION"];
numbers["TRILLION"] = numbers["BILLION"] * numbers["MILLION"];
}
}

public static decimal Translate(this string text)
{
text = text.ToUpper().Trim();
string trimChars = "TH";
text = text.Replace("TY", "TY ");
text = text.Replace("-"," ").Replace("_"," ").Replace("."," ").Replace(",", " ");
text = text.Replace(" AND", " ");
if (text.EndsWith(trimChars))
{
text = text.TrimEnd(trimChars.ToArray());
}
text = text.Replace("  ", " ");

values = new IntList();
string[] parts = text.Split(' ');
foreach (string numberText in parts)
{

if (numbers.Keys.Contains(numberText))
{
}
else
{
throw new Exception("Not a number (might be spelled wrong)");
}
}
return values.NumericValue;
}
}
```

### The Math

I created a class derived from List<decimal> that contains the parsed numeric values, and performs operations on those values. This effectively separates and contains what I consider to be discrete functionality. It also goes a long way toward keeping the extension method free of clutter. Give then text "four hundred twenty nine thousand six hundred six", the contents of the list starts out like this after parsing:

• [0] 4
• [1] 100
• [2] 20
• [3] 9
• [4] 1000
• [5] 6
• [6] 100
• [7] 6

Given the domain of the problem, we should never really have many more that 15 or so elements in the list (more if you want to support more than 999 trillion), so we don't have to concern ourselves with the size of this list.

The first thing we do is break the list up into smaller chunks. This just makes it easier to manage. There's a"mini list" for each major value break, and are named to indicate their numeric purpose.

```private List<decimal> trillions = new List<decimal>();
private List<decimal> billions = new List<decimal>();
private List<decimal> millions = new List<decimal>();
private List<decimal> thousands = new List<decimal>();
private List<decimal> hundreds = new List<decimal>();
```

NOTE: Notice that the mini-lists don't inlcude anything over a trillion, and don't include the long-scale-specific terms. Feel free to add them if you want to.

When the text has been parsed, the extension method retrieves the IntList.NumericValue property. Several actions are taken in order to retrieve the expected numeric value.

```public decimal NumericValue
{
get
{
this.BuildMiniLists();
this.DoMiniMaths();
decimal value = this.Sum(x=>x);
return value;
}
}
```

Each of the mini lists are populated (if necessary). We do this by looking for the big value and copying it - and all of the receding values - to the appropriate mini-list.

UPDATE - 13 Feb 2015 - I changed the call to BuildMiniList to refer to the NumberDictionary values. This only made sense since you could be in a long-scale region where the values can change for some of the dictionary numbers.

```private void BuildMiniLists()
{
this.BuildMiniList(this.trillions, NumberDictionary.Numbers["TRILLION"]);
this.BuildMiniList(this.billions, NumberDictionary.Numbers["BILLION"]);
this.BuildMiniList(this.millions, NumberDictionary.Numbers["MILLION"]);
this.BuildMiniList(this.thousands, NumberDictionary.Numbers["THOUSAND"]);
this.BuildMiniList(this.hundreds, NumberDictionary.Numbers["HUNDRED"]);
}

private void BuildMiniList(List<decimal> list, decimal amount) {
// find the index of the specified amount
int index = this.IndexOf(amount);
// if we have an index if (index >= 0)
{
// copy the values from 0 to the found index into the
// specified mini list
decimal[] values;
values = new decimal[index+1];
this.CopyTo(0, values, 0, index+1);
// and remove those items from the parent list
for (int i = index; i >= 0; i--)
{
this.RemoveAt(i);
}
}
}
```

Next, math is performed on each mini list. The DoMath method is recursive, and iterates through the specified list looking for this to multiple or add. Using our example above, the thousands mini list will look like this:

• [0] 4
• [1] 100
• [2] 20
• [3] 9
• [4] 1000

The DoMath method below will build the appropriate value by reading each value and either adding or multiplying it into the resulting value. The processing will do this:

((4 * 100) + 20 + 9) * 1000) = 429000

p>After process, the mini list will look like this:

• [0] 0
• [1] 0
• [2] 0
• [3] 0
• [4] 429000
```private void DoMiniMaths()
{
this.DoMath(this.trillions, 0);
this.DoMath(this.billions,  0);
this.DoMath(this.millions,  0);
this.DoMath(this.thousands, 0);
this.DoMath(this.hundreds,  0);
}

private void DoMath(List<decimal> list, int lastBigIndex)
{
if (list.Count > 0)
{
decimal rollingValue = 0;
decimal[] big = new decimal[]{100,1000,1000000,1000000000,1000000000000};
for (int j = 0; j < list.Count; j++)
{
if (big.Contains(list[j]))
{
rollingValue = Math.Max(1, rollingValue) * list[j];
list[j] = rollingValue;
for (int i = lastBigIndex; i < j; i++)
{
list[i] = 0;
}
lastBigIndex = j;
if (j < list.Count - 1)
{
this.DoMath(list, lastBigIndex);
}
}
else
{
rollingValue += list[j];
}
}
}
}
```

After the math has been performed on each mini-list, we sum up the values inside each list, and add each sum back to the parent list.

```private void AddMiniValues()
{
}
```

After summing the mini-lists, the parent list should have the following contents:

• [0] 6
• [1] 100
• [2] 6
• [3] 429000

When the values in the parent list are summed, we arrive at the result of "429606". Now that we've written all of that code, I leave it as an exercise for the programmer to associate the resulting numeric value with the text for sorting.

## Using the Code

Usage goes something like this (from a console app):

```class Program
{
static void Main(string[] args)
{
Translate("twelfth");
Translate("First");
Translate("One Hundredth");
Translate("five thousand seven Hundred thirtysecond");
Translate("four hundred twenty nine thousand");
Translate("four hundred twenty nine thousand six hundred six");
Translate("fortyfive");
Translate("twenty seven million two hundred thirtyfour thousand one");
}

static void Translate(string text)
{
decimal value = text.Translate();
Console.WriteLine(string.Format("{0} - {1:#,##0}", text, value));
}
}
```

## Points of Interest

Nothing particularly wacky or unexpected occurred while writing this code, but it was interesting and allowed me to do some mental exercise outside my normal work-a-day life. I am programmer.

## History

• 13 Feb 2015 (B)  Added support for long-scale regions
• 13 Feb 2015 (A)  Formatted a code block and fixed some misspellings.
• 12 Feb 2015  Initial release.

## Share

 Software Developer (Senior) Paddedwall Software United States
I've been paid as a programmer since 1982 with experience in Pascal, and C++ (both self-taught), and began writing Windows programs in 1991 using Visual C++ and MFC. In the 2nd half of 2007, I started writing C# Windows Forms and ASP.Net applications, and have since done WPF, Silverlight, WCF, web services, and Windows services.

My weakest point is that my moments of clarity are too brief to hold a meaningful conversation that requires more than 30 seconds to complete. Thankfully, grunts of agreement are all that is required to conduct most discussions without committing to any particular belief system.

## You may also be interested in...

 Pro Pro

 First Prev Next
 Negative numbers Forogar 25-Oct-15 6:08 Forogar 25-Oct-15 6:08
 Four and twenty blackbirds baked in pie... Forogar 25-Oct-15 5:24 Forogar 25-Oct-15 5:24
 Re: Four and twenty blackbirds baked in pie... John Simmons / outlaw programmer28-Oct-15 6:19 John Simmons / outlaw programmer 28-Oct-15 6:19
 Re: Four and twenty blackbirds baked in pie... Forogar 28-Oct-15 8:54 Forogar 28-Oct-15 8:54
 My vote of 5 richard_x8623-Feb-15 5:05 richard_x86 23-Feb-15 5:05
 Re: My vote of 5 John Simmons / outlaw programmer23-Feb-15 6:57 John Simmons / outlaw programmer 23-Feb-15 6:57
 Great article skandland23-Feb-15 4:27 skandland 23-Feb-15 4:27
 Where is the code ? SAM LIVE22-Feb-15 22:43 SAM LIVE 22-Feb-15 22:43
 Re: Where is the code ? John Simmons / outlaw programmer23-Feb-15 3:37 John Simmons / outlaw programmer 23-Feb-15 3:37
 Re: Where is the code ? Pete O'Hanlon23-Feb-15 4:07 Pete O'Hanlon 23-Feb-15 4:07
 My vote of 5 Omar Gameel Salem21-Feb-15 19:52 Omar Gameel Salem 21-Feb-15 19:52
 Very nice article Sacha Barber19-Feb-15 22:32 Sacha Barber 19-Feb-15 22:32
 Formatting Kenneth Haugland13-Feb-15 0:53 Kenneth Haugland 13-Feb-15 0:53
 Re: Formatting John Simmons / outlaw programmer13-Feb-15 1:09 John Simmons / outlaw programmer 13-Feb-15 1:09
 Re: Formatting Kenneth Haugland13-Feb-15 1:20 Kenneth Haugland 13-Feb-15 1:20
 Re: Formatting John Simmons / outlaw programmer13-Feb-15 1:34 John Simmons / outlaw programmer 13-Feb-15 1:34
 Re: Formatting Kenneth Haugland13-Feb-15 2:53 Kenneth Haugland 13-Feb-15 2:53
 Wow Avijnata12-Feb-15 17:27 Avijnata 12-Feb-15 17:27
 Re: Wow Garth J Lancaster12-Feb-15 17:42 Garth J Lancaster 12-Feb-15 17:42
 Re: Wow Avijnata12-Feb-15 18:00 Avijnata 12-Feb-15 18:00
 Re: Wow PIEBALDconsult12-Feb-15 20:22 PIEBALDconsult 12-Feb-15 20:22
 Re: Wow John Brett12-Feb-15 22:02 John Brett 12-Feb-15 22:02
 Re: Wow John Simmons / outlaw programmer13-Feb-15 1:35 John Simmons / outlaw programmer 13-Feb-15 1:35
 Re: Wow John Simmons / outlaw programmer13-Feb-15 0:53 John Simmons / outlaw programmer 13-Feb-15 0:53
 Re: Wow Avijnata13-Feb-15 1:45 Avijnata 13-Feb-15 1:45
 Re: Wow John Simmons / outlaw programmer13-Feb-15 5:42 John Simmons / outlaw programmer 13-Feb-15 5:42
 Re: Wow Avijnata13-Feb-15 6:04 Avijnata 13-Feb-15 6:04
 What if my number is a billion zillion and 3 Slacker00712-Feb-15 10:53 Slacker007 12-Feb-15 10:53
 Re: What if my number is a billion zillion and 3 PIEBALDconsult12-Feb-15 11:09 PIEBALDconsult 12-Feb-15 11:09
 Re: What if my number is a billion zillion and 3 John Simmons / outlaw programmer12-Feb-15 11:33 John Simmons / outlaw programmer 12-Feb-15 11:33
 Re: What if my number is a billion zillion and 3 PIEBALDconsult12-Feb-15 11:36 PIEBALDconsult 12-Feb-15 11:36
 Re: What if my number is a billion zillion and 3 John Simmons / outlaw programmer12-Feb-15 11:38 John Simmons / outlaw programmer 12-Feb-15 11:38
 Re: What if my number is a billion zillion and 3 Kenneth Haugland12-Feb-15 11:46 Kenneth Haugland 12-Feb-15 11:46
 Re: What if my number is a billion zillion and 3 PIEBALDconsult12-Feb-15 12:19 PIEBALDconsult 12-Feb-15 12:19
 Re: What if my number is a billion zillion and 3 John Simmons / outlaw programmer13-Feb-15 4:28 John Simmons / outlaw programmer 13-Feb-15 4:28
 Re: What if my number is a billion zillion and 3 Chris Copeland12-Feb-15 14:13 Chris Copeland 12-Feb-15 14:13
 Re: What if my number is a billion zillion and 3 Steven Nicholas13-Feb-15 0:16 Steven Nicholas 13-Feb-15 0:16
 Re: What if my number is a billion zillion and 3 Pete O'Hanlon23-Feb-15 4:06 Pete O'Hanlon 23-Feb-15 4:06
 Re: What if my number is a billion zillion and 3 Kenneth Haugland12-Feb-15 11:42 Kenneth Haugland 12-Feb-15 11:42
 Re: What if my number is a billion zillion and 3 John Simmons / outlaw programmer13-Feb-15 6:12 John Simmons / outlaw programmer 13-Feb-15 6:12
 Re: What if my number is a billion zillion and 3 Slacker00713-Feb-15 6:15 Slacker007 13-Feb-15 6:15
 Last Visit: 31-Dec-99 19:00     Last Update: 20-Jan-17 0:24 Refresh 1