
# NumberParser

2 Dec 2017, CPOL, 10 min read
Library extending the .NET numeric support

## Introduction

NumberParser simplifies working with the .NET numeric types, makes fuller use of the high precision of `decimal` and extends the default mathematical support.

Additionally, this library can deal with values beyond the `double` range and manages all errors internally, without throwing exceptions.

NumberParser is the second part of FlexibleParser, a multi-purpose group of independent .NET parsing libraries (the first part, UnitParser, is also described on codeproject.com).

Note that I have also developed a Java version of this library. To learn more about it, visit the corresponding page on my main site.

## Background

There are three main aspects of the default .NET management of numeric types which are relevant for NumberParser:

• It supports various numeric types under different constraints, and these types aren’t immediately and fully compatible with one another.
NumberParser removes all the boundaries among numeric types via either a `dynamic` variable or a class simultaneously supporting various native types. Additionally, its defining structure `Value` (any type) * 10^`BaseTenExponent` (`int`) allows dealing with numbers as big as required.
• The `System.Math` methods are reasonably comprehensive, but leave room for improvement.
The `Math2` class of NumberParser includes versions of all the `System.Math` methods adapted to the NumberParser classes, plus custom methods extending the default .NET functionality.
• The high precision of the .NET `decimal` type isn’t always fully exploited.
The methods `Math2.PowDecimal` and `Math2.SqrtDecimal` rely on a custom exponentiation approach specifically meant to make the most of the `decimal` precision.

## Code analysis

### Beyond individual numeric types: NumberX

NumberX is a generic designation for the classes `Number`, `NumberD`, `NumberO` and `NumberP`, which provide the basic conditions allowing NumberParser to accomplish the intended homogenisation of numeric types. All these classes share the following features:

• Defined by `Value`*10^`BaseTenExponent` and, consequently, supporting more than large enough ranges.
• All errors are managed internally and no exceptions are thrown.
• Implicitly convertible between each other and to their defining native types.
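
The `Value`*10^`BaseTenExponent` idea can be illustrated with a minimal, standalone sketch. The `ScaledDecimal` type and its `Multiply` method are hypothetical names made up for this illustration and are not part of NumberParser. They only show why pairing a value with a separate base-ten exponent reaches far beyond the `double` range.

```csharp
using System;

// Hypothetical illustration type (not a NumberParser class): a decimal
// value plus a separate base-ten exponent.
public struct ScaledDecimal
{
    public readonly decimal Value;
    public readonly int BaseTenExponent;

    public ScaledDecimal(decimal value, int baseTenExponent)
    {
        Value = value;
        BaseTenExponent = baseTenExponent;
    }

    // (a * 10^x) * (b * 10^y) = (a * b) * 10^(x + y).
    // Assumes that a.Value * b.Value fits in decimal; the exponent sum is
    // checked so that an int overflow is not silently ignored.
    public static ScaledDecimal Multiply(ScaledDecimal a, ScaledDecimal b)
    {
        return new ScaledDecimal
        (
            a.Value * b.Value, checked(a.BaseTenExponent + b.BaseTenExponent)
        );
    }
}
```

For example, multiplying 1.5*10^400 by 2*10^500 with this sketch yields a `Value` equal to 3 and a `BaseTenExponent` of 900, although neither operand can be stored in a `double`.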

All these classes have their own specific characteristics, namely:

• `Number`. It is the lightest one and its `Value` is `decimal`.
• `NumberD`. Its `Value` is `dynamic` and, consequently, it can deal with any native numeric type.
• `NumberO`. Its defining characteristic is the `Others` public property, a collection of `NumberD` variables containing the native numeric types instructed by the user.
• `NumberP`. It can extract numeric information from strings and its `Value` is `dynamic`.

Each of these classes has its own code, stored in clearly named files and folders like Constructors_Number.cs (inside the Constructors folder) or Operations_Public_NumberD.cs (inside the Operations/Public folder). All the code dealing with the common parts relies on the lightest possible version: `Number` for `decimal`-based calculations and `NumberD` for any other scenario.

Below, I am including an excerpt of the `NumberD` constructor code. It gives a good idea of what most of this part of the code looks like: some public properties automatically synchronised through their getters/setters, and a fair number of constructors (where those properties are populated in an order compatible with that synchronisation). The constructors allow these classes to be instantiated in many different ways and, in the case of 1-argument constructors, also support the implicit conversions to other NumberX classes/native types.

C#
```
///<summary>
///<para>NumberD extends the limited decimal-only range of Number by supporting all the numeric
///types.</para>
///<para>It is implicitly convertible to Number, NumberO, NumberP and all the numeric types.</para>
///</summary>
public partial class NumberD
{
    private dynamic _Value;
    private Type _Type;
    ///<summary><para>Numeric variable storing the primary value.</para></summary>
    public dynamic Value
    {
        get { return _Value; }
        set
        {
            Type type = ErrorInfoNumber.InputTypeIsValidNumeric(value);

            if (type == null) _Value = null;
            else
            {
                _Value = value;
                if (_Value == 0) BaseTenExponent = 0;
            }

            if (Type != type) Type = type;
        }
    }
    ///<summary><para>Base-ten exponent complementing the primary value.</para></summary>
    public int BaseTenExponent { get; set; }
    ///<summary><para>Numeric type of the Value property.</para></summary>
    public Type Type
    {
        get { return _Type; }
        set
        {
            if (Value != null && value != null)
            {
                if (Value.GetType() == value) _Type = value;
                else
                {
                    NumberD tempVar = new NumberD(Value, BaseTenExponent, value, false);

                    if (tempVar.Error == ErrorTypesNumber.None)
                    {
                        _Type = value;
                        BaseTenExponent = tempVar.BaseTenExponent;
                        Value = tempVar.Value;
                    }
                    //else -> The new type is wrong and can be safely ignored.
                }
            }
        }
    }
    ///<summary><para>Readonly member of the ErrorTypesNumber enum which best suits the current
    ///conditions.</para></summary>
    public readonly ErrorTypesNumber Error;

    ///<summary><para>Initialises a new NumberD instance.</para></summary>
    ///<param name="type">Type to be assigned to the dynamic Value property. Only numeric types
    ///are valid.</param>
    public NumberD(Type type)
    {
        Value = Basic.GetNumberSpecificType(0, type);
        Type = type;
    }

    ///<summary><para>Initialises a new NumberD instance.</para></summary>
    ///<param name="value">Main value to be used. Only numeric variables are valid.</param>
    ///<param name="baseTenExponent">Base-ten exponent to be used.</param>
    public NumberD(dynamic value, int baseTenExponent)
    {
        Type type = ErrorInfoNumber.InputTypeIsValidNumeric(value);

        if (type == null)
        {
            Error = ErrorTypesNumber.InvalidInput;
        }
        else
        {
            //To avoid problems with the automatic actions triggered by some setters, it is
            //better to always assign values in this order (i.e., first BaseTenExponent, then
            //Value and finally Type).
            BaseTenExponent = baseTenExponent;
            Value = value;
            Type = type;
        }
    }

    ///<summary><para>Initialises a new NumberD instance.</para></summary>
    ///<param name="value">Main value to be used. Only numeric variables are valid.</param>
    ///<param name="type">Type to be assigned to the dynamic Value property. Only numeric types
    ///are valid.</param>
    public NumberD(dynamic value, Type type)
    {
        NumberD numberD = ExtractValueAndTypeInfo(value, 0, type);

        if (numberD.Error != ErrorTypesNumber.None)
        {
            Error = numberD.Error;
        }
        else
        {
            BaseTenExponent = numberD.BaseTenExponent;
            Value = numberD.Value;
            Type = type;
        }
    }

    //etc.
}
```

To learn more about the NumberX implementations, you can visit the corresponding pages on varocarbas.com: https://varocarbas.com/flexible_parser/number_numberx/, https://varocarbas.com/flexible_parser/number_numbero/ and https://varocarbas.com/flexible_parser/number_numberp/.

### Basic operations and comparisons between NumberX instances

The next logical step after creating the NumberX classes is to ease their usage under the most common scenarios which, for numeric types, are basic arithmetic and comparison operations.

My approach has been to perform all these actions via operator overloading for every class which, for example, allows writing something like `NumberD result = new NumberD(1.2345) + new NumberD(5555);`. In all the NumberX classes, the basic arithmetic (+, -, * and /) and comparison (==, !=, >, >=, <, <=) operators are overloaded.

These operations are expected to be performed between instances of the same NumberX class. In other words, the implicit NumberX conversions aren’t applicable when dealing with overloaded operators. For example, `new NumberD(567) * new NumberP("12.3")` is wrong, but `new NumberD(567) * (NumberD)new NumberP("12.3")` is not. The same restriction doesn’t apply to implicit conversions of native types, which is why `new Number(987.6m) + 777m` is fine.

This limitation is caused by the peculiarities of the `dynamic` type. The only way to avoid the current errors (i.e., ambiguous determination of the NumberX class to be used) would have been to explicitly overload all the possible combinations between NumberX classes. Implementing that was never an option because it would have caused an unreasonable increase in the code size and resources associated with the NumberX classes, which would have had a noticeable negative impact on their performance. Doing all that just to avoid a cast under very specific conditions wouldn't have made much sense.
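
The following sketch is not NumberParser code and does not use `dynamic`. It only reproduces, with two hypothetical classes `A` and `B`, the general kind of ambiguity that appears when two classes are implicitly convertible to each other and both overload the same operator, and shows how a cast removes it.

```csharp
using System;

// Hypothetical stand-ins for two mutually convertible classes.
public class A
{
    public readonly decimal Value;
    public A(decimal value) { Value = value; }

    public static implicit operator A(B b) { return new A(b.Value); }
    public static A operator +(A x, A y) { return new A(x.Value + y.Value); }
}

public class B
{
    public readonly decimal Value;
    public B(decimal value) { Value = value; }

    public static implicit operator B(A a) { return new B(a.Value); }
    public static B operator +(B x, B y) { return new B(x.Value + y.Value); }
}

public static class AmbiguityDemo
{
    public static A AddWithCast(A a, B b)
    {
        // "a + b" does not compile: A's operator (after converting b) and
        // B's operator (after converting a) are both applicable and neither
        // is better, so the compiler reports an ambiguous operator.
        // With the explicit cast, A's operator matches both operands exactly
        // and becomes the single best candidate.
        return a + (A)b;
    }
}
```

Avoiding the cast in a setup like this one would require an extra overload for every mixed pair of classes, which is the kind of growth discussed above.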

As part of FlexibleParser, NumberParser relies on the same default assumptions as all the other parts and, in case of incompatibility (e.g., different NumberX class or different `Value` types), the first element starting from the top left will always be preferred.

The most important parts of the code dealing with all this are the following:

• Operations/Public folder. All the method/operator overloads and implicit conversions (i.e., calling the corresponding 1-argument constructor) for all the NumberX classes are included here.
• Operations/Private folder. It contains most of the internal resources used by the aforementioned public resources. Note that one of these files (Operations_Private_Managed.cs) contains an adapted version of the managed operations discussed in the article about UnitParser.
• Conversions folder. Because all the NumberX classes have to be able to deal with different numeric types interchangeably, conversions are also closely related to basic operations. In any case, bear in mind that these are custom conversions adapting native types to the NumberX format (and triggering errors only when strictly required) rather than standard conversions between native types. For example, no information is lost when converting an integer like 100000000 to `byte` because all the excess beyond the maximum `byte` range is stored in the associated `BaseTenExponent`.

A representative sample of the conversion code is the following:

C#
```
private static Number ModifyValueToFitType(Number number, Type target, decimal targetValue)
{
    decimal sign = 1m;
    if (number.Value < 0)
    {
        sign = -1m;
        number.Value *= sign;
    }

    if (!Basic.AllDecimalTypes.Contains(target))
    {
        number.Value = Math.Round(number.Value, MidpointRounding.AwayFromZero);
    }

    targetValue = Math.Abs(targetValue);
    bool increase = (number.Value < targetValue);

    while (true)
    {
        if (number.Value == targetValue) break;
        else
        {
            if (increase)
            {
                if (number.Value > Basic.AllNumberMinMaxPositives[typeof(decimal)][1] / 10m)
                {
                    break;
                }

                number.Value *= 10;
                number.BaseTenExponent--;
                if (number.Value > targetValue) break;
            }
            else
            {
                if (number.Value < Basic.AllNumberMinMaxPositives[typeof(decimal)][0] * 10m)
                {
                    break;
                }

                number.Value /= 10;
                number.BaseTenExponent++;
                if (number.Value < targetValue) break;
            }
        }
    }

    number.Value *= sign;

    return number;
}
```
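
As a much simpler, standalone illustration of the same idea (moving the excess of a value into a base-ten exponent), here is a minimal sketch. `FitSketch.FitToMax` is a hypothetical helper written for this article's example and not part of NumberParser. It only handles non-negative values and performs no rounding, so a division that isn't exact for an integer target would still need the kind of rounding shown in the excerpt above.

```csharp
using System;

public static class FitSketch
{
    // Hypothetical helper (not NumberParser code): divides a non-negative
    // value by ten until it is not above max, counting the divisions in
    // exponent, so that the input equals (result * 10^exponent).
    public static decimal FitToMax(decimal value, decimal max, out int exponent)
    {
        if (value < 0m || max <= 0m) throw new ArgumentOutOfRangeException();

        exponent = 0;
        while (value > max)
        {
            value /= 10m;
            exponent++;
        }

        return value;
    }
}
```

Calling `FitSketch.FitToMax(100000000m, byte.MaxValue, out exponent)` returns 100 with an `exponent` of 6, i.e., 100*10^6, which is how an integer like 100000000 can be kept within the `byte` range without losing information.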

### Overview of Math2 methods

After setting up a group of classes homogenising the management of numeric types and all the basic comparisons/operations among them, extending their mathematical support seems the next logical step. In the .NET Framework, the main built-in mathematical methods are stored under `System.Math` and its NumberParser equivalent is `Math2`.

There is a first group of `Math2` methods which are just NumberX-adapted versions of all the `System.Math` ones. Each of them delivers exactly the same result as the corresponding original version. Their whole point is to make it easier to use NumberX instances with the most common mathematical functionalities. Even the default support (e.g., `double` range in most cases) is respected, and (internally managed) errors are triggered regardless of whether the corresponding NumberX class could deal with these conditions.

Below these lines, you can find a descriptive excerpt of this part of the code included in Math2_Private_Existing.cs:

C#
```
private delegate double Method1Arg(double value);
private delegate double Method2Arg(double value1, double value2);

private static Dictionary<ExistingOperations, Method1Arg> AllMathDouble1 =
new Dictionary<ExistingOperations, Method1Arg>()
{
    { ExistingOperations.Acos, Math.Acos }, { ExistingOperations.Asin, Math.Asin },
    { ExistingOperations.Atan, Math.Atan }, { ExistingOperations.Cos, Math.Cos },
    { ExistingOperations.Cosh, Math.Cosh }, { ExistingOperations.Exp, Math.Exp },
    { ExistingOperations.Log, Math.Log }, { ExistingOperations.Log10, Math.Log10 },
    { ExistingOperations.Sin, Math.Sin }, { ExistingOperations.Sinh, Math.Sinh },
    { ExistingOperations.Sqrt, Math.Sqrt }, { ExistingOperations.Tan, Math.Tan },
    { ExistingOperations.Tanh, Math.Tanh }
};

private static Dictionary<ExistingOperations, Method2Arg> AllMathDouble2 =
new Dictionary<ExistingOperations, Method2Arg>()
{
    { ExistingOperations.Atan2, Math.Atan2 },
    { ExistingOperations.IEEERemainder, Math.IEEERemainder },
    { ExistingOperations.Log, Math.Log }, { ExistingOperations.Pow, Math.Pow }
};

private static NumberD PerformOperationOneOperand(NumberD n, ExistingOperations operation)
{
    NumberD n2 = AdaptInputsToMathMethod(n, GetTypesOperation(operation), operation);
    if (n2.Error != ErrorTypesNumber.None) return new NumberD(n2.Error);

    try
    {
        return ApplyMethod1(n2, operation);
    }
    catch
    {
        return new NumberD(ErrorTypesNumber.NativeMethodError);
    }
}

private static NumberD PerformOperationTwoOperands(NumberD n1, NumberD n2, ExistingOperations operation)
{
    NumberD[] ns = CheckTwoOperands
    (
        new NumberD[] { n1, n2 }, operation
    );
    if (ns[0].Error != ErrorTypesNumber.None) return ns[0];

    try
    {
        return ApplyMethod2(ns[0], ns[1], operation);
    }
    catch
    {
        return new NumberD(ErrorTypesNumber.NativeMethodError);
    }
}

private static NumberD[] CheckTwoOperands(NumberD[] ns, ExistingOperations operation)
{
    ns = OrderTwoOperands(ns);

    for (int i = 0; i < ns.Length; i++)
    {
        ns[i] = AdaptInputsToMathMethod
        (
            ns[i], (i == 0 ? GetTypesOperation(operation) : new Type[] { ns[0].Type }),
            operation
        );
        if (ns[i].Error != ErrorTypesNumber.None)
        {
            return new NumberD[] { new NumberD(ns[i].Error) };
        }
    }

    return ns;
}
```

The `Math2` class also includes the following group of custom mathematical methods which I have developed from scratch:

• GetPolynomialFit/ApplyPolynomialFit. They calculate the 2nd degree polynomial fit from a set of X/Y values and apply it to estimate the Y2 associated with an X2 input.
• Factorial. It calculates the factorial of positive integers smaller than 100000.
• RoundExact/TruncateExact. These methods appreciably extend the built-in .NET rounding/truncating functionalities. They allow the rounding/truncating actions to be focused on either the integer or the decimal part and can, for example, return 123, 124 or 123.6 from the input 123.567.
• PowDecimal/SqrtDecimal. They are discussed in the next subsection.
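
To make the polynomial fit item more concrete, here is a minimal, standalone sketch of a 2nd degree least-squares fit on `decimal` values. It is not the NumberParser implementation: `PolyFitSketch`, `Fit` and `Apply` are hypothetical names, and solving the normal equations with Cramer's rule is simply an assumption chosen for brevity.

```csharp
using System;

public static class PolyFitSketch
{
    // Least-squares fit of y = a + b*x + c*x^2, solving the 3x3 normal
    // equations with Cramer's rule. Returns { a, b, c }.
    public static decimal[] Fit(decimal[] xs, decimal[] ys)
    {
        if (xs.Length != ys.Length || xs.Length < 3)
            throw new ArgumentException("At least three X/Y pairs are required.");

        decimal n = xs.Length, sx = 0, sx2 = 0, sx3 = 0, sx4 = 0;
        decimal sy = 0, sxy = 0, sx2y = 0;
        for (int i = 0; i < xs.Length; i++)
        {
            decimal x = xs[i], y = ys[i], x2 = x * x;
            sx += x; sx2 += x2; sx3 += x2 * x; sx4 += x2 * x2;
            sy += y; sxy += x * y; sx2y += x2 * y;
        }

        decimal det = Det3(n, sx, sx2, sx, sx2, sx3, sx2, sx3, sx4);
        if (det == 0m)
            throw new ArgumentException("The X values do not determine a unique fit.");

        decimal a = Det3(sy, sx, sx2, sxy, sx2, sx3, sx2y, sx3, sx4) / det;
        decimal b = Det3(n, sy, sx2, sx, sxy, sx3, sx2, sx2y, sx4) / det;
        decimal c = Det3(n, sx, sy, sx, sx2, sxy, sx2, sx3, sx2y) / det;
        return new decimal[] { a, b, c };
    }

    // Evaluates the fitted polynomial at x.
    public static decimal Apply(decimal[] coefficients, decimal x)
    {
        return coefficients[0] + coefficients[1] * x + coefficients[2] * x * x;
    }

    // Determinant of a 3x3 matrix given row by row.
    private static decimal Det3(
        decimal m11, decimal m12, decimal m13,
        decimal m21, decimal m22, decimal m23,
        decimal m31, decimal m32, decimal m33)
    {
        return m11 * (m22 * m33 - m23 * m32)
             - m12 * (m21 * m33 - m23 * m31)
             + m13 * (m21 * m32 - m22 * m31);
    }
}
```

With the X values 1, 2, 4 and the Y values 10, 20, 40, `Fit` returns the coefficients of y = 10x and `Apply` gives 30 for x = 3. Unlike the library methods, this sketch throws exceptions on invalid input.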

The most interesting code is the one dealing with `RoundExact`/`TruncateExact` and this is a descriptive sample of it:

C#
```
private static decimal RoundInternalAfterZeroes(decimal d, int digits, RoundType type, decimal d2, int zeroCount)
{
    if (digits < zeroCount)
    {
        //Cases like 0.001 with 1 digit or 0.0001 with 2 digits can reach this point.
        //On the other hand, something like 0.001 with 2 digits requires further analysis.
        return Math.Floor(d) +
        (
            type != RoundType.AlwaysAwayFromZero ? 0m :
            1m / Power10Decimal[digits]
        );
    }

    //d3 represents the decimal part after all the heading zeroes.
    decimal d3 = d2 * Power10Decimal[zeroCount];
    d3 = DecimalPartToInteger(d3 - Math.Floor(d3), 0, true);
    int length3 = GetIntegerLength(d3);

    digits -= zeroCount;
    if (digits == 0)
    {
        //In a situation like 0.005 with 2 digits, the number to be analysed would be
        //05, which cannot be (i.e., it would be treated as 5, something different).
        //That's why, in these cases, a heading digit is added in front of it.
        //NOTE: the declaration of headingBit is missing from this excerpt; a local
        //decimal variable is assumed here.
        decimal headingBit = 2m; //2 avoids the ...ToEven types to be misinterpreted.
        d3 = headingBit * Power10Decimal[length3] + d3;
        digits = 0;
    }

    decimal output =
    (
        RoundExactInternal(d3, length3 - digits, type)
        / Power10Decimal[length3]
    );

    return Math.Floor(d) +
    (
        output == 0m ? 0m :
        output / Power10Decimal[zeroCount]
    );
}
```

To learn more about the Math2 methods, you can visit the corresponding pages on varocarbas.com: https://varocarbas.com/flexible_parser/number_native/ and https://varocarbas.com/flexible_parser/number_custom/.

### Math2.PowDecimal and Math2.SqrtDecimal

The built-in .NET exponentiation methods are meant to exploit the peculiarities of floating-point types (`double`), which implies that most of the effort is focused on delivering reasonably accurate results as quickly as possible. Another relevant issue is that the specific implementations are private and, in any case, very unlikely to be easily adapted to non-floating-point scenarios.

Almost all programming languages rely on floating-point approaches to deal with decimal numeric types. The `decimal` type in .NET is one of the few exceptions and this is precisely why I had to develop a custom approach to make full use of its defining high precision. I will only be referring to the implementation dealing with fractional exponents, because accounting for all the other scenarios (e.g., integer or negative exponents) is quite trivial.

I relied on the very fast, reliable and brand-new (LOL) Newton-Raphson method. The main limitation of this approach is that its convergence speed depends heavily on providing a good enough first guess. A bad initial guess isn’t particularly influential under not too demanding conditions, but it is extremely relevant when trying to make full use of the `decimal` precision. Note that this type can accurately deal with up to 28 decimal positions, which implies performing operations and comparing values within a precision of up to 10^-28. In other words, getting stuck in an infinite loop, either a real one or a practical one (i.e., taking unacceptably long), is relatively easy unless the initial guess is good enough.
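
For reference, the textbook Newton-Raphson iteration for an `n`th root can be sketched on `decimal` as follows. This is a generic illustration, not the `Math2.PowDecimal` code: `NewtonSketch.NthRoot`, its tolerance and its iteration cap are assumptions chosen for the example, and the first guess is simply left to the caller.

```csharp
using System;

public static class NewtonSketch
{
    // Newton-Raphson iteration for the n-th root of a positive x:
    //   next = ((n - 1) * r + x / r^(n - 1)) / n
    // The caller supplies the first guess. A poor guess costs iterations and,
    // for big x or n, the intermediate power may overflow decimal.
    public static decimal NthRoot(decimal x, int n, decimal guess)
    {
        if (x <= 0m || n < 1 || guess <= 0m) throw new ArgumentOutOfRangeException();

        const decimal tolerance = 1e-27m; // Assumed stopping threshold.
        decimal current = guess;

        // Hard cap so that an oscillation in the last digits can't loop forever.
        for (int i = 0; i < 200; i++)
        {
            decimal power = 1m;
            for (int j = 0; j < n - 1; j++) power *= current; // r^(n - 1)

            decimal next = ((n - 1) * current + x / power) / n;
            if (Math.Abs(next - current) <= tolerance) return next;
            current = next;
        }

        return current;
    }
}
```

A call like `NewtonSketch.NthRoot(2m, 2, 1m)` converges in a handful of iterations because the guess is close to the root. The iteration cap is the sketch's crude answer to the infinite-loop risk described above, whereas the library relies on better first guesses.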

So, the most relevant part of the `Math2.PowDecimal`/`Math2.SqrtDecimal` code is the approach I came up with to ensure good enough first guesses for the Newton-Raphson method. Bearing in mind that it calculates `n`th roots, that the perfect guess is the actual root and that the `n`th root of `x` has to be more or less consistent with a trend defined by the (`n-1`)th and (`n+1`)th roots of `x`, I generated a large number of pairs of `x` vs. `n`th root of `x` for many different `n` values. Then, I looked for the underlying trends, summarised these conclusions and created equations replicating those behaviours within more or less big ranges. Note that all this part is only concerned with positive integers `n` and 10-divisible `x`.

Although the current approach is already reasonably quick and reliable, it is still a first version which I expect to improve further at a later point. That is the reason why this part of the code doesn't include too many comments: it is still a work in progress. In any case, here is a representative sample of it:

C#
```
private static decimal GetSmallValueBase10Guess(decimal value, decimal n)
{
    decimal[] vals = new decimal[]
    {
        0.4605m, 0.5298m, 0.5704m, 0.5991m, 0.6215m
    };

    int index = (int)(value / 100m);
    decimal ratio = (value - index * 100m) / 100m;
    index--;

    decimal outVal = vals[index];
    if (ratio != 1m && index < 4)
    {
        outVal = vals[index] + ratio * (vals[index + 1] - vals[index]);
    }

    return 1m + outVal / Power10Decimal[GetIntegerLength(n) - 2];
}

private static decimal GetGenericBase10Guess(decimal value, decimal n)
{
    bool small = false;
    decimal value2 = GetInverseValue(value);
    if (value2 != -1m)
    {
        small = true;
        value = value2;
    }

    decimal outVal = 1m;
    int exponent = GetIntegerLength(n) - 1;

    if (value >= 500m)
    {
        decimal ratio = value / 500m;
        if (ratio >= 100m)
        {
            exponent--;
            if (ratio >= 1000m)
            {
                int length = GetIntegerLength(ratio);
                //4 -> 0.25
                //5 -> 0.5
                //6 -> 0.75
                //7 -> 1
                //8 -> 1.25
                //...
                decimal rem = length % 4;
                outVal = length / 4 + 0.25m * (rem + 1m);
            }
        }
        else if (ratio >= 10m) outVal *= 9m;
        else if (ratio >= 1m) outVal *= 5m;
    }

    return
    (
        !small ? 1m + outVal / Power10Decimal[exponent] :
        (1m - 1m / Power10Decimal[exponent]) + outVal / Power10Decimal[exponent + 1]
    );
}
```

I have written a much more detailed analysis of this implementation in https://varocarbas.com/fractional_exponentiation/ (PDF).

## Using the code

NumberParser (inside the `FlexibleParser` namespace) provides a common framework to deal with all the .NET numeric types. It relies on the following four classes (NumberX):

• `Number` only supports the `decimal` type.
• `NumberD` can support any numeric type via `dynamic`.
• `NumberO` can support different numeric types simultaneously.
• `NumberP` can parse numbers from strings.
```
//1.23m (decimal).
Number number = new Number(1.23m);

//123 (int).
NumberD numberD = new NumberD(123);

//1.23 (decimal). Others: 1 (int) and ' ' (char).
NumberO numberO = new NumberO(1.23m, new Type[] { typeof(int), typeof(char) });

//1 (long).
NumberP numberP = new NumberP("1.23", new ParseConfig(typeof(long)));
```

### Common features

All the NumberX classes have various characteristics in common.

• Defined according to the fields `Value` (`decimal` or `dynamic`) and `BaseTenExponent` (`int`). All of them support ranges beyond [-1, 1] * 10^2147483647.
• Most common arithmetic and comparison operator support.
• Errors managed internally and no exceptions thrown.
• Numerous instantiating alternatives. Implicitly convertible between each other and to related types.
```
//12.3*10^456 (decimal).
Number number = new Number(12.3m, 456);

//123 (int).
NumberD numberD =
(
    new NumberD(123) < (NumberD)new Number(456) ?
    //123 (int).
    new NumberD(123.456, typeof(int)) :
    //123.456 (double).
    new NumberD(123.456)
);

//Error (ErrorTypesNumber.InvalidOperation) provoked when dividing by zero.
NumberO numberO = new NumberO(123m, OtherTypes.IntegerTypes) / 0m;

//1234*10^5678 (decimal).
NumberP numberP = (NumberP)"1234e5678";
```

### Math2 class

This class includes all the NumberParser mathematical functionalities.

#### Custom functionalities

• `PowDecimal`/`SqrtDecimal` whose `decimal`-based algorithms are more precise than the `System.Math` versions. The whole varocarbas.com Project 10 explains their underlying calculation approach.
• `RoundExact`/`TruncateExact` can deal with multiple rounding/truncating scenarios not supported by the native methods.
• `GetPolynomialFit`/`ApplyPolynomialFit` deal with second degree polynomial fits.
• `Factorial` calculates the factorial of any integer number up to 100000.
```
//158250272872244.91791560253776 (decimal).
Number number = Math2.PowDecimal(123.45m, 6.789101112131415161718m);

//123000 (decimal).
Number number2 = Math2.RoundExact
(
    123456.789m, 3, RoundType.AlwaysToZero,
    RoundSeparator.BeforeDecimalSeparator
);

//30 (decimal).
NumberD numberD = Math2.ApplyPolynomialFit
(
    Math2.GetPolynomialFit
    (
        new NumberD[] { 1m, 2m, 4m }, new NumberD[] { 10m, 20m, 40m }
    )
    , 3
);

//3628800 (int).
NumberD numberD2 = Math2.Factorial(10);
```

#### Native methods

`Math2` also includes `NumberD`-adapted versions of all the `System.Math` methods.

```
//158250289837968.16 (double).
NumberD numberD = Math2.Pow(123.45, 6.789101112131415161718);

//4.8158362157911885 (double).
NumberD numberD2 = Math2.Log(123.45m);
```

### Further code samples

The test application includes a good number of descriptive code samples.

## Points of interest

User-friendly format which makes it easy to deal with all the .NET numeric types without having to worry about conversions or range limitations.

Extension of the default mathematical support with a major focus on maximising the high precision associated with `decimal`.

It can deal with numbers as big as required and manages all the errors internally.

## Authorship

I, Alvaro Carballo Garcia, am the sole author of this article and all the referred NumberParser/FlexibleParser resources like code or documentation.