Posted 26 Jan 2012

# Strtod in C# – Part 1: The Specification

Samuel Cragg, 26 Jan 2012, CPOL

As mentioned, parsing the points in an SVG polygon would be a lot easier (and quicker?) if we had the `strtod` function in C#. Well, let’s give it a go! Dealing with floating point numbers is tricky, so I’m probably going to get some corner cases wrong. To limit the damage, we’re going to create some tests that our function must be able to handle. Hopefully, these tests will also serve as our specification.

## Precision

First of all, how accurate do we have to be? Well, §4.3 of the SVG specification states:

a number has the capacity for at least a single-precision floating point number

It also goes on to mention that it’s preferable that double-precision be used for calculations, so we may as well parse straight to `double`, knowing that any value stored in the SVG should fall within a valid range.

So, how many digits are we going to parse? What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg is by far the best introduction to how floating point numbers work. I don’t claim to understand it all; however, in the precision section he mentions that 17 digits are enough to recover a double-precision binary number. That’s not quite the whole story for a parser, though: it is 17 significant digits that are required, since leading zeros don’t count, and `12345678901234567` and `000000000012345678901234567` are the same number.
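To make the 17-digit claim concrete, here is a minimal sketch (not the parser this series builds) that formats a `double` with the standard `"G17"` format string and parses it back using the built-in `double.Parse`. The class and method names are made up for this illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class RoundTripSketch
{
    // Formats the value with 17 significant digits, parses the text back,
    // and reports whether the same double was recovered.
    public static bool SurvivesRoundTrip(double value)
    {
        string text = value.ToString("G17", CultureInfo.InvariantCulture);
        double parsed = double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
        return parsed.Equals(value);
    }

    // Leading zeros are not significant digits, so both spellings
    // should produce the same double.
    public static bool LeadingZerosIgnored()
    {
        double a = double.Parse("12345678901234567", CultureInfo.InvariantCulture);
        double b = double.Parse("000000000012345678901234567", CultureInfo.InvariantCulture);
        return a == b;
    }
}
```

Calling `RoundTripSketch.SurvivesRoundTrip(0.1 + 0.2)` should return `true`, which is the property our own parser needs to preserve when it reads 17 significant digits.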

Here’s what we’ll test (also making sure that the whole `string` is read):

| Input | Expected |
| --- | --- |
| `12345678901234567` | `12345678901234567` |
| `000000000012345678901234567` | `12345678901234567` |
| `12345678901234567890` | `12345678901234567000` |
| `1.2345678901234567` | `1.2345678901234567` |
| `0.00000000012345678901234567` | `0.00000000012345678901234567` |
| `1.00000000012345678901234567` | `1.0000000001234567` |
| `1234567890.00000000001234567` | `1234567890` |

## Underflow/Overflow

What happens when the value is too small (as in too close to zero to represent, not as in a negative number outside the valid range) or too large to represent?

`System.Double.Parse` makes numbers that are too small to be represented by `double` silently underflow to zero. This also happens when converting a very small `double` to `float`. However, if the number is outside the range of `double`, then `System.Double.Parse` throws a `System.OverflowException` (this is the .NET Framework behaviour; from .NET Core 3.0 onwards it returns infinity instead). This doesn’t make sense to me, especially since casting a big `double` to `float` will convert it to infinity. In this situation, `strtod` returns `HUGE_VAL` with the appropriate sign, and this is the route I’ll take: numbers that are too big to fit inside the range of a `double` will be returned as +/- infinity, and numbers so close to zero that `double` cannot represent them will be truncated to zero.

Therefore, our tests will make sure the following happen (again making sure all input is read):

| Input | Expected |
| --- | --- |
| `+4.9406564584124654E-324` | `4.9406564584124654E-324` |
| `+1.7976931348623157E+308` | `1.7976931348623157E+308` |
| `-4.9406564584124654E-324` | `-4.9406564584124654E-324` |
| `-1.7976931348623157E+308` | `-1.7976931348623157E+308` |
| `+1E-325` | `0` |
| `+1E+309` | Infinity |
| `-1E-325` | `0` |
| `-1E+309` | -Infinity |
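As a rough illustration of the behaviour these tests describe, the sketch below wraps the built-in `double.Parse` and turns an overflow into a signed infinity. It is a stand-in for the real parser, not the real parser itself; `ClampingParseSketch` and `ParseClamped` are names invented for this example.

```csharp
using System;
using System.Globalization;

// Hypothetical wrapper, for illustration only.
public static class ClampingParseSketch
{
    // Parses the whole string. Values too large for a double become
    // +/- infinity; values too close to zero become zero.
    public static double ParseClamped(string text)
    {
        try
        {
            // Some runtimes already return infinity here instead of throwing.
            return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
        }
        catch (OverflowException)
        {
            // Runtimes that throw on overflow end up here, so pick the
            // infinity that matches the sign of the input.
            return text.TrimStart().StartsWith("-", StringComparison.Ordinal)
                ? double.NegativeInfinity
                : double.PositiveInfinity;
        }
    }
}
```

With this wrapper, `ParseClamped("-1E+309")` gives negative infinity whether or not the underlying runtime throws, which matches the `strtod`-style behaviour chosen above.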

## Infinity/Not a Number

There’s an interesting point to take into consideration when parsing: the SVG specification seems to allow only finite numbers, yet in XML (specifically XML Schema’s `double` type), `INF`, `-INF` and `NaN` are valid.

When parsing, we’ll try to be as flexible as possible, so we’ll allow "`Inf`", "`Infinity`" and "`NaN`" (all case-insensitive).
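One possible way to recognise these special values is sketched below. It assumes the whole string is the special value and that an optional leading sign is allowed (and ignored for NaN); both are assumptions made for this example rather than decisions from the article, and the names are hypothetical.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class SpecialValueSketch
{
    // Returns true and sets value if text is "inf", "infinity" or "nan"
    // (any casing), optionally preceded by '+' or '-'.
    public static bool TryParseSpecial(string text, out double value)
    {
        value = 0.0;
        if (string.IsNullOrEmpty(text))
        {
            return false;
        }

        bool negative = text[0] == '-';
        bool signed = text[0] == '+' || text[0] == '-';
        string body = signed ? text.Substring(1) : text;

        if (body.Equals("inf", StringComparison.OrdinalIgnoreCase) ||
            body.Equals("infinity", StringComparison.OrdinalIgnoreCase))
        {
            value = negative ? double.NegativeInfinity : double.PositiveInfinity;
            return true;
        }

        if (body.Equals("nan", StringComparison.OrdinalIgnoreCase))
        {
            // Assumption: any sign in front of NaN is simply ignored.
            value = double.NaN;
            return true;
        }

        return false;
    }
}
```

Using `StringComparison.OrdinalIgnoreCase` keeps the comparison independent of the current culture, which matters for a parser that is meant to read a fixed file format.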

## Valid Formats

The number section of the standard gives the following EBNF grammar for a valid number:

```
integer ::= [+-]? [0-9]+
number  ::= integer ([Ee] integer)?
          | [+-]? [0-9]* "." [0-9]+ ([Ee] integer)?
```

What’s interesting about this is that numbers with a trailing decimal point are invalid (i.e., `0.` doesn’t match the grammar). We’ll assume that’s an oversight and allow it (`System.Double.Parse` and `strtod` have no problems with it). However, we need to be careful that a lone decimal point (`.`) with no digits on either side is not parsed as a number.
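To see what the grammar accepts exactly as written (before we relax it), it can be transcribed into a regular expression. This is only a sketch for experimenting with inputs; the class and method names are made up, and the real parser need not use regular expressions at all.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class SvgNumberGrammarSketch
{
    // Direct transcription of the two alternatives of "number":
    //   integer ([Ee] integer)?
    //   [+-]? [0-9]* "." [0-9]+ ([Ee] integer)?
    // \A and \z anchor the match to the whole string.
    private static readonly Regex StrictNumber = new Regex(
        @"\A(?:[+-]?[0-9]+(?:[Ee][+-]?[0-9]+)?" +
        @"|[+-]?[0-9]*\.[0-9]+(?:[Ee][+-]?[0-9]+)?)\z",
        RegexOptions.CultureInvariant);

    // True if the whole string matches the grammar as written.
    public static bool MatchesSpecGrammar(string text)
    {
        return StrictNumber.IsMatch(text);
    }
}
```

With this, `MatchesSpecGrammar(".5")` is `true` while `MatchesSpecGrammar("0.")` and `MatchesSpecGrammar(".")` are both `false`, which shows the trailing decimal point case that we have decided to allow anyway.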

Testing this is a bit more involved, as `strtod` will parse as much of the input as it can, so while `0e++0` looks invalid, our function should be able to parse the first zero and then stop when it gets to the `e`. To test this, we therefore need to make sure that our function does not consume the whole `string`, just the first few characters.
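The "parse as much as you can" behaviour can be sketched as follows. This is a stand-in built from a regular expression and the built-in `double.Parse`, not the implementation that will follow in this series. The names are hypothetical, and unlike `strtod` the sketch does not skip leading whitespace or handle overflow.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class PrefixParseSketch
{
    // Relaxed grammar: optional sign, then either digits with an optional
    // decimal point and fraction ("0", "0.", "0.5") or a leading decimal
    // point with digits (".5"), then an optional well-formed exponent.
    private static readonly Regex NumberPrefix = new Regex(
        @"\A[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[Ee][+-]?[0-9]+)?",
        RegexOptions.CultureInvariant);

    // Parses the longest valid number at the start of text and reports
    // how many characters were used (zero if there is no number).
    public static double ParsePrefix(string text, out int consumed)
    {
        Match match = NumberPrefix.Match(text);
        if (!match.Success)
        {
            consumed = 0;
            return 0.0;
        }

        consumed = match.Length;
        return double.Parse(match.Value, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

For the input `0e++0`, the exponent part fails to match, so only the first `0` is used and `consumed` is 1. A test can then assert on `consumed` as well as on the returned value.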

## The Code

Think that’s all for now. Here are the test cases; the actual class will follow later.

Eventually, I’d like to allow for different culture settings, but for now I’m concentrating on the SVG spec.

## License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Article Copyright 2012 by Samuel Cragg