# Floating point utilities

17 Nov 2003
A set of floating point utilities

## Introduction

This is a set of floating point utilities. 16 functions are provided:

• `FloatsEqual` Tests floats for equality. When the operands of `==` and `!=` are of a floating type (float, double, or long double), testing for equality is suspect because of round-off error and the lack of exact representation of many fractions.
• `Round` Rounds a number to a specified number of digits.
• `RoundDouble` Similar to `Round()` above, but uses doubles instead of floats.
• `SigFig` Rounds a number to a specified number of significant figures.
• `FloatToText` Converts a floating point number to ASCII text (without trailing zeros).
• `CalcBase` Wraps the given number so that it remains within its base. Returns a number between 0 and base - 1. For example, if the base is 10 and the number is 10, the result wraps to 0. If the number is -1, the result is 9. This function can be used wherever a number needs to be kept within a certain range, for example angles (0 to 360) and radians (0 to TWO_PI).
• `CalcBaseFloat` Same as `CalcBase()` above, except using floats.
• `Angle` Makes sure an angle is between 0 and 359.
• `LineLength` Calculates the length of a line between two points.
• `RoundValue` Converts a floating point value to an integer and returns it, very fast.
• `FloatToInt` Converts a floating point value to an integer, very fast, storing the result through a pointer.
• `FP_INV` Approximates 1/n; this is about 2.12 times faster than using `1.0f / n`.
• `CheckRange` Makes sure Value is within range
• `CheckMin` Makes sure Value is >= Min
• `CheckMax` Makes sure Value is <= Max
• `Divide` Performs a safe division
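
The round-off concern behind `FloatsEqual` can be made concrete with a small sketch. Adjacent floats near 1.0 are about 1.2e-7 apart, so a very small absolute tolerance only accepts values that are already bit-for-bit equal at that magnitude. One common alternative is a tolerance that scales with the operands. The helper below is hypothetical (the name `NearlyEqual` and both default tolerances are assumptions, not part of the article's code):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the article's code. Compares two floats
// using a tolerance that scales with the larger magnitude, plus a small
// absolute floor so that values close to zero can still compare equal.
inline bool NearlyEqual(float a, float b,
                        float relTol = 1.0e-5f,    // assumed default
                        float absTol = 1.0e-12f)   // assumed default
{
    assert(relTol >= 0.0f && absTol >= 0.0f);
    const float diff  = std::fabs(a - b);
    const float scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(absTol, relTol * scale);
}
```

Summing `0.1f` ten times and comparing the result with `1.0f` is a typical case where such a check is more forgiving than `==`. The right tolerances depend on the application, so the defaults above are only placeholders.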

The credit for `Round()` and `RoundDouble()` goes to Josef Wolfsteiner.
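
The approach used by `Round()` and `RoundDouble()` is to scale the value so that the first discarded digit lands in the units position, add or subtract 5 there, and then truncate. A minimal standalone sketch of that idea follows; the name `RoundHalfAway` is hypothetical, and the sketch only covers `digits >= 0`:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical name. Mirrors the approach of RoundDouble(): shift the first
// discarded digit into the units position, nudge it by 5 away from zero,
// then drop everything after the kept digits.
inline double RoundHalfAway(double value, int digits)
{
    assert(digits >= 0);                       // assumption of this sketch
    const double scale = std::pow(10.0, digits);
    double shifted = value * scale * 10.0;     // one digit more than we keep
    shifted += (value < 0.0) ? -5.0 : 5.0;     // round half away from zero
    double whole = 0.0;
    std::modf(shifted / 10.0, &whole);         // truncate toward zero
    return whole / scale;
}
```

Because most decimal fractions are not exactly representable in binary, inputs that sit on a rounding boundary in decimal may round either way; the sketch behaves predictably only for values such as 2.5 or 1.25 that are exact in binary.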

## Modifications

• Simon Hughes, 18th November 2003.
• Updated `SigFig()` to check for 0.0 being passed in as the value, because `log10f(0)` does not return a finite value (it returns negative infinity)
• Added `FloatsEqual()` function
• Added `CalcBase()` function
• Added `CalcBaseFloat()` function
• Added `Angle()` function
• Added `LineLength()` function
• Modified `RoundValue()` function so it is much faster
• Added `FloatToInt()`
• Added `FP_INV` for very fast 1/n calculations
• Added `CheckRange()`, `CheckMin()`, `CheckMax()` and `Divide()` template functions
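
The `FP_INV` trick can be sketched as an ordinary function. Subtracting a float's bit pattern from twice the bit pattern of `1.0f` (0x3F800000) roughly negates its exponent, which gives a first estimate of the reciprocal; one Newton-Raphson step then refines it. The version below is a hypothetical rewrite (the name `ApproxReciprocal` is an assumption) that assumes 32-bit IEEE 754 floats and positive, finite, normal inputs, and uses `memcpy` rather than pointer casts to read the bits:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical function form of the FP_INV idea. Assumes 32-bit IEEE 754
// floats and is intended for positive, finite, normal inputs only.
inline float ApproxReciprocal(float p)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "needs 32-bit float");
    assert(std::isfinite(p) && p > 0.0f);

    std::uint32_t bits;
    std::memcpy(&bits, &p, sizeof bits);   // read the float's bit pattern

    // Twice the pattern of 1.0f minus the pattern of p roughly negates the
    // exponent, giving a coarse first estimate of 1/p.
    bits = 2u * 0x3F800000u - bits;

    float r;
    std::memcpy(&r, &bits, sizeof r);

    // One Newton-Raphson step: r = r * (2 - p * r).
    return r * (2.0f - p * r);
}
```

Under these assumptions the relative error works out to at most about 1.6 percent, so this is a trade of accuracy for speed rather than a drop-in replacement for `1.0f / n`. Whether it is still faster than a hardware divide on a given compiler and CPU would need measuring.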

```
// Testing floats for equality. When the operands of == and != are of a
// floating type (float, double, or long double), testing for equality is
// suspect because of round-off error and the lack of exact representation
// of many fractions.
// float_equality is the tolerance within which two float values are
// treated as equal. The implementation is:
//     if(fabs(a - b) <= float_equality) ...
// See FloatsEqual(a, b) function
#define float_equality 1.0e-20f
bool FloatsEqual(const float &a, const float &b);

// Rounds a number to a specified number of digits.
// Number is the number you want to round.
// Num_digits specifies the number of digits to which you want to round number.
// If num_digits is greater than 0, then number is rounded to the
// specified number of decimal places.
// If num_digits is 0, then number is rounded to the nearest integer.
// Examples
//        ROUND(2.15, 1)        equals 2.2
//        ROUND(2.149, 1)        equals 2.1
//        ROUND(-1.475, 2)    equals -1.48
float Round(const float &number, const int num_digits);
double RoundDouble(double doValue, int nPrecision);

// Rounds X to SigFigs significant figures.
// Examples
//        SigFig(1.23456, 2)        equals 1.2
//        SigFig(1.23456e-10, 2)    equals 1.2e-10
//        SigFig(1.23456, 5)        equals 1.2346
//        SigFig(1.23456e-10, 5)    equals 1.2346e-10
//        SigFig(0.000123456, 2)    equals 0.00012
float SigFig(float X, int SigFigs);

// Converts a floating point number to ascii (without the appended 0's)
// Rounds the value if nNumberOfDecimalPlaces >= 0
CString FloatToText(float n, int nNumberOfDecimalPlaces = -1);

// This function wraps the given number so that it remains within its
// base. Returns a number between 0 and base - 1.
// For example if the base given was 10 and the parameter was 10 it
// would wrap it so that the number is now a 0. If the number given
// were -1, then the result would be 9. This function can also be
// used everywhere where a number needs to be kept within a certain
// range, for example angles (0 and 360) and radians (0 to TWO_PI).
int CalcBase(const int base, int num);
// Same as CalcBase() above, except using floats
float CalcBaseFloat(const float base, float num);
// Make sure angle is between 0 and 359
int Angle(const int &angle);

// Calculates the length of a line between the following two points
float LineLength(const CPoint &point1, const CPoint &point2);

//lint -save -e*
// Converts a floating point value to an integer, very fast.
// Uses MSVC-style inline assembly, so it builds for 32-bit x86 only.
inline int RoundValue(float param)
{
    // Uses the FloatToInt functionality
    int a;
    int *int_pointer = &a;

    __asm  fld  param
    __asm  mov  edx,int_pointer
    __asm  FRNDINT
    __asm  fistp dword ptr [edx];

    return a;
}
//lint -restore

// At the assembly level the recommended workaround for the second
// FIST bug is the same as for the first:
// inserting the FRNDINT instruction immediately preceding the
// FIST instruction.
// lint -e{715}
// Converts a floating point value to an integer, very fast.
inline void FloatToInt(int *int_pointer, const float &f)
{
    __asm  fld  f
    __asm  mov  edx,int_pointer
    __asm  FRNDINT
    __asm  fistp dword ptr [edx];
}

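// Approximates r = 1/p. This is about 2.12 times faster than using 1.0f / n.
// p must be a float variable. The result is an estimate refined by one
// Newton-Raphson step, not an exact quotient.
#define FP_INV(r,p) \
{ \
    int _i = 2 * 0x3F800000 - *(int *)&(p); \
    (r) = *(float *)&_i; \
    (r) = (r) * (2.0f - (p) * (r)); \
}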

// Makes sure Var is within range
template<class T>
void CheckRange(T &Var, const T &Min, const T &Max)
{
    if(Var < Min)
        Var = Min;
    else
    if(Var > Max)
        Var = Max;
}

// Makes sure Var is >= Min
template<class T>
void CheckMin(T &Var, const T &Min)
{
    if(Var < Min)
        Var = Min;
}

// Makes sure Var is <= Max
template<class T>
void CheckMax(T &Var, const T &Max)
{
    if(Var > Max)
        Var = Max;
}

// Performs a safe division. Checks that b is not zero before division.
template<class T>
inline T Divide(const T &a, const T &b);
```
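
The listing above gives only the signature of `Divide()`, so its behavior when `b` is zero is not shown. The sketch below is one possible reading, not the author's implementation: the name `SafeDivide` and the `fallback` parameter are assumptions. `ClampToRange` is likewise a hypothetical restatement of the `CheckRange()` logic with its precondition made explicit.

```cpp
#include <cassert>

// Hypothetical completion: the article does not show what Divide() returns
// when b is zero, so this sketch takes the fallback value as a parameter.
template<class T>
inline T SafeDivide(const T &a, const T &b, const T &fallback = T())
{
    if(b == T())            // T() is zero for built-in arithmetic types
        return fallback;
    return a / b;
}

// Same clamping logic as CheckRange(), with the Min <= Max precondition
// checked in debug builds.
template<class T>
inline void ClampToRange(T &value, const T &min, const T &max)
{
    assert(!(max < min));
    if(value < min)
        value = min;
    else if(max < value)
        value = max;
}
```

For example, `SafeDivide(10, 0)` yields 0 under this sketch, and `SafeDivide(1.0f, 0.0f, -1.0f)` yields the caller's chosen fallback of -1.0f. Returning a fallback silently hides the error, so an assertion or error code may be preferable in some code bases.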

## Source code

```
// Rounds a number to a specified number of digits.
// Number is the number you want to round.
// Num_digits specifies the number of digits to which you want
// to round number.
// If num_digits is greater than 0, then number is rounded
// to the specified number of decimal places.
// If num_digits is 0, then number is rounded to the nearest integer.
// Examples
//        ROUND(2.15, 1)        equals 2.2
//        ROUND(2.149, 1)        equals 2.1
//        ROUND(-1.475, 2)    equals -1.48
float Round(const float &number, const int num_digits)
{
    float doComplete5i, doComplete5(number * powf(10.0f, (float) (num_digits + 1)));

    if(number < 0.0f)
        doComplete5 -= 5.0f;
    else
        doComplete5 += 5.0f;

    doComplete5 /= 10.0f;
    modff(doComplete5, &doComplete5i);

    return doComplete5i / powf(10.0f, (float) num_digits);
}

double RoundDouble(double doValue, int nPrecision)
{
    static const double doBase = 10.0;
    double doComplete5, doComplete5i;

    doComplete5 = doValue * pow(doBase, (double) (nPrecision + 1));

    if(doValue < 0.0)
        doComplete5 -= 5.0;
    else
        doComplete5 += 5.0;

    doComplete5 /= doBase;
    modf(doComplete5, &doComplete5i);

    return doComplete5i / pow(doBase, (double) nPrecision);
}

// Rounds X to SigFigs significant figures.
// Examples
//        SigFig(1.23456, 2)        equals 1.2
//        SigFig(1.23456e-10, 2)    equals 1.2e-10
//        SigFig(1.23456, 5)        equals 1.2346
//        SigFig(1.23456e-10, 5)    equals 1.2346e-10
//        SigFig(0.000123456, 2)    equals 0.00012
float SigFig(float X, int SigFigs)
{
    if(SigFigs < 1)
    {
        ASSERT(FALSE);
        return X;
    }

    // log10f(0) is not finite (it returns negative infinity)
    if(X == 0.0f)
        return X;

    int Sign;
    if(X < 0.0f)
        Sign = -1;
    else
        Sign = 1;

    X = fabsf(X);
    float Powers = powf(10.0f, floorf(log10f(X)) + 1.0f);

    return Sign * Round(X / Powers, SigFigs) * Powers;
}

// Converts a floating point number to ASCII (without trailing zeros)
// Rounds the value if nNumberOfDecimalPlaces >= 0
CString FloatToText(float n, int nNumberOfDecimalPlaces)
{
    CString str;

    if(nNumberOfDecimalPlaces >= 0)
    {
        int decimal, sign;
        char *buffer = _fcvt((double)n, nNumberOfDecimalPlaces, &decimal, &sign);

        CString temp(buffer);

        // Sign for +ve or -ve
        if(sign != 0)
            str = "-";

        // Copy digits up to decimal point
        if(decimal <= 0)
        {
            str += "0.";
            for(; decimal < 0; decimal++)
                str += "0";
            str += temp;
        } else {
            str += temp.Left(decimal);
            str += ".";
            str += temp.Right(temp.GetLength() - decimal);
        }
    } else {
        str.Format("%-g", n);
    }

    // Remove trailing zeros. "123.45000" becomes "123.45"
    int nFind = str.Find(".");
    if(nFind >= 0)
    {
        int nFinde = str.Find("e");    // 1.0e-010 Don't strip the ending zero
        if(nFinde < 0)
        {
            while(str.GetLength() > 1 && str.Right(1) == "0")
                str = str.Left(str.GetLength() - 1);
        }
    }

    // Remove decimal point if nothing after it. "1234." becomes "1234"
    if(str.Right(1) == ".")
        str = str.Left(str.GetLength() - 1);

    return str;
}

// Testing floats for equality. When the operands of == and != are of a
// floating type (float, double, or long double), testing for equality is
// suspect because of round-off error and the lack of exact representation
// of many fractions.
// Two float values are treated as equal when they differ by no more than
// float_equality.
bool FloatsEqual(const float &a, const float &b)
{
    return (fabs(a - b) <= float_equality);
}

// This function wraps the given number so that it remains within its
// base. Returns a number between 0 and base - 1.
// For example if the base given was 10 and the parameter was 10 it
// would wrap it so that the number is now a 0. If the number given
// were -1, then the result would be 9. This function can also be
// used everywhere where a number needs to be kept within a certain
// range, for example angles (0 and 360) and radians (0 to TWO_PI).
int CalcBase(const int base, int num)
{
    if(num >= 0 && num < base)
        return num;    // No adjustment necessary

    num %= base;       // Remainder may be negative when num is negative
    if(num < 0)
        num += base;   // Exact multiples (e.g. -10 with base 10) stay at 0

    return num;
}

// Same as CalcBase() above, except using floats
float CalcBaseFloat(const float base, float num)
{
    if(num >= 0.0f && num < base)
        return num;    // No adjustment necessary

    num = fmodf(num, base);    // Result has the same sign as num
    if(num < 0.0f)
        num += base;           // Exact multiples of base stay at 0
    return num;
}

// Make sure angle is between 0 and 359
int Angle(const int &angle)
{
    return CalcBase(360, angle);
}

// Calculates the length of a line between the following two points
float LineLength(const CPoint &point1, const CPoint &point2)
{
    const CPoint dist(point1 - point2);
    return sqrtf(float((dist.x * dist.x) + (dist.y * dist.y)));
}
```
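
Wrapping a value into the range 0 to base - 1 hinges on how `%` and `fmod` treat negative operands: the remainder takes the sign of the dividend, so it has to be shifted up by `base` only when it is strictly negative. A minimal standalone sketch of that rule, using hypothetical names (`WrapInt`, `WrapFloat`) and no MFC types:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names; same idea as CalcBase() and CalcBaseFloat().
inline int WrapInt(int base, int num)
{
    assert(base > 0);
    const int r = num % base;          // r has the sign of num, or is zero
    return (r < 0) ? r + base : r;     // shift only strictly negative values
}

inline float WrapFloat(float base, float num)
{
    assert(base > 0.0f);
    float r = std::fmod(num, base);    // r has the sign of num
    if(r < 0.0f)
        r += base;
    // r + base can round up to exactly base when r is a tiny negative value
    return (r < base) ? r : 0.0f;
}
```

With these definitions `WrapInt(10, -1)` gives 9, `WrapInt(10, -10)` gives 0, and `WrapFloat(360.0f, -90.0f)` gives 270.0f.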
