## Introduction

I enjoy Game Programming with Directx and I noticed that the most called method throughout most of my games is the standard **sqrt** method in the *Math.h* and this made me search for faster functions than the standard **sqrt**. And after some searching, I found lots of functions that were much much faster but it's always a compromise between speed and precision. The main purpose of this article is to help people choose the best square-root method that suits their program.

## Background

In this article, I compare 14 different methods for computing the square root with the standard **sqrt** function as a reference, and for each method I show its precision and speed compared to the **sqrt** method.

## What this Article is Not About

- Explaining how each method works
- New ways to compute the square root

## Using the Code

The code is simple, it basically contains:

**1. main.cpp **

Calls all the methods and for each one of them, it computes the speed and precision relative to the **sqrt **function.

**2. SquareRootmethods.h **

This Header contains the implementation of the functions, and the reference of where I got them from.

First I calculate the Speed and Precision of the **sqrt** method which will be my reference.

For computing the Speed, I measure the time it takes to call the **sqrt **function (M-1) times and I assign this value to `RefSpeed`

which will be my reference.

And for computing the Precision, I add the current result to the previous result in `RefTotalPrecision`

every time I call the method. ```
```

`RefTotalPrecisi`

on

will be my reference.

For measuring runtime duration (Speed) of the methods, I use the `CDuration`

class found in this link, and I use `dur`

as an instance of that class.

for(int j=0;j<AVG;j++)
{
dur.Start();
for(int i=1;i<M;i++)
RefTotalPrecision+=sqrt((float) i);
dur.Stop();
Temp+=dur.GetDuration();
}
RefTotalPrecision/=AVG;
Temp/=AVG;
RefSpeed=(float)(Temp)/CLOCKS_PER_SEC;

And for the other methods I do the same calculations, but in the end, I reference them to the **sqrt**.

for(int j=0;j<AVG;j++)
{
dur.Start();
for(int i=1;i<M;i++)
TotalPrecision+=sqrt1((float) i);
dur.Stop();
Temp+=dur.GetDuration();
}
TotalPrecision/=AVG;
Temp/=AVG;
Speed=(float)(Temp)/CLOCKS_PER_SEC;
cout<<"Precision = "
<<(double)(1-abs((TotalPrecision-RefTotalPrecision)/(RefTotalPrecision)))*100<<endl;

**NOTES:**

- I assume that the error in Precision whether larger or smaller than the reference is equal, that's why I use "abs".
- The Speed is referenced as the actual percentage, while the Precision is referenced as a decrease percentage.

You can modify the value of `M`

as you like, I initially assign it with 10000.

You can modify `AVG`

as well, the higher it is, the more accurate the results.

#define M 10000
#define AVG 10

## Points of Interest

Precision wise, the **sqrt** standard method is the best. But the other functions can be much faster even 5 times faster. I would personally choose Method N# 14 as it has high precision and high speed, but I'll leave it for you to choose. :)

I took 5 samples and averaged them and here is the output:

**According to the analysis the above Methods Performance Ranks (Speed x Precision) is:**

**NOTE:** The performance of these methods depends highly on your processor and may change from one computer to another.

## The METHODS

### Sqrt1

**Reference: **http://ilab.usc.edu/wiki/index.php/Fast_Square_Root

**Algorithm: **Babylonian Method + some manipulations on IEEE 32 bit floating point representation

float sqrt1(const float x)
{
union
{
int i;
float x;
} u;
u.x = x;
u.i = (1<<29) + (u.i >> 1) - (1<<22);
u.x = u.x + x/u.x;
u.x = 0.25f*u.x + x/u.x;
return u.x;
}

### Sqrt2

**Reference:** http://ilab.usc.edu/wiki/index.php/Fast_Square_Root

**Algorithm:** The Magic Number (Quake 3)

#define SQRT_MAGIC_F 0x5f3759df
float sqrt2(const float x)
{
const float xhalf = 0.5f*x;
union {
float x;
int i;
} u;
u.x = x;
u.i = SQRT_MAGIC_F - (u.i >> 1); return x*u.x*(1.5f - xhalf*u.x*u.x);}

### Sqrt3

**Reference:** http://ilab.usc.edu/wiki/index.php/Fast_Square_Root

**Algorithm:** Log base 2 approximation and Newton's Method

float sqrt3(const float x)
{
union
{
int i;
float x;
} u;
u.x = x;
u.i = (1<<29) + (u.i >> 1) - (1<<22);
return u.x;
}

### Sqrt4

**Reference:** I got it a long time a go from a forum and I forgot, please contact me if you know its reference.

Algorithm: Bakhsali Approximation

float sqrt4(const float m)
{
int i=0;
while( (i*i) <= m )
i++;
i--;
float d = m - i*i;
float p=d/(2*i);
float a=i+p;
return a-(p*p)/(2*a);
}

### Sqrt5

**Reference: http://www.dreamincode.net/code/snippet244.htm**** **

**Algorithm:** Babylonian Method

float sqrt5(const float m)
{
float i=0;
float x1,x2;
while( (i*i) <= m )
i+=0.1f;
x1=i;
for(int j=0;j<10;j++)
{
x2=m;
x2/=x1;
x2+=x1;
x2/=2;
x1=x2;
}
return x2;
}

### Sqrt6

**Reference:** http://www.azillionmonkeys.com/qed/sqroot.html#calcmeth

**Algorithm:** Dependant on IEEE representation and only works for 32 bits

double sqrt6 (double y)
{
double x, z, tempf;
unsigned long *tfptr = ((unsigned long *)&tempf) + 1;
tempf = y;
*tfptr = (0xbfcdd90a - *tfptr)>>1;
x = tempf;
z = y*0.5;
x = (1.5*x) - (x*x)*(x*z); x = (1.5*x) - (x*x)*(x*z);
return x*y;
}

### Sqrt7

**Reference:** http://bits.stephan-brumme.com/squareRoot.html

**Algorithm:** Dependant on IEEE representation and only works for 32 bits

float sqrt7(float x)
{
unsigned int i = *(unsigned int*) &x;
i += 127 << 23;
i >>= 1;
return *(float*) &i;
}

### Sqrt8

**Reference:** http://forums.techarena.in/software-development/1290144.htm

**Algorithm:** **Babylonian Method **

double sqrt9( const double fg)
{
double n = fg / 2.0;
double lstX = 0.0;
while(n != lstX)
{
lstX = n;
n = (n + fg/n) / 2.0;
}
return n;
}

### Sqrt9

**Reference:** http://www.functionx.com/cpp/examples/squareroot.htm

**Algorithm: **Babylonian Method

double Abs(double Nbr)
{
if( Nbr >= 0 )
return Nbr;
else
return -Nbr;
}
double sqrt10(double Nbr)
{
double Number = Nbr / 2;
const double Tolerance = 1.0e-7;
do
{
Number = (Number + Nbr / Number) / 2;
}while( Abs(Number * Number - Nbr) > Tolerance);
return Number;
}

### Sqrt10

**Reference:** http://www.cs.uni.edu/~jacobson/C++/newton.html

**Algorithm:** Newton's Approximation Method

double sqrt11(const double number)e
{
const double ACCURACY=0.001;
double lower, upper, guess;
if (number < 1)
{
lower = number;
upper = 1;
}
else
{
lower = 1;
upper = number;
}
while ((upper-lower) > ACCURACY)
{
guess = (lower + upper)/2;
if(guess*guess > number)
upper =guess;
else
lower = guess;
}
return (lower + upper)/2;
}

### Sqrt11

**Reference:** http://www.drdobbs.com/184409869;jsessionid=AIDFL0EBECDYLQE1GHOSKH4ATMY32JVN

**Algorithm:** Newton's Approximation Method

double sqrt12( unsigned long N )
{
double n, p, low, high;
if( 2 > N )
return( N );
low = 0;
high = N;
while( high > low + 1 )
{
n = (high + low) / 2;
p = n * n;
if( N < p )
high = n;
else if( N > p )
low = n;
else
break;
}
return( N == p ? n : low );
}

### Sqrt12

**Reference:** http://cjjscript.q8ieng.com/?p=32

**Algorithm:** **Babylonian Method **

double sqrt13( int n )
{
double a = (double) n;
double x = 1;
for( int i = 0; i < n; i++)
{
x = 0.5 * ( x+a / x );
}
return x;
}

### Sqrt13

**Reference:** N/A

**Algorithm:** Assembly fsqrt

double sqrt13(double n)
{
__asm{
fld n
fsqrt
}
}

### Sqrt14

**Reference:** N/A

**Algorithm:** Assembly fsqrt 2

double inline __declspec (naked) __fastcall sqrt14(double n)
{
_asm fld qword ptr [esp+4]
_asm fsqrt
_asm ret 8
}

## History

## 1.3 (15 September 2010)

- Added Method N#14 (which is the best method till now)
- Added modified source code

## 1.2 (24 June 2010)

- Added Method N#13
- Added the Methods Performance Rank
- Added modified source code

## 1.1 (3 April 2010)

- Added Precision Timer instead of clock because it's more precise
- Added the average feature

## 1.0 (31 March 2010)

I hope that this article would at least slightly help those who are interested in this issue.