|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Announcements
Chapters
Services
Feature Zones
|
IntroductionSome time ago I read an very interesting story about the Fastest way to format time. In this story a remarkable performance boost was created by optimizing integer divisions. This inspiring optimization was proposed by Jeffrey Sax (thanx) and it is based upon the following equivalence for a certain range of x = num / div; (1)
equals x = (num * mul) >> shift; (2)
From the story I got the idea to make a simple application that finds the values for How it worksThe division (1) is replaced by a multiply/shift (2) pair which is in fact an approximation of the original division. The optimization is based upon the fact that the multiply and shift operators are far faster than the division operator. Depending on the range of values (0..max) the variable The First the function checks if max / div = max * mul >> shift (3)
equals 1 / div = mul >> shift (4)
equals mul = (1<< shift) / div (5)
It is clear that the I think it is possible to calculate valid values for Based upon the same equivalence the performance of the modulo function, %, can also be improved. The modulo or remainder function (6) can be replaced with formula (7) that contains a division operation, and this can be rewritten to formula (8) based upon x = num % div; (6)
equals x = num - (num/div) * div; (7)
equals x = num - ((num * mul) >> shift) * div; (8)
The order of the expression (8) is important as in this order it reduces the chance of an overflow exception. Note that this expression is only valid for positive values of When to useThe optimization can be used in any application that implements a lot of integer divisions or modulos and in which the range of the numerator is known. Think of graphic libraries, processing very large (database)tables, averaging samples of real time sensors, Math libraries, webservices etc. I can also imagine that a programmer gives hints about the range of the variables passed to a function. The compiler can use this information to optimize the resulting code beyond current levels. In pseudocode it could look like below. The code [OPTIMIZE celcius={-273..10000}]
public int CelciusToFahrenheit(int celcius)
{
return (celcius - 32)/9 * 5;
}
could become to something like: [OPTIMIZED celcius={-273..10000}]
public int CelciusToFahrenheit(int celcius)
{
// @ORG: return (celcius - 32)/9 * 5;
int x = celcius - 32; // use register for x
if (x >=0 )
return (x * 3641 >> 15) * 5;
else
return -((-x) * 3641 >> 15) * 5;
}
Another way to use this optimizer is as a "Math Optimizer Plugin" for the programming environment. Such a plugin could include a whole range of 'tricks' like the bithacks collected by Sean Anderson. PerformanceThe src application makes some simple performance measurements. To see the influence of the debugger I ran the test both in debug and in release mode(.NET 2.0, 1.6 Ghz PC).
Because I use both negative and positive test values there is a 'complex' if then else structure in the test program to choose the right formula, especially for the modulo operator. Normally the denominator is fixed and the code can be much simpler of course. The last test of the application does a test for positive numbers only. Working with only positive numbers did also show improvements in the debug mode. Finally in the release code the overhead of handling the signs yourself seems to have no substantial influence. As the scheduling of the OS influences the measurements there will be differences when the test is run again. Performance gains will also differ on different hardware and for different programming languages. Just measure your own. FinaleIn short a substantial gain can be made by replacing integer divisions by a multiply/shift pair. Be aware that in the debug mode the optimized code can be slower, especially if you have both negative and positive numbers and you have to handle the sign yourself. Some items to improve the FindMulShift algorithm include:
History
Usage rightsEverybody is granted to use this code (at own risk) as long as you refer to the original work of Jeffrey Sax and this article.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||