Math without a Floating Point Unit

Joel Ivory Johnson, 26 Jan 2010, CPOL

The everyday development that I do targets desktop processors. These processors are more than capable of meeting the computational demands and performance requirements of my work, so I rarely have to think about how long a specific mathematical operation takes to execute. But while that's the general case, it isn't always the case.

When I was finishing grad school, one of my projects dealt with machine vision and automatic classification of photographs. I understood the algorithms I needed to implement and how everything was going to fit together. During development of the individual components, I used small datasets that were sufficient to show that the components were working as designed. It wasn't until I had brought all of the components together that I gave the system full-sized images to process (around 2 megapixels). I knew that a full-sized image would take longer to process, but processing of the image took around three hours! In researching the cause of the slowness, I found that some of the functions I was trying to execute were not natively supported by the machine's processor and were being emulated. Some of the math operations took 50 to 100 times longer than the native operations. The program's running time was constrained because I had to demonstrate it in class during a presentation. To stay within that constraint, I replaced the more expensive mathematical operations with lookup tables and changed the program from merely being multithreaded to taking full advantage of multiprocessing.
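
The lookup-table idea can be sketched as follows. This is only an illustration, not the code from that project: the operation being tabulated (a linear contrast stretch on 8-bit pixel values) and the names `build_stretch_lut` and `apply_stretch` are assumptions made for the example. The point is that the expensive floating point division runs 256 times while the table is built, instead of once for every pixel in a 2-megapixel image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LUT_SIZE 256              /* one entry per possible 8-bit pixel value */

static uint8_t stretch_lut[LUT_SIZE];
static int lut_ready = 0;

/* Hypothetical example operation: stretch the range [lo, hi] to [0, 255].
 * The floating point multiply and divide happen here, once per table entry. */
void build_stretch_lut(uint8_t lo, uint8_t hi)
{
    assert(lo < hi);
    for (int v = 0; v < LUT_SIZE; ++v) {
        double scaled;
        if (v <= lo)
            scaled = 0.0;
        else if (v >= hi)
            scaled = 255.0;
        else
            scaled = (v - lo) * 255.0 / (hi - lo);
        stretch_lut[v] = (uint8_t)(scaled + 0.5);   /* round to nearest */
    }
    lut_ready = 1;
}

/* Per-pixel work is now a single array read, with no floating point at all. */
void apply_stretch(uint8_t *pixels, size_t count)
{
    assert(lut_ready);
    for (size_t i = 0; i < count; ++i)
        pixels[i] = stretch_lut[pixels[i]];
}
```

Calling `build_stretch_lut(50, 150)` once and then `apply_stretch` on each image buffer keeps the inner loop to integer loads and stores. The trade-off is that a table is only practical when the input domain is small, as it is for 8-bit pixel values.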

In the past few weeks, I was reminded of the experience after getting a couple of e-mails from other developers who were trying to figure out why their programs were performing so poorly. Both developers were creating programs that performed graphical processing and both were targeting Windows Mobile devices (which use ARM processors). ARM processors are available with a broad range of performance characteristics. On the lower end, the processors only support integer operations, have no divide instruction, and typically run at around 200 MHz. On the upper end, the processors may have hardware implementations of floating point operations (including a divide instruction) and built-in 3D graphics accelerators, and may run at up to 1 GHz. Both of the developers were testing their programs on devices that had no native floating point support and no divide instruction, so these operations were being emulated. That information alone was enough to answer their questions. But I decided to do a few measurements.

I dug up every ARM-based device running a Microsoft OS that I could find. I have several Windows Mobile devices in my possession, from PocketPC 2002 devices to a newly released device that will soon have Windows Mobile 6.5. The Microsoft Zune is also ARM-based, so I included it in my test. I also have remote access to a few newly released devices. For the test, I had each device perform a million additions, subtractions, multiplications, and divisions on both integers and double-precision floating point numbers. For the Windows Mobile devices, I did this using native (C-language) programs since the .NET Compact Framework does not support floating point operations. For the Zunes, I used .NET through the XNA framework; the Zune version of the .NET framework supports floating point operations. Because the devices span a broad range of clock frequencies, I used each device's time to perform a million additions as its base measurement. My findings were fairly consistent. In general, the devices that had no floating point support took about 30 times longer to perform a division on a double-precision number than the integer addition. Devices that had floating point support took about 2 times longer to perform a double-precision division than addition.
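
The measurement described above can be sketched in standard C. The original test programs are not shown in this post, so everything here is an assumption made for illustration: the function names, the operand values, and the use of `clock()` from `<time.h>` as the timer (a device build may need a platform-specific timer instead). Only two of the eight operations are shown; the others follow the same pattern.

```c
#include <time.h>

#define ITERATIONS 1000000L

/* volatile stops the compiler from folding the arithmetic into a constant
 * or deleting the loops outright. */
static volatile int    int_sink;
static volatile double dbl_sink;

/* Seconds of CPU time for a million integer additions (the baseline). */
double time_int_add(void)
{
    volatile int a = 12345, b = 678;
    clock_t start = clock();
    for (long i = 0; i < ITERATIONS; ++i)
        int_sink = a + b;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Seconds of CPU time for a million double-precision divisions. */
double time_double_div(void)
{
    volatile double a = 12345.678, b = 9.87;
    clock_t start = clock();
    for (long i = 0; i < ITERATIONS; ++i)
        dbl_sink = a / b;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Cost of an operation relative to the baseline on the same device.
 * Returns 0 if the baseline was too fast for the timer to register. */
double relative_cost(double op_seconds, double baseline_seconds)
{
    return baseline_seconds > 0.0 ? op_seconds / baseline_seconds : 0.0;
}
```

Dividing by the addition time on the same device is what makes results comparable between a 200 MHz device and a 1 GHz one. Note that each figure includes loop and memory overhead, so the ratios from a sketch like this are rough indications rather than exact instruction costs.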

The algorithms that both developers were implementing made heavy use of floating point operations. Their expectations about how the programs would perform on a mobile device had been carried over from experience with desktops. Desktop processors have a more complete set of hardware-implemented math operations than mobile processors. Both developers also had the same device, which had no floating point support. So it comes as no surprise that the algorithms were running so slowly. So what could they do about it? There's no universally satisfying substitute for capable hardware, so the exact solution is going to depend on what one is trying to accomplish. For one developer, an acceptable solution was to use a different algorithm that produced acceptable results and had a much lower computational demand. For the other developer, compromises on the result were not as acceptable, so the hardware requirements for his software were revised to identify the hardware it needs.
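
Neither developer's replacement algorithm is shown here. As one illustration of how computational demand can be lowered on an integer-only processor, the sketch below uses fixed-point arithmetic, a different technique from anything described above: fractional values are stored as scaled integers so that multiplication needs only integer instructions. The Q16.16 layout and the `fix16` names are assumptions made for this example.

```c
#include <stdint.h>

/* Q16.16 fixed point: the upper 16 bits hold the integer part,
 * the lower 16 bits hold the fraction. */
typedef int32_t fix16;

#define FIX16_ONE ((fix16)65536)          /* the value 1.0 */

/* Valid for n in [-32768, 32767]; larger magnitudes overflow. */
fix16 fix16_from_int(int n)
{
    return (fix16)(n * FIX16_ONE);
}

/* Multiply in 64 bits, then shift away the extra 16 fraction bits.
 * Right-shifting a negative value is implementation-defined in C;
 * common compilers perform an arithmetic shift. */
fix16 fix16_mul(fix16 a, fix16 b)
{
    return (fix16)(((int64_t)a * b) >> 16);
}

/* Drop the fraction (rounds toward negative infinity with an
 * arithmetic shift). */
int fix16_to_int(fix16 x)
{
    return (int)(x >> 16);
}
```

For example, `fix16_mul(fix16_from_int(3), FIX16_ONE / 2)` multiplies 3.0 by 0.5 using a single integer multiply and a shift. The cost is reduced range and precision compared with a double, which is the kind of compromise on results that one developer could accept and the other could not.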

The experience from school and the experience of the two developers underscore the importance of making sure that a solution's hardware and implementation are in sync with each other.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Joel Ivory Johnson
Software Developer, Razorfish
United States
I attended Southern Polytechnic State University and earned a Bachelor of Science in Computer Science and later returned to earn a Master of Science in Software Engineering.
 
For the past few years I've been providing solutions to clients using Microsoft technologies for web and Windows applications.
 
While most of my CodeProject.com articles are centered around Windows Phone, it is only one of the areas in which I work and one of my interests. I also have an interest in mobile development on Android and iPhone. Professionally, I work with several Microsoft technologies including SQL Server technologies, Silverlight/WPF, ASP.NET and others. My recreational development interests are centered around Artificial Intelligence, especially in the area of machine vision.
 
Twitter: @J2iNet

Last Updated 26 Jan 2010
Article Copyright 2010 by Joel Ivory Johnson