I am required to write code that multiplies two floating point numbers in binary format. The code works fine when I give integer input, but doesn't give the correct result for floating point input.
 
When I give an input for example:
 
3.2 and 12.2
 

I am getting this answer:
 
1010011100010000
 
but I must get this
 
1001110000101000
 
Here is the link to my code:
 
http://ideone.com/q8ned7[^]
 
Kindly help me. It's a matter of my exams.
Posted 12-Jan-13 23:42pm
Comments
Richard MacCutchan at 13-Jan-13 6:12am
   
You need to provide considerably more detail than this. People are not going to follow external links just to try and figure out what your code is supposed to be doing.

Solution 3

In my first solution to your question I took your question too literally: you asked for a floating point solution.
Reading your expected result now shows that you are in fact talking about a fixed-point problem.
 
A fixed-point approach places the separation between the integral part and the fraction part at a fixed position within the binary format.
 
E.g. with 32 bit format, 22 integral bits and 10 fraction bits.
12.2 = 1100.0011001100
 3.2 =   11.0011001100
In fixed-point arithmetic, you can multiply as plain unsigned integers and shift the result back by the proper number of fraction bits. To get a more accurate result, you may add two additional rounding bits, resulting in 10 + 2 fraction bits, where only the first 10 bits are kept and the last two are used for rounding.
 
E.g. with 32 bit format, 20 integral bits and 10+2 fraction bits.
12.2 = 1100.0011001100[11]
 3.2 =   11.0011001100[11]
 
It can be viewed as scaling: in this example you have 10+2 fraction bits, hence scaled by 2^(10+2) = 4096.
12.2 x 4096 = 49971.2 --> 49971
 3.2 x 4096 = 13107.2 --> 13107
 
49971 x 13107 = 654969897
 
654969897 / 4096 = 159904.76 --> 159904
159904 / 4096 = 39.0390625 --> 39.04
 
159904 --> 100111.0000101000[00]
 
Algorithm:
1) scale the decimal numbers by 4096 and store them as integral bit patterns
2) multiply the integral bit patterns as integers
3) divide the result by 4096 and take the resulting bit pattern as the fixed-point number
4) for printing: leave out the two rounding bits
 
E.g.
   1)    12.2 -->               1100.0011001100[11]
          3.2 -->                 11.0011001100[11]
   2)    mult --> 100111000010100000.[110000101001]
   3)   scale -->             100111.0000101000[00]
   4)   print --> 39.0390625 --> 39.04
 
To avoid overflow, you should of course merge the multiply and scale-back steps (steps 2 and 3 above) into one operation, taking care to throw away the lower 10+2 fraction bits of the result while multiplying.
 
Cheers
Andi
Comments
nv3 at 15-Jan-13 3:28am
   
It was probably more work to figure out what OP wanted than to give the right answer. You did a great job finding that out! +5 (and OP even accepted the wrong answer!!)
Andreas Gieriet at 15-Jan-13 4:14am
   
Thanks for your 5!
He did not accept it explicitly. It seems that if a solution gets a certain number of votes, it gets this light green color. An explicitly accepted solution is plain green.
Anyway, I could not resist revisiting and giving an alternative answer ;-)
Cheers
Andi
nv3 at 15-Jan-13 4:23am
   
Ah, I see. The faint green bar marks answers that got three or more positive ratings.

Solution 1

I assume you are talking about IEEE 754[^] floating point numbers.
Such a number has
  • sign
  • mantissa
  • exponent (which is shifted by a bias)
  • some special values that indicate +/- infinity, not-a-number (NaN)
 
Leaving aside the special cases, you must know
  1. how the sign, mantissa, and exponent are stored in bits[^]
  2. what normalizing[^] means
  3. how to multiply
    • r.Sign = a.Sign ^ b.Sign
    • r.Mantissa = a.Mantissa * b.Mantissa
    • r.Exponent = FLOAT_BIAS + (a.Exponent - FLOAT_BIAS) + (b.Exponent - FLOAT_BIAS)
    • r.Normalize()
This as such is not robust enough at the boundaries of the value range: before normalizing, you might already get an overflow or underflow. But the principle is still right.
 
For technical purposes, use union and bitfields:
  • one aspect of the union is the unsigned integer
  • the other is the bitfields that denote the sign, mantissa, and exponent
e.g. for single precision[^] (float):
const int S_BITS = 1;
const int E_BITS = 8;
const int M_BITS = 23;
union Float
{
   unsigned int raw; // use a 32-bit type; unsigned long may be 64 bits
   struct {
       unsigned int mantissa : M_BITS;
       unsigned int exponent : E_BITS;
       unsigned int sign     : S_BITS;
   } bits;
};
This memory layout depends on the computer architecture on which you run your program. E.g. see MSDN: C++ bit fields[^]
 
Good luck!
 
Cheers
Andi

Solution 2

Call me paranoid, but I did not follow your link (sorry).
 
By "floating point", I assume you are talking about an arbitrary number of digits in "xxx.xxx" format and not the floating point values stored in float and double values, which are handled by machine hardware.
 
The first thing you need to do is break your number down into binary; after that do the math on the result. Digits ("bits") to the left of the decimal point (sic - actually binary point) follow the same rules as for translating integers to binary. Bits to the right of the binary point are what will make your head hurt, and correspond to 0.5, 0.25, 0.125... etc.
 
Understanding the previous paragraph should solve the problem for you.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Last Updated 14 Jan 2013
Copyright © CodeProject, 1999-2014