See more: C++, binary
I am required to write code that multiplies two floating-point numbers in binary format. The code works fine when I give integer input, but it doesn't give the correct result for floating-point input.
 
When I give an input for example:
 
3.2 and 12.2
 

I am getting this answer:
 
1010011100010000
 
but I must get this
 
1001110000101000
 
Here is the link to my code:
 
http://ideone.com/q8ned7
 
Kindly help me. It's a matter of my exams.
Posted 12 Jan '13 - 22:42

Comments
Richard MacCutchan - 13 Jan '13 - 6:12
You need to provide considerably more detail than this. People are not going to follow external links just to try and figure out what your code is supposed to be doing.

3 solutions

In my first solution to your question I took your question too literally: you asked for a floating-point solution.
Your expected result shows that you are in fact describing a fixed-point problem.
 
A fixed-point approach places the separation between the integral part and the fractional part at a fixed position within the binary format.
 
E.g. with a 32-bit format, 22 integral bits, and 10 fraction bits:
12.2 = 1100.0011001100
 3.2 =   11.0011001100
In fixed-point arithmetic, you can multiply the values as plain unsigned integers and then shift the result right by the number of fractional bits. For a more accurate result, you may add two additional rounding bits, giving 10 + 2 fraction bits: only the first 10 bits are reported and the last two are used for rounding.
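
The multiply-and-shift idea can be sketched as follows. This is a minimal illustration, not the asker's code: the names `toFixed`, `multiplyFixed`, and `toDouble` are hypothetical, inputs are assumed non-negative, and the product goes through a 64-bit intermediate because it carries 20 fraction bits before the shift.

```cpp
#include <cassert>
#include <cstdint>

const int FRACTION_BITS = 10;  // assumed split: 22 integral bits, 10 fraction bits

// Scale a non-negative value by 2^10 and truncate to an integer bit pattern.
std::uint32_t toFixed(double value) {
    assert(value >= 0.0);
    return static_cast<std::uint32_t>(value * (1u << FRACTION_BITS));
}

// Multiply as plain unsigned integers, then shift the product back.
// The raw product has 10 + 10 fraction bits, hence the 64-bit intermediate.
std::uint32_t multiplyFixed(std::uint32_t a, std::uint32_t b) {
    std::uint64_t product = static_cast<std::uint64_t>(a) * b;
    return static_cast<std::uint32_t>(product >> FRACTION_BITS);
}

// Convert a bit pattern back to a decimal value for display.
double toDouble(std::uint32_t raw) {
    return static_cast<double>(raw) / (1u << FRACTION_BITS);
}
```

With 12.2 and 3.2 this gives the pattern 39964, i.e. 39.02734375, which is visibly below the exact 39.04. The error comes from truncating both inputs to 10 fraction bits, which is what the extra rounding bits are meant to reduce.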
 
E.g. with a 32-bit format, 20 integral bits, and 10+2 fraction bits:
12.2 = 1100.0011001100[11]
 3.2 =   11.0011001100[11]
 
It can be viewed as scaling: in this example you have 10+2 fractional bits, hence the values are scaled by 2^(10+2) = 4096.
12.2 x 4096 = 49971.2 --> 49971
 3.2 x 4096 = 13107.2 --> 13107
 
49971 x 13107 = 654969897
 
654969897 / 4096 = 159904.76 --> 159904
159904 / 4096 = 39.0390625 --> 39.04
 
159904 --> 100111.0000101000[00]
 
Algorithm:
1) scale the decimal numbers by 4096 and store them as integer bit patterns
2) multiply the bit patterns as integers
3) divide the result by 4096 and take the resulting bit pattern as the fixed-point number
4) for printing: drop the two rounding bits
 
E.g.
   1)    12.2 -->               1100.0011001100[11]
          3.2 -->                 11.0011001100[11]
   2)    mult --> 100111000010100000.[110000101001]
   3)   scale -->             100111.0000101000[00]
   4)   print --> 39.0390625 --> 39.04
 
To avoid overflow, you should of course merge the multiply and the scale-back (steps 2 and 3 above) into one operation, taking care to discard the lower 10+2 fraction bits of the result while multiplying.
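
The four steps can be sketched in C++ as below. This is one possible arrangement, not a definitive implementation: the function names are hypothetical, inputs are assumed non-negative, and the "merge" of steps 2 and 3 is done here simply by holding the product in a 64-bit intermediate before shifting.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

const int REPORTED_BITS = 10;                             // fraction bits that get printed
const int ROUNDING_BITS = 2;                              // extra bits kept only for accuracy
const int FRACTION_BITS = REPORTED_BITS + ROUNDING_BITS;  // scale = 2^12 = 4096

// Step 1: scale by 4096 and truncate to an integer bit pattern.
std::uint32_t toFixed(double value) {
    assert(value >= 0.0);
    return static_cast<std::uint32_t>(value * (1u << FRACTION_BITS));
}

// Steps 2 and 3 merged: multiply in 64 bits, then discard the lower 12 bits.
std::uint32_t multiplyFixed(std::uint32_t a, std::uint32_t b) {
    std::uint64_t product = static_cast<std::uint64_t>(a) * b;
    return static_cast<std::uint32_t>(product >> FRACTION_BITS);
}

// Step 4: print integral bits, a binary point, and the 10 reported fraction bits.
std::string toBinaryString(std::uint32_t raw) {
    std::string out;
    std::uint32_t integral = raw >> FRACTION_BITS;
    if (integral == 0) out = "0";
    for (; integral != 0; integral >>= 1)
        out.insert(out.begin(), static_cast<char>('0' + (integral & 1u)));
    out += '.';
    for (int bit = FRACTION_BITS - 1; bit >= ROUNDING_BITS; --bit)
        out += static_cast<char>('0' + ((raw >> bit) & 1u));
    return out;
}
```

For 12.2 and 3.2, `toBinaryString(multiplyFixed(toFixed(12.2), toFixed(3.2)))` yields `100111.0000101000`, which is the expected bit string from the question with the binary point made explicit.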
 
Cheers
Andi
Comments
nv3 - 15 Jan '13 - 3:28
It was probably more work to figure out what OP wanted than to give the right answer. You did a great job finding that out! +5 (and OP even accepted the wrong answer!!)
Andreas Gieriet - 15 Jan '13 - 4:14
Thanks for your 5! He did not accept explicitly. It looks like if there is a certain number of votes for a solution, then that solution gets this light green color. An explicitly accepted solution is plain green. Anyways, I could not resist to re-visit and get an alternative answer ;-) Cheers Andi
nv3 - 15 Jan '13 - 4:23
Ah, I see. The faint green bar marks answers that got three or more positive ratings.
I assume you are talking about IEEE 754 floating-point numbers.
Such a number has
  • sign
  • mantissa
  • exponent (which is shifted by a bias)
  • some special values that indicate +/- infinity, not-a-number (NaN)
 
Leaving the special cases aside, you must know
  1. how the sign, mantissa, and exponent are stored in bits
  2. what normalizing means
  3. how to multiply
    • r.Sign = a.Sign ^ b.Sign
    • r.Mantissa = a.Mantissa * b.Mantissa (with the implicit leading 1 of normalized numbers restored before multiplying)
    • r.Exponent = FLOAT_BIAS + (a.Exponent - FLOAT_BIAS) + (b.Exponent - FLOAT_BIAS)
    • r.Normalize()
As written, this is not robust at the boundaries of the value range: even before normalizing, you may already have an overflow or underflow. But the principle is still right.
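
The multiplication steps above might look like this for single precision. It is a deliberately limited sketch: `multiplyFloatBits` is a hypothetical name, it assumes 32-bit IEEE 754 floats, normal (non-zero, finite, non-denormal) inputs, and a result whose exponent stays in range, and it truncates the mantissa instead of rounding to nearest as the hardware does.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

float multiplyFloatBits(float a, float b) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    assert(std::isnormal(a) && std::isnormal(b));  // special cases are not handled

    std::uint32_t ua, ub;
    std::memcpy(&ua, &a, sizeof ua);  // copy out the raw bit patterns
    std::memcpy(&ub, &b, sizeof ub);

    const int FLOAT_BIAS = 127;
    std::uint32_t sign = (ua ^ ub) & 0x80000000u;        // r.Sign = a.Sign ^ b.Sign
    int expA = static_cast<int>((ua >> 23) & 0xFFu);
    int expB = static_cast<int>((ub >> 23) & 0xFFu);
    std::uint64_t manA = (ua & 0x7FFFFFu) | 0x800000u;   // restore the implicit leading 1
    std::uint64_t manB = (ub & 0x7FFFFFu) | 0x800000u;

    int exp = FLOAT_BIAS + (expA - FLOAT_BIAS) + (expB - FLOAT_BIAS);
    std::uint64_t product = manA * manB;  // value in [1, 4) with 46 fraction bits

    // Normalize: if the product is >= 2, shift one bit further and bump the exponent.
    if (product & (1ULL << 47)) {
        product >>= 24;
        ++exp;
    } else {
        product >>= 23;
    }
    assert(exp > 0 && exp < 255);  // overflow and underflow are out of scope here

    std::uint32_t ur = sign
                     | (static_cast<std::uint32_t>(exp) << 23)
                     | (static_cast<std::uint32_t>(product) & 0x7FFFFFu);
    float r;
    std::memcpy(&r, &ur, sizeof r);
    return r;
}
```

Because of the truncation, results can differ from the built-in `*` in the last mantissa bit for inputs such as 3.2f and 12.2f; products that are exactly representable, such as 1.5f * 2.5f, come out identical.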
 
For the implementation, use a union with bit-fields:
  • one member of the union is the unsigned integer
  • the other is the bit-fields that denote the sign, mantissa, and exponent
e.g. for single precision (float):
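#include <cstdint>

const int S_BITS = 1;   // sign bit
const int E_BITS = 8;   // biased exponent
const int M_BITS = 23;  // mantissa without the implicit leading 1

union Float
{
   std::uint32_t raw;   // whole 32-bit pattern (unsigned long may be 64 bits wide)
   struct {
       // order assumes bit-fields are allocated from the least significant bit upward
       std::uint32_t mantissa : M_BITS;
       std::uint32_t exponent : E_BITS;
       std::uint32_t sign     : S_BITS;
   } bits;
};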
The memory layout of bit-fields depends on the compiler and the computer architecture on which you run your program. E.g. see MSDN: C++ bit fields
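
Since the bit-field layout is not portable, a layout-independent alternative is to copy the bits into an integer and pick the fields apart with shifts and masks. The sketch below swaps in that technique for the union; `FloatFields` and `decode` are hypothetical names, and a 32-bit IEEE 754 `float` is assumed.

```cpp
#include <cstdint>
#include <cstring>

struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, still biased by 127
    std::uint32_t mantissa;  // 23 bits, implicit leading 1 not included
};

FloatFields decode(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    std::uint32_t raw;
    std::memcpy(&raw, &value, sizeof raw);  // well-defined way to read the bit pattern

    FloatFields f;
    f.sign     = raw >> 31;
    f.exponent = (raw >> 23) & 0xFFu;
    f.mantissa = raw & 0x7FFFFFu;
    return f;
}
```

For 12.2f this gives sign 0, biased exponent 130 (that is, 2^3), and mantissa 0x433333.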
 
Good luck!
 
Cheers
Andi
Call me paranoid but I did not follow your link (sorry).
 
By "floating point", I assume you are talking about an arbitrary number of digits in "xxx.xxx" format and not the floating point values stored in float and double values, which are handled by machine hardware.
 
The first thing you need to do is break your number down into binary; after that do the math on the result. Digits ("bits") to the left of the decimal point (sic - actually binary point) follow the same rules as for translating integers to binary. Bits to the right of the binary point are what will make your head hurt, and correspond to 0.5, 0.25, 0.125... etc.
 
Understanding the previous paragraph should solve the problem for you.
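
As an illustration of that breakdown, the sketch below converts a non-negative decimal value to a binary string: the integral part by repeated division by 2, the fractional part by repeated doubling, where each doubling yields the next bit (0.5, 0.25, 0.125, ...). The function name `toBinary` and the fixed number of fraction bits are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

std::string toBinary(double value, int fractionBits) {
    assert(value >= 0.0 && fractionBits >= 0);
    unsigned long long integral = static_cast<unsigned long long>(value);
    double fraction = value - static_cast<double>(integral);

    // Integral part: same rule as for integers, least significant bit first.
    std::string out;
    if (integral == 0) out = "0";
    for (; integral != 0; integral /= 2)
        out.insert(out.begin(), static_cast<char>('0' + integral % 2));

    out += '.';

    // Fractional part: doubling moves the next bit in front of the binary point.
    for (int i = 0; i < fractionBits; ++i) {
        fraction *= 2.0;
        if (fraction >= 1.0) {
            out += '1';
            fraction -= 1.0;
        } else {
            out += '0';
        }
    }
    return out;
}
```

For example, `toBinary(3.2, 10)` returns `11.0011001100` and `toBinary(12.2, 10)` returns `1100.0011001100`; the repeating `0011` pattern shows that 0.2 has no finite binary representation.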

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
