Hi, in my C++ code I have two double variables with the values .4 and .1.


When I subtract .1 from .4, the value I get is .30000000004. May I know why the compiler gives such a value, and how I can solve the issue?


Example code:
double d = .4;
double dec = .1;

double result = d - dec;


Because of this, when I subtract .1 from .1 I get .00000000027 instead of zero.
Updated 26-Apr-11 22:35pm
Comments
Olivier Levrey 27-Apr-11 4:35am    
Edited pre tags.

It happens because numbers aren't stored in base ten; they are stored in binary. This means that a number that is exact in base 10 is not necessarily exact in base 2. For example, 0.1 has an infinitely repeating expansion in binary, so it has to be rounded to fit.

There is a guide to the details here which explains it better than I could: see about halfway down, under "Converting Decimal Fractions to Binary Reals".
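A minimal sketch of how to see the stored values, assuming IEEE 754 doubles (which mainstream compilers use). The helper name `show17` is made up for this illustration; it prints 17 significant digits, which is enough to tell any two distinct doubles apart.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a double with 17 significant digits so the
// rounding that happened when the decimal literal was stored becomes visible.
std::string show17(double x) {
    std::ostringstream out;
    out << std::setprecision(17) << x;
    return out.str();
}

// show17(0.1)       -> "0.10000000000000001"
// show17(0.4 - 0.1) -> "0.30000000000000004"
// show17(0.3)       -> "0.29999999999999999"
```

The difference `0.4 - 0.1` and the literal `0.3` are two different doubles, which is why comparing them with `==` fails even though both display as 0.3 at the default precision.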
 
 
Comments
Sergey Alexandrovich Kryukov 27-Apr-11 21:16pm    
Floating confusion detected. You give the right idea, I vote 5.
--SA
Ajay Vijayvargiya 30-Apr-11 2:48am    
It has nothing to do with the base of the number. Integer numbers are stored as actual values; real numbers are stored as a formula (mantissa, exponent). See IEEE for the explanation. Therefore, when you retrieve the float/double number, it is CALCULATED, and therefore may be a bit imprecise.
OriginalGriff 30-Apr-11 2:56am    
It has everything to do with the number base: all values are stored in base 2! When you create a floating point number the base ten version is converted to a base 2 floating point format (see IEEE for details) and converted back to base ten when you convert it for printing! See the link or read IEEE yourself.
Ajay Vijayvargiya 30-Apr-11 3:09am    
Same thing I said, buddy: all numbers are stored in binary. So you cannot say int is stored as base 2 and double is stored as base 10. Both are stored as a sequence of bits (and therefore we have datatype sizes). Maybe my statement was somewhat elusive...
If calculation accuracy is important, I suggest you write or find a class that handles rational numbers, using an integer numerator and denominator. That way you can rid yourself of, or at least postpone, the rounding error until presentation.
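As a rough sketch of that idea, assuming C++17 (for `std::gcd`) and values small enough that the products below do not overflow `long long`. The `Rational` type and `to_double` are hypothetical names written for this illustration, not an existing library class.

```cpp
#include <cassert>
#include <numeric>  // std::gcd (C++17)

// Hypothetical minimal rational type: exact numerator/denominator pair,
// always stored in lowest terms with a positive denominator.
struct Rational {
    long long num;
    long long den;

    Rational(long long n, long long d) : num(n), den(d) {
        assert(d != 0 && "denominator must not be zero");
        if (den < 0) { num = -num; den = -den; }
        const long long g = std::gcd(num, den);  // non-zero because den != 0
        num /= g;
        den /= g;
    }
};

// Exact subtraction: a/b - c/d = (a*d - c*b) / (b*d). No overflow checks here.
inline Rational operator-(const Rational& a, const Rational& b) {
    return Rational(a.num * b.den - b.num * a.den, a.den * b.den);
}

inline bool operator==(const Rational& a, const Rational& b) {
    return a.num == b.num && a.den == b.den;
}

// Convert only at presentation time; this is where rounding finally happens.
inline double to_double(const Rational& r) {
    return static_cast<double>(r.num) / static_cast<double>(r.den);
}
```

With this, `Rational(4, 10) - Rational(1, 10)` compares equal to `Rational(3, 10)`, and `Rational(1, 10) - Rational(1, 10)` has a numerator of exactly 0.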
 
 
Comments
Stefan_Lang 28-Apr-11 7:09am    
My 5, since this is the only correct answer I see that leads to an actual solution.

I know that some commercial programs such as Mathematica already do this, but I'm not aware whether these can be used as a library for your own application, or what such a library would cost. Maybe there are free solutions elsewhere on the internet.
Niklas L 28-Apr-11 8:01am    
Thanks! I found a boost implementation:
http://www.boost.org/doc/libs/1_38_0/boost/rational.hpp
Olivier Levrey 28-Apr-11 8:11am    
Yes, nice solution. Have a 5.
Of course, this works only for rational numbers (forget about pi, sqrt(2), log, and so on), but as far as we can see from the OP's code, this should be enough.
Exact (100% precise) computation with float or double is not possible in general, and never will be. You have to live with that.

Read this to understand how floating point operations work:
http://en.wikipedia.org/wiki/Floating_point

If you don't want any rounding errors, then I suggest you work with int, long, or int64 types.
 
 
Comments
Sergey Alexandrovich Kryukov 27-Apr-11 21:15pm    
Some "alternatively educated" person walked through this page and voted in a matching manner. I'm trying to fix it. You get 5.
--SA
Niklas L 28-Apr-11 2:45am    
"alternatively educated" :D
Olivier Levrey 28-Apr-11 4:57am    
Thank you SA. I liked the "alternatively educated" expression. I will re-use it :)
Stefan_Lang 28-Apr-11 7:17am    
While I like Niklas' suggestion better for its potential to completely eliminate inaccuracies, your answer is probably the most practical one, especially the suggestion to work with int types wherever you can.

A possible extension to this suggestion would be to use fixed point arithmetic, i.e. using int types to store values, but implicitly using a decimal exponent. E.g. a stored value of 32145 corresponds to an actual value of 321.45. This can be useful e.g. in financial contexts.
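A small sketch of that fixed point idea, assuming two implied decimal places (values stored as a count of hundredths). The function name `format_hundredths` is hypothetical and only for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: values are stored as an integer count of hundredths,
// so 32145 means 321.45. Arithmetic on the stored integers is exact; the
// decimal point only appears when the value is formatted for display.
std::string format_hundredths(std::int64_t v) {
    const bool negative = v < 0;
    // Work on the magnitude in unsigned arithmetic so INT64_MIN is safe too.
    const std::uint64_t magnitude =
        negative ? 0 - static_cast<std::uint64_t>(v) : static_cast<std::uint64_t>(v);
    const std::uint64_t whole = magnitude / 100;
    const std::uint64_t frac = magnitude % 100;

    std::string out = negative ? "-" : "";
    out += std::to_string(whole);
    out += '.';
    if (frac < 10) out += '0';  // keep two digits after the point
    out += std::to_string(frac);
    return out;
}
```

Storing .4 as 40 and .1 as 10, the subtraction `40 - 10` is exactly 30, which formats as "0.30", and `10 - 10` is exactly 0.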
Olivier Levrey 28-Apr-11 8:13am    
Yes you are right. Thank you for this useful comment.
Hi,
the issue is that double precision numbers are stored in 64 bits, so values that have no finite binary representation are cropped (or rounded).
.4 = 4/10 = 2/5

2/5 is a rational number, but it can never be stored exactly in binary, so any math performed on such values (values the PC cannot express exactly) will generate small errors. Sometimes these errors have a huge impact on the result, if the application relies heavily on math with small inexact doubles. You can read more about double storage here: http://en.wikipedia.org/wiki/Double_precision_floating-point_format
Regards
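To illustrate why 2/5 does not fit, here is a small sketch that produces the binary digits of a fraction by repeated doubling. The helper name `binary_fraction_digits` is made up for this example, and it assumes 0 <= num < den with den small enough that doubling the remainder cannot overflow.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the first `count` binary digits after the point of
// num/den, computed with exact integer arithmetic (long division in base 2).
std::string binary_fraction_digits(unsigned num, unsigned den, int count) {
    assert(den != 0 && num < den);
    std::string digits;
    unsigned rem = num;
    for (int i = 0; i < count; ++i) {
        rem *= 2;               // shift one binary place to the left
        if (rem >= den) {
            digits += '1';
            rem -= den;
        } else {
            digits += '0';
        }
    }
    return digits;
}

// binary_fraction_digits(2, 5, 12) -> "011001100110"  (the 0110 pattern never ends)
// binary_fraction_digits(1, 4, 4)  -> "0100"          (terminates, so 0.25 is exact)
```

Because the remainder for 2/5 cycles back to its starting value, the digit pattern repeats forever, and a 64-bit double has to cut it off after a fixed number of bits.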
 
Hi, I found one solution for the above problem.

Sample code is below.

C#
static void Main(string[] args)
{
    double d = -.000000000000004;
    double k = .000000000000001;
    double re = d - k;

    // Round-trip the result through its string form.
    string st = re.ToString();
    re = System.Double.Parse(st);
}



Convert the double to a string and then convert the string back to a double; the junk digits will then get truncated.
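Since the question was about C++, a rough equivalent might look like the sketch below. It assumes the effect being relied on is rounding to 15 significant digits, which is what the default `double.ToString()` did in the .NET Framework of that time. The name `round_trip_15` is hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: write the value with 15 significant digits, then parse
// the text back. This rounds the double to the nearest value that has a short
// decimal form; it does not make the arithmetic itself exact.
double round_trip_15(double x) {
    std::ostringstream out;
    out << std::setprecision(15) << x;
    return std::stod(out.str());
}
```

With this, `round_trip_15(0.4 - 0.1)` compares equal to the literal `0.3`. Values such as 1.0 / 3.0 still come back as approximations, and any digits beyond the 15th are discarded whether or not they were meaningful.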
 
 
Comments
Stefan_Lang 28-Apr-11 7:03am    
This will only work for rational numbers that have a finite exact representation in decimal, and even then you have to be careful not to accidentally truncate relevant digits. It will not work for values such as 1/3.
C#
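// Formats the value with a fixed number of decimal places (more places for
// smaller values) and parses the text back into the same variable.
// Note: zero and negative values fall through to the last branch ("F30").
void DoubleCurrection(ref double dValue)
{
    string stVal;
    if (dValue > 1E-11)
        stVal = dValue.ToString("F16");
    else if (dValue > 1E-15)
        stVal = dValue.ToString("F20");
    else if (dValue > 1E-20)
        stVal = dValue.ToString("F25");
    else
        stVal = dValue.ToString("F30");
    dValue = System.Double.Parse(stVal);
}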
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


