Beware of Doubles, Floats, and generally floating points…
I’ve recently encountered a few people who were naive about how real (i.e., non-integer) numbers are represented in computer languages, such as double and float in Java (and their counterparts in other languages).
Since examples are worth much more than explanations, let's look at this:
double y = 0.082;
double z = 8.2;
System.out.println("" + y +" * 100 = " + z + "? " +
(100 * y == z) + " it equals " + (100 * y) );
and the output is:
0.082 * 100 = 8.2? false it equals 8.200000000000001
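One way to see where the stray digit comes from is to print the exact value that each double actually stores. The sketch below is only an illustration: the class name `ExactValueDemo` and the helper `exactValue` are made up for this post, and it relies on the `BigDecimal(double)` constructor, which converts the stored binary value to decimal without rounding.

```java
import java.math.BigDecimal;

public class ExactValueDemo {
    // Hypothetical helper: returns the exact decimal expansion of the
    // binary value stored in d (no rounding is applied).
    static String exactValue(double d) {
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        double y = 0.082;
        double z = 8.2;
        // Neither literal is stored exactly, so the product and z can differ.
        System.out.println("0.082   is stored as " + exactValue(y));
        System.out.println("8.2     is stored as " + exactValue(z));
        System.out.println("100 * y is stored as " + exactValue(100 * y));
    }
}
```

A value such as 0.5, which is a power of two, prints back exactly as 0.5, while 0.082 prints as a long string of digits that is only close to 0.082.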
The next example shows exactly why you should be wary of floating points. Look at the binary representation after each iteration.
public static void main(String[] args) {
    double x = 1.0 / 3.0;
    // Iterations 0 to 54, matching the output shown below
    for (int i = 0; i <= 54; i++) {
        System.out.println("Iteration: " + i + " x = " + x);
        System.out.println("Binary representation: " +
            Long.toBinaryString(Double.doubleToRawLongBits(x)));
        // Double x, then drop the integer part so it stays below 1
        x = x * 2;
        if (x >= 1) {
            x = x - 1;
        }
    }
}
> Iteration: 0 x = 0.3333333333333333
> Binary representation: 11111111010101010101010101010101010101010101010101010101010101
> Iteration: 1 x = 0.6666666666666666
> Binary representation: 11111111100101010101010101010101010101010101010101010101010101
> Iteration: 2 x = 0.33333333333333326
> Binary representation: 11111111010101010101010101010101010101010101010101010101010100
> Iteration: 3 x = 0.6666666666666665
> Binary representation: 11111111100101010101010101010101010101010101010101010101010100
> Iteration: 4 x = 0.33333333333333304
> Binary representation: 11111111010101010101010101010101010101010101010101010101010000
> Iteration: 5 x = 0.6666666666666661
> Binary representation: 11111111100101010101010101010101010101010101010101010101010000
> Iteration: 6 x = 0.33333333333333215
> Binary representation: 11111111010101010101010101010101010101010101010101010101000000
> Iteration: 7 x = 0.6666666666666643
> Binary representation: 11111111100101010101010101010101010101010101010101010101000000
> Iteration: 8 x = 0.3333333333333286
> Binary representation: 11111111010101010101010101010101010101010101010101010100000000
> Iteration: 9 x = 0.6666666666666572
> Binary representation: 11111111100101010101010101010101010101010101010101010100000000
> Iteration: 10 x = 0.3333333333333144
> Binary representation: 11111111010101010101010101010101010101010101010101010000000000
> Iteration: 11 x = 0.6666666666666288
> Binary representation: 11111111100101010101010101010101010101010101010101010000000000
> Iteration: 12 x = 0.33333333333325754
> Binary representation: 11111111010101010101010101010101010101010101010101000000000000
> Iteration: 13 x = 0.6666666666665151
> Binary representation: 11111111100101010101010101010101010101010101010101000000000000
> Iteration: 14 x = 0.33333333333303017
> Binary representation: 11111111010101010101010101010101010101010101010100000000000000
> Iteration: 15 x = 0.6666666666660603
> Binary representation: 11111111100101010101010101010101010101010101010100000000000000
> Iteration: 16 x = 0.3333333333321207
> Binary representation: 11111111010101010101010101010101010101010101010000000000000000
> Iteration: 17 x = 0.6666666666642413
> Binary representation: 11111111100101010101010101010101010101010101010000000000000000
> Iteration: 18 x = 0.3333333333284827
> Binary representation: 11111111010101010101010101010101010101010101000000000000000000
> Iteration: 19 x = 0.6666666666569654
> Binary representation: 11111111100101010101010101010101010101010101000000000000000000
> Iteration: 20 x = 0.3333333333139308
> Binary representation: 11111111010101010101010101010101010101010100000000000000000000
> Iteration: 21 x = 0.6666666666278616
> Binary representation: 11111111100101010101010101010101010101010100000000000000000000
> Iteration: 22 x = 0.3333333332557231
> Binary representation: 11111111010101010101010101010101010101010000000000000000000000
> Iteration: 23 x = 0.6666666665114462
> Binary representation: 11111111100101010101010101010101010101010000000000000000000000
> Iteration: 24 x = 0.3333333330228925
> Binary representation: 11111111010101010101010101010101010101000000000000000000000000
> Iteration: 25 x = 0.666666666045785
> Binary representation: 11111111100101010101010101010101010101000000000000000000000000
> Iteration: 26 x = 0.3333333320915699
> Binary representation: 11111111010101010101010101010101010100000000000000000000000000
> Iteration: 27 x = 0.6666666641831398
> Binary representation: 11111111100101010101010101010101010100000000000000000000000000
> Iteration: 28 x = 0.3333333283662796
> Binary representation: 11111111010101010101010101010101010000000000000000000000000000
> Iteration: 29 x = 0.6666666567325592
> Binary representation: 11111111100101010101010101010101010000000000000000000000000000
> Iteration: 30 x = 0.3333333134651184
> Binary representation: 11111111010101010101010101010101000000000000000000000000000000
> Iteration: 31 x = 0.6666666269302368
> Binary representation: 11111111100101010101010101010101000000000000000000000000000000
> Iteration: 32 x = 0.33333325386047363
> Binary representation: 11111111010101010101010101010100000000000000000000000000000000
> Iteration: 33 x = 0.6666665077209473
> Binary representation: 11111111100101010101010101010100000000000000000000000000000000
> Iteration: 34 x = 0.33333301544189453
> Binary representation: 11111111010101010101010101010000000000000000000000000000000000
> Iteration: 35 x = 0.6666660308837891
> Binary representation: 11111111100101010101010101010000000000000000000000000000000000
> Iteration: 36 x = 0.3333320617675781
> Binary representation: 11111111010101010101010101000000000000000000000000000000000000
> Iteration: 37 x = 0.6666641235351562
> Binary representation: 11111111100101010101010101000000000000000000000000000000000000
> Iteration: 38 x = 0.3333282470703125
> Binary representation: 11111111010101010101010100000000000000000000000000000000000000
> Iteration: 39 x = 0.666656494140625
> Binary representation: 11111111100101010101010100000000000000000000000000000000000000
> Iteration: 40 x = 0.33331298828125
> Binary representation: 11111111010101010101010000000000000000000000000000000000000000
> Iteration: 41 x = 0.6666259765625
> Binary representation: 11111111100101010101010000000000000000000000000000000000000000
> Iteration: 42 x = 0.333251953125
> Binary representation: 11111111010101010101000000000000000000000000000000000000000000
> Iteration: 43 x = 0.66650390625
> Binary representation: 11111111100101010101000000000000000000000000000000000000000000
> Iteration: 44 x = 0.3330078125
> Binary representation: 11111111010101010100000000000000000000000000000000000000000000
> Iteration: 45 x = 0.666015625
> Binary representation: 11111111100101010100000000000000000000000000000000000000000000
> Iteration: 46 x = 0.33203125
> Binary representation: 11111111010101010000000000000000000000000000000000000000000000
> Iteration: 47 x = 0.6640625
> Binary representation: 11111111100101010000000000000000000000000000000000000000000000
> Iteration: 48 x = 0.328125
> Binary representation: 11111111010101000000000000000000000000000000000000000000000000
> Iteration: 49 x = 0.65625
> Binary representation: 11111111100101000000000000000000000000000000000000000000000000
> Iteration: 50 x = 0.3125
> Binary representation: 11111111010100000000000000000000000000000000000000000000000000
> Iteration: 51 x = 0.625
> Binary representation: 11111111100100000000000000000000000000000000000000000000000000
> Iteration: 52 x = 0.25
> Binary representation: 11111111010000000000000000000000000000000000000000000000000000
> Iteration: 53 x = 0.5
> Binary representation: 11111111100000000000000000000000000000000000000000000000000000
> Iteration: 54 x = 0.0
> Binary representation: 0
Why does it happen?
Because when you use 64 bits to store a floating point number, 1 bit holds the sign (+/-), 11 bits hold the exponent, and 52 bits hold the mantissa (the significant digits of the value itself). And after adding 52 trailing zeroes …
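To make the layout concrete, here is a small sketch that splits a double into its three fields under the IEEE 754 layout that Java uses (1 sign bit, 11 exponent bits, 52 mantissa bits). The class name `DoubleFields` and its helper methods are invented for illustration; only `Double.doubleToRawLongBits` comes from the standard library.

```java
public class DoubleFields {
    // Top bit: 0 for positive, 1 for negative.
    static long sign(double d) {
        return Double.doubleToRawLongBits(d) >>> 63;
    }

    // Next 11 bits: the exponent, stored with a bias of 1023.
    static long exponent(double d) {
        return (Double.doubleToRawLongBits(d) >>> 52) & 0x7FFL;
    }

    // Low 52 bits: the mantissa (the leading 1 is implicit and not stored).
    static long mantissa(double d) {
        return Double.doubleToRawLongBits(d) & 0xFFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        double x = 1.0 / 3.0;
        System.out.println("sign     = " + sign(x));
        System.out.println("exponent = " + exponent(x) + " (unbiased: " + (exponent(x) - 1023) + ")");
        System.out.println("mantissa = " + Long.toBinaryString(mantissa(x)));
    }
}
```

Note that `Long.toBinaryString` drops leading zeroes, which is why the binary strings in the output above are 62 characters long rather than 64.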
Each subtraction removes the msb (most significant bit), and renormalizing the result shifts trailing 0s in at the lsb (least significant bit) end - and that’s - in a nutshell - what kills our precision. Naturally, there are ways to get around it, but you should be aware that you need to use these methods.
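Two common workarounds, sketched here under some assumptions: comparing doubles with a tolerance instead of `==`, and doing the arithmetic in `BigDecimal` built from strings when exact decimal results matter. The class `FloatingPointWorkarounds` and both helpers are hypothetical names for this post, and the absolute tolerance of 1e-9 is an arbitrary choice; a suitable value depends on the magnitude of the numbers involved.

```java
import java.math.BigDecimal;

public class FloatingPointWorkarounds {
    // Hypothetical helper: treats a and b as equal when they differ
    // by no more than epsilon (an absolute tolerance chosen by the caller).
    static boolean nearlyEqual(double a, double b, double epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    // Hypothetical helper: multiplies two decimal strings exactly and
    // compares the product with the expected decimal string.
    static boolean exactProductEquals(String a, String b, String expected) {
        BigDecimal product = new BigDecimal(a).multiply(new BigDecimal(b));
        // compareTo ignores scale, so 8.200 and 8.2 count as equal
        return product.compareTo(new BigDecimal(expected)) == 0;
    }

    public static void main(String[] args) {
        double y = 0.082;
        double z = 8.2;
        System.out.println("==          : " + (100 * y == z));
        System.out.println("nearlyEqual : " + nearlyEqual(100 * y, z, 1e-9));
        System.out.println("BigDecimal  : " + exactProductEquals("0.082", "100", "8.2"));
    }
}
```

The `BigDecimal` values are constructed from strings on purpose: constructing them from double literals would carry the binary rounding error along.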
You can read more about it here:
http://kipirvine.com/asm/workbook/floating_tut.htm
http://support.microsoft.com/kb/42980
http://en.wikipedia.org/wiki/Single-precision_floating-point_format