Why does Java's HashMap copy constructor affect floating point precision?

Question

Why does Java's HashMap copy constructor affect floating point precision?

I had some code calculating a linear combination on float maps and came across an interesting side effect of using a copy constructor.

If I calculate a linear combination of values in two maps and compare it to a linear combination calculated using values in duplicate of those maps, the calculations are actually slightly different (in the vicinity of 10 ^ -7) to what appears to be floating point.

Why is this happening?

Here's some sample code:

import java.util.*;

public class WTF {
    public static void main(String[] args) {
        Random rand = new Random();

        for (int c = 0; c < 1000; c++) {
            Map<String, Float> weights = new HashMap<String, Float>();
            Map<String, Float> values = new HashMap<String, Float>();

            for (int j = 0; j < 10; j++) {
                weights.put("sig" + j, Float.valueOf(rand.nextFloat()));
                values.put("sig" + j, Float.valueOf(rand.nextFloat()));
            }

            Map<String, Float> weightsCopy = new HashMap<String, Float>(weights);
            Map<String, Float> valuesCopy = new HashMap<String, Float>(values);

            float score1 = getScore(weights, values);
            float score2 = getScore(weightsCopy, valuesCopy);

            if (score1 != score2) {
                System.out.println(score1-score2);
            }
        }
    }

    public static float getScore(Map<String, Float> weights, Map<String, Float> values) {
        float score = 0.0f;
        for (String name : weights.keySet()) {
            Float weight = weights.get(name);
            Float value = values.get(name);
            score += weight.floatValue() * value.floatValue();
        }
        return score;
    }
}

UPDATE:

This same problem also applies to surgery putAll

. Using this method to "copy" HashMap

results in the same floating point precision problems.

+3

java floating-point copy-constructor

pmc255 08 Mar '12 at 21:00

source to share

4 answers

The iteration order changes from original maps to copies as it rebuilds the hash table (possibly with a different size).

The rounding difference comes from the fact that *

both +

on floats are not quite commutative / associative and you will get different rounding errors depending on whether you do a * (b * c)

or (a * c) * b

or (a * b) * c

. As the order of records and keys changes between originals and copies, you end up with tiny, rounded differences in the results.

If you use LinkedHashMap

instead HashMap

to ensure the iteration order is preserved, you should get the same results every time. (I confirmed this on my machine.)

+5

Louis Wasserman 08 Mar 12 at 21:23

source to share

If you look at the float bit, you can see that one byte exponent and one mantisse bit (8 from the left) are swapped, so one bit is an error. ( 2,384186e-07 34800000

)

            float ds = score1-score2;
            int bits = Float.floatToIntBits(ds);
            System.out.printf("%e %x%n", score1-score2, bits);

0

Joop eggen 08 Mar 12 at 22:16

source to share

The order in which floating point numbers are added can affect the result. Since the HashMap does not guarantee any ordering, so copying the HashMap might result in a different ordering, which means the sum of the values will be different.

public static void main(String... args) throws IOException {
    List<Float> floats = new ArrayList<>();
    Random rand = new Random();
    float sum0 = 0;
    for (int i = 0; i < 1000; i++) {
        float f = rand.nextFloat() - rand.nextFloat();
        floats.add(f);
        sum0 += f;
    }
    floats.add(-sum0);

    SortedSet<Float> sums = new TreeSet<>();
    for (int i = 0; i < 200000; i++) {
        Collections.shuffle(floats, rand);
        float sum = 0;
        for (Float f : floats)
            sum += f;
        if (sums.add(sum))
            System.out.println(sum);
    }
    System.out.println("Unique sums count " + sums.size()
            + " from " + sums.first() + " to " + sums.last());
}

prints

1.8239021E-5
2.0623207E-5
-2.1278858E-5
1.847744E-5
2.18153E-5
  ....
-2.4557114E-5
-3.415346E-5
1.9788742E-5
-2.270937E-5
Unique sums count 795 from -3.4868717E-5 to 3.1232834E-5

0

Peter lawrey 08 Mar 12 at 22:42

source to share

FauxFaux · Accepted Answer · 2012-03-08T21:24:34+0000

The order on the map changes, as a result of which operations are performed in a different order. An example of changing the output for a simple calculation (note the flipped d and e):

class WTF {
    public static void main(String[] args) {
        final float a = 0.42890447f * 0.37233013f;
        final float b = 0.2648958f * 0.05867535f;
        final float c = 0.8928169f * 0.7546882f;
        final float d = 0.0039135218f * 0.59395087f;
        final float e = 0.9114683f * 0.33522367f;

        System.out.println(a + b + c + d + e);
        System.out.println(a + b + c + e + d);
    }
}

Why does Java's HashMap copy constructor affect floating point precision?

More articles: