Java ArrayList to Map comparison

I'm trying to convert an ArrayList to a TreeMap, so I wrote a small benchmark to compare different methods:

1) toMap on a parallel stream

2) toMap on a sequential stream

3) forEach on a sequential stream

4) forEach on a parallel stream

5) a plain for loop

This works well if the ArrayList is small (say 10,000 elements), but when it is large, say a million, the "parallel forEach" method is still running even after 9 minutes. I wrapped it in a try/catch, but no exception was thrown.

I know creating new threads carries some overhead, but parallelStream uses a thread pool to keep that low, right?

public class Set {
private String foo;
private int bar;

public Set(String foo, int bar) {
    this.foo = foo;
    this.bar = bar;
}

public String getFoo() {
    return foo;
}

public void setFoo(String foo) {
    this.foo = foo;
}

public int getBar() {
    return bar;
}

public void setBar(int bar) {
    this.bar = bar;
}

}

Main

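import java.util.ArrayList;
import java.util.Random;
import java.util.TreeMap;
import java.util.UUID;
import java.util.stream.Collectors;

public class Test {
// Shared by every test below; TreeMap is not synchronized.
TreeMap<String, Integer> tr = new TreeMap<>();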

public static void main(String[] args) {
    Test t = new Test();
    t.g();
}

public void g(){
    ArrayList<Set> ar=new ArrayList<>();
    for (int i = 0; i < 1_000_000; i++) {
        ar.add(new Set(UUID.randomUUID().toString(), new Random().nextInt()));
    }
    long start;
    long end;
    System.out.println("Parallel toMap");
    start=System.nanoTime();
    tr.putAll(ar.parallelStream().collect(Collectors.toMap(Set::getFoo, Set::getBar)));
    end=System.nanoTime();
    System.out.println(end-start);

    tr=new TreeMap();
    System.out.println("non-Parallel toMap");
    start=System.nanoTime();
    tr.putAll(ar.stream().collect(Collectors.toMap(Set::getFoo, Set::getBar)));
    end=System.nanoTime();
    System.out.println(end-start);

    tr=new TreeMap();
    System.out.println("non-Parallel forEach");
    start=System.nanoTime();
    ar.stream().forEach(product -> {
            tr.put(product.getFoo(), product.getBar());
        });
    end=System.nanoTime();
    System.out.println(end-start);

    tr=new TreeMap();
    System.out.println("Parallel forEach");
    start=System.nanoTime();
//HANGS SOMEWHERE HERE
    ar.parallelStream().forEach(product -> {
        try {
            tr.put(product.getFoo(), product.getBar());
        } catch (Exception e) {
            System.out.println(e.getLocalizedMessage());
        }

        });
    end=System.nanoTime();
    System.out.println(end-start);

    tr=new TreeMap();
    System.out.println("non-Parallel loop");
    start=System.nanoTime();
    for(Set product:ar)
        tr.put(product.getFoo(), product.getBar());

    end=System.nanoTime();
    System.out.println(end-start);
    }
}

      

The output for size 10_000 (times in nanoseconds) looks like this:

Parallel toMap
130793206
non-Parallel toMap
21729202
non-Parallel forEach
7601349
Parallel forEach
3233395
non-Parallel loop
9744039

      

'for loop' is slower than either forEach variant, as expected

'parallel forEach' is faster than 'non-parallel forEach', as expected

'parallel toMap' is about 6x slower than 'non-parallel toMap'. Why? Is Intel Turbo Boost at play?

Back to the point: why does "forEach on a parallel stream" hang when the ArrayList is large?

I'm running this on an i7 2670QM, so the thread pool size should be 8.



1 answer


TreeMap is not thread-safe. So when you use it from multiple threads, all bets are off. HashMap is known to fall into an infinite loop when modified concurrently. Presumably TreeMap misbehaves in a similar way.
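
As a minimal sketch of two ways around the shared TreeMap: either let the collector build the TreeMap (each worker fills its own map and the results are merged), or write into a ConcurrentSkipListMap, which is sorted and safe for concurrent writers. The class SafeTreeMapDemo, the nested Item class (a stand-in for the question's Set class) and the method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.stream.Collectors;

public class SafeTreeMapDemo {

    // Hypothetical stand-in for the question's Set class.
    static final class Item {
        private final String foo;
        private final int bar;

        Item(String foo, int bar) {
            this.foo = foo;
            this.bar = bar;
        }

        String getFoo() { return foo; }

        int getBar() { return bar; }
    }

    // Option 1: no shared mutable map; the collector creates and merges TreeMaps.
    static TreeMap<String, Integer> viaCollector(List<Item> items) {
        return items.parallelStream().collect(Collectors.toMap(
                Item::getFoo,
                Item::getBar,
                (a, b) -> b,      // on a duplicate key, keep the later value
                TreeMap::new));
    }

    // Option 2: a sorted map that tolerates concurrent put() calls.
    static Map<String, Integer> viaSkipList(List<Item> items) {
        ConcurrentSkipListMap<String, Integer> map = new ConcurrentSkipListMap<>();
        items.parallelStream().forEach(i -> map.put(i.getFoo(), i.getBar()));
        return map;
    }

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            items.add(new Item(UUID.randomUUID().toString(), i));
        }
        System.out.println(viaCollector(items).size());
        System.out.println(viaSkipList(items).size());
    }
}
```

Whether either variant beats the plain loop depends on the data and the machine, so it is worth measuring rather than assuming.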



(When benchmarking: because of the way JVMs "warm up", you should start a new process for each test. Also run the test several times in a row within the same process.)
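
A rough sketch of the "run it several times" advice: execute the task a few times untimed so the JIT can compile it, then time several more runs and report the best. WarmupDemo and bestOf are hypothetical names, and the TreeMap fill is only a placeholder workload. For serious measurements a dedicated harness such as JMH is the usual choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Supplier;

public class WarmupDemo {

    // Runs the task `warmup` times untimed, then returns the fastest of
    // `runs` timed executions, in nanoseconds.
    static long bestOf(int warmup, int runs, Supplier<?> task) {
        for (int i = 0; i < warmup; i++) {
            task.get();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            Object result = task.get();
            long elapsed = System.nanoTime() - start;
            // Touch the result so the work cannot be discarded as unused.
            if (result == null) {
                throw new IllegalStateException("task returned null");
            }
            best = Math.min(best, elapsed);
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            keys.add("key" + i);
        }
        // Placeholder workload: fill a fresh TreeMap on every run.
        long ns = bestOf(5, 10, () -> {
            TreeMap<String, Integer> map = new TreeMap<>();
            for (String k : keys) {
                map.put(k, k.length());
            }
            return map;
        });
        System.out.println("best run: " + ns + " ns");
    }
}
```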
