Difference in Python multiprocessing pool performance on two different machines

I deployed the same code on two different machines, in identical Python virtual environments; the OS/kernel and the hard disk model are exactly the same. The only significant difference between the two machines is the processors: machine 1 has 2x Xeon E5-2690 (16 cores at 2.90 GHz) and machine 2 has 1x Xeon W3690 (6 cores at 3.47 GHz).

Now, when I run a version of the code that does not use the multiprocessing pool, machine 1 runs faster. However, when using a multiprocessing pool, machine 2 runs significantly faster (more than 6 times). In fact, machine 1 is not much faster than the single-process version, no matter how many workers I create.

This process simply reads HDF5 files and does some basic math on the data.
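For reference, the structure of that code is roughly the following (a minimal sketch, not the real script; the file list, the dataset name and the actual math are placeholders, and I'm assuming h5py for the reads):

    import multiprocessing

    import h5py          # assumption: the HDF5 files are read via h5py
    import numpy as np


    def process_file(path):
        # Read one HDF5 file and do some basic math on its data.
        # "data" is a placeholder dataset name.
        with h5py.File(path, 'r') as f:
            data = f['data'][:]
        return np.mean(data), np.std(data)   # stand-in for the real math


    if __name__ == '__main__':
        paths = ['file_{}.h5'.format(i) for i in range(100)]   # hypothetical file list
        pool = multiprocessing.Pool(16)
        results = pool.map(process_file, paths)
        pool.close()
        pool.join()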

I was prompted to run strace -c; the results show that a little more time is spent in futex calls on machine 1. However, since I only ran it once, there is no real statistical certainty there.

I'm sure the issue is related to the overhead created by multiprocessing, but this is a rather large gap. I also find it hard to believe that a 0.57 GHz clock difference would lead to such a big mismatch. Any ideas?

Thanks!

EDIT:

So, here's a test I ran without having to deal with IO:

Machine 1:

In [1]: import numpy as np

In [2]: import multiprocessing

In [3]: def gen_rand(x):
   ...:     return np.random.random(x)
   ...: 

In [4]: pool = multiprocessing.Pool(6)

In [5]: proc_arg = 100*[100000]

In [6]: %timeit -n30 pool.map(gen_rand, proc_arg)
30 loops, best of 3: 254 ms per loop


Machine 2:

In [1]: import numpy as np

In [2]: import multiprocessing

In [3]: def gen_rand(x):
   ...:     return np.random.random(x)
   ...: 

In [4]: pool = multiprocessing.Pool(6)

In [5]: proc_arg = 100*[100000]

In [6]: %timeit -n30 pool.map(gen_rand, proc_arg)
30 loops, best of 3: 133 ms per loop
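
To try to separate the pool dispatch/IPC overhead from the actual work, a follow-up test would be to time a do-nothing worker through the same pool (a sketch; noop is a hypothetical function, and the numbers above come from the IPython %timeit runs, not from this script):

    import multiprocessing
    import timeit

    import numpy as np


    def gen_rand(x):
        return np.random.random(x)


    def noop(x):
        # Does no real work, so this mostly measures pool dispatch and
        # argument pickling overhead (the returned int is tiny, so
        # result transfer is negligible here).
        return x


    if __name__ == '__main__':
        pool = multiprocessing.Pool(6)
        proc_arg = 100 * [100000]

        work = timeit.timeit(lambda: pool.map(gen_rand, proc_arg), number=30) / 30
        idle = timeit.timeit(lambda: pool.map(noop, proc_arg), number=30) / 30
        print('full run: %.3f s per map, dispatch only: %.3f s per map' % (work, idle))

        pool.close()
        pool.join()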
