Difference in Python multiprocessing pool performance on two different machines
So I deployed the same code on 2 different machines, in the same Python virtual environment; the OS/kernel is exactly the same and the hard disk model is the same. The only significant difference between the two machines is the processors: machine 1 has 2x Xeon E5-2690 (16 cores at 2.90 GHz) and machine 2 has 1x Xeon W3690 (6 cores at 3.47 GHz).
Now, when I run a version of the code that does not use the multiprocessing pool, machine 1 runs faster. However, when using the multiprocessing pool, machine 2 runs significantly faster (more than 6 times). In fact, machine 1 does not run much faster than the single-threaded version, no matter how many worker processes I create.
This process simply reads HDF5 files and does some basic math on the data.
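Roughly, each worker does something like the sketch below (simplified; the file names, the dataset name "data", and the math are just placeholders for illustration, not the actual code):

import multiprocessing

import h5py
import numpy as np

def process_file(path):
    # Open one HDF5 file, read a dataset into memory, and do some
    # basic math on it ("data" is a placeholder dataset name).
    with h5py.File(path, 'r') as f:
        arr = f['data'][...]
    return arr.mean() + arr.std()

if __name__ == '__main__':
    # Placeholder list of input files.
    files = ['file_%03d.h5' % i for i in range(100)]
    pool = multiprocessing.Pool(6)
    results = pool.map(process_file, files)
    pool.close()
    pool.join()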
I was prompted to run strace -c; the results show that a little more time was spent in futex calls on machine 1. However, since I only ran it once, there is no real statistical certainty there.
I'm sure the issue is related to the overhead created by multiprocessing, but this is a rather large differential. I also find it hard to believe that a 0.57 GHz difference in clock speed would lead to such a big mismatch. Any ideas?
Thanks!
EDIT:
So, here's a test I ran without having to deal with IO:
Machine 1:
In [1]: import numpy as np
In [2]: import multiprocessing
In [3]: def gen_rand(x):
   ...:     return np.random.random(x)
   ...:
In [4]: pool = multiprocessing.Pool(6)
In [5]: proc_arg = 100*[100000]
In [6]: %timeit -n30 pool.map(gen_rand, proc_arg)
30 loops, best of 3: 254 ms per loop
Machine 2:
In [1]: import numpy as np
In [2]: import multiprocessing
In [3]: def gen_rand(x):
   ...:     return np.random.random(x)
   ...:
In [4]: pool = multiprocessing.Pool(6)
In [5]: proc_arg = 100*[100000]
In [6]: %timeit -n30 pool.map(gen_rand, proc_arg)
30 loops, best of 3: 133 ms per loop