Python Multiprocessing Numpy Random

Is the scope of a numpy ndarray different inside a function called via multiprocessing? Here's an example.

Using the Python multiprocessing module, I call a function like this:

import multiprocessing as mp
import random
import numpy as np

jobs = []
for core in range(cores):
    # target could be f or g
    proc = mp.Process(target=f, args=(core,))
    jobs.append(proc)
for job in jobs:
    job.start()
for job in jobs:
    job.join()

def f(core):
    x = 0
    x += random.randint(0, 10)
    print(x)

def g(core):
    #Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:,core])
    shuffled = np.random.permutation(local)

      

Calling f(core), the variable x is local to each process, i.e. each process prints a different random integer as expected. The values never exceed 10, indicating that x = 0 at the start of each process. Is that right?

Calling g(core) and shuffling a copy of the array returns 4 identical "shuffled" arrays. This appears to indicate that the working copy is not local to the child process. Is that right? If so, short of using shared memory space, is it possible for the ndarray to be local to the child process when it needs to be populated from the shared-memory array?

EDIT:

Changing g(core) to add a random integer instead has the desired effect: the array shows different values in each process. So something must be happening in permutation that reorders the columns (local to each child process) using the same seed?

def g(core):
    #Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:,core])
    local += random.randint(0,10)

      

EDIT II: np.random.shuffle also exhibits the same behavior. The contents of the array are shuffled, but into the same order on each core.
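The behavior described above can be reproduced with a minimal sketch (assuming a fork-based start method, as on Linux): each forked child inherits the parent's global RNG state, so without reseeding they all produce the same permutation.

```python
import multiprocessing as mp
import numpy as np

def child(q):
    # Each forked child inherits the parent's global numpy RNG state,
    # so without reseeding every child draws the same permutation.
    q.put(np.random.permutation(5).tolist())

def demo():
    np.random.seed(0)                    # fix parent state before forking
    ctx = mp.get_context("fork")         # fork copies the RNG state
    q = ctx.Queue()
    procs = [ctx.Process(target=child, args=(q,)) for _ in range(2)]
    for p in procs:
        p.start()
    results = [q.get() for _ in procs]
    for p in procs:
        p.join()
    return results
```

Running demo() returns two identical permutations, matching the symptom in the question.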

2 answers


Calling g(core) and shuffling a copy of the array returns 4 identical "shuffled" arrays. This appears to indicate that the working copy is not local to the child process.



This most likely indicates that the random number generator is initialized identically in every child process, so each child produces the same sequence. You need to seed each child's generator differently (for example, by mixing the child's process id into the seed).
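One way to do that is sketched below: seed each worker's generator from its pid plus the task index (the exact mixing scheme here is a hypothetical choice; any value that is unique per process works).

```python
import multiprocessing as mp
import os
import numpy as np

def worker(core):
    # Seed this worker's own RandomState from its pid plus the core
    # index, so each task draws an independent random sequence.
    rng = np.random.RandomState(os.getpid() + core)
    return rng.permutation(10).tolist()

def run(n=2):
    # Each pooled task now shuffles independently instead of
    # repeating the inherited parent state.
    with mp.Pool(n) as pool:
        return pool.map(worker, range(n))
```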


For seeding a random array, this post was most helpful. The following function g(core) managed to create a different random array for each core.



import multiprocessing as mp
import numpy as np

def g(core):
    # Seed a per-process generator from the process's identity number.
    pid = mp.current_process()._identity[0]
    randst = np.random.mtrand.RandomState(pid)
    randarray = randst.randint(0, 100, size=(1, 100))
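On NumPy 1.17+, the same idea can be written with the newer Generator API; the sketch below adds a fallback to 0 because _identity is an empty tuple when the function runs in the main process rather than a worker.

```python
import multiprocessing as mp
import numpy as np

def g(core):
    # _identity is a tuple like (1,) inside a pool/child process and ()
    # in the main process; fall back to 0 so the function also runs
    # standalone for testing.
    ident = mp.current_process()._identity
    pid = ident[0] if ident else 0
    rng = np.random.default_rng(pid + core)
    return rng.integers(0, 100, size=(1, 100))
```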

      
