Python Multiprocessing Numpy Random

Is the scope of a numpy ndarray different inside a function called via multiprocessing? Here's an example.

Using the Python multiprocessing module, I call a function like this:

import multiprocessing as mp
import random
import numpy as np

jobs = []
for core in range(cores):
    # target could be f or g
    proc = mp.Process(target=f, args=(core,))
    jobs.append(proc)
for job in jobs:
    job.start()
for job in jobs:
    job.join()

def f(core):
    x = 0
    x += random.randint(0, 10)
    print(x)

def g(core):
    #Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:,core])
    shuffled = np.random.permutation(local)

      

Calling f(core), the variable x is local to each process, i.e. each process prints a different random integer as expected. The values never exceed 10, indicating that x = 0 at the start of each process. Is that right?

Calling g(core) and shuffling a copy of the array returns 4 identical "shuffled" arrays. This appears to indicate that the working copy is not local to the child process. Is that right? If so, short of using shared memory space, is it possible for the ndarray to be local to the child process when it needs to be populated from the shared-memory array?

EDIT:

Changing g(core) to add a random integer instead has the desired effect: the array shows different values in each process. So something must be happening in permutation that reorders the columns (local to each child process) using the same seed?

def g(core):
    #Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:,core])
    local += random.randint(0,10)

      

EDIT II: np.random.shuffle also exhibits the same behavior. The contents of the array are shuffled, but into the same order on each core.
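The behavior described above can be reproduced with a minimal sketch (assuming a fork-based start method, as on Linux): each forked child inherits the parent's global RNG state, so without reseeding they all produce the same permutation.

```python
import multiprocessing as mp
import numpy as np

def child(q):
    # Each forked child inherits the parent's global numpy RNG state,
    # so without reseeding every child draws the same permutation.
    q.put(np.random.permutation(5).tolist())

def demo():
    np.random.seed(0)                    # fix parent state before forking
    ctx = mp.get_context("fork")         # fork copies the RNG state
    q = ctx.Queue()
    procs = [ctx.Process(target=child, args=(q,)) for _ in range(2)]
    for p in procs:
        p.start()
    results = [q.get() for _ in procs]
    for p in procs:
        p.join()
    return results
```

Running demo() returns two identical permutations, matching the symptom in the question.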

2 answers


Calling g(core) and shuffling a copy of the array returns 4 identical "shuffled" arrays. This appears to indicate that the working copy is not local to the child process.



This most likely indicates that the random number generator is initialized identically in every child process, so each child produces the same sequence. You need to seed each child's generator differently (for example, by mixing the child's process id into the seed).
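One way to do that is sketched below: seed each worker's generator from its pid plus the task index (the exact mixing scheme here is a hypothetical choice; any value that is unique per process works).

```python
import multiprocessing as mp
import os
import numpy as np

def worker(core):
    # Seed this worker's own RandomState from its pid plus the core
    # index, so each task draws an independent random sequence.
    rng = np.random.RandomState(os.getpid() + core)
    return rng.permutation(10).tolist()

def run(n=2):
    # Each pooled task now shuffles independently instead of
    # repeating the inherited parent state.
    with mp.Pool(n) as pool:
        return pool.map(worker, range(n))
```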


For seeding a random array, this post was most helpful. The following function g(core) managed to create a different random array for each core.



import multiprocessing as mp
import numpy as np

def g(core):
    # Seed a per-process generator from the process's identity number.
    pid = mp.current_process()._identity[0]
    randst = np.random.mtrand.RandomState(pid)
    randarray = randst.randint(0, 100, size=(1, 100))
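On NumPy 1.17+, the same idea can be written with the newer Generator API; the sketch below adds a fallback to 0 because _identity is an empty tuple when the function runs in the main process rather than a worker.

```python
import multiprocessing as mp
import numpy as np

def g(core):
    # _identity is a tuple like (1,) inside a pool/child process and ()
    # in the main process; fall back to 0 so the function also runs
    # standalone for testing.
    ident = mp.current_process()._identity
    pid = ident[0] if ident else 0
    rng = np.random.default_rng(pid + core)
    return rng.integers(0, 100, size=(1, 100))
```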

      
