Julia distribute function: specifying the distributed dimension

I want to distribute an MxN integer array across p workers. Is there a way to specify which dimension is distributed? Specifically, I want to keep the number of rows M fixed and split the N columns across the workers. In my case M > N (I have a term-document matrix with a dictionary of size M and N documents).

By default Julia seems to distribute along the largest dimension, which doesn't work for my application (I want to distribute the documents, not the dictionary). Is there a way to control which dimension is distributed?

1 answer


The SharedArray constructor has an optional parameter pids that maps items to processes (see the documentation).

So, the MxN matrix can be initialized with the following code:



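# On Julia 0.7 and later, nprocs() and SharedArray live in these standard libraries
using Distributed, SharedArrays

# a helper function which might be useful in other contexts:
# assigns n items to the b entries of v as evenly as possible
function balancedfill(v, n, b)
    d, r = divrem(n, b)
    return v[[repeat(1:r, inner=d+1); repeat(r+1:b, inner=d)]]
end

# M, N = size(mat)
# one process id per column, repeated M times (arrays are stored column-major)
pidvec = repeat(balancedfill(1:nprocs(), N, nprocs()), inner=M)

sharedmat = SharedArray{Float64}((M, N); pids=pidvec)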

      

This creates a shared Float64 array with columns balanced across processes. Float64 can be replaced with the required element type. With a small change (switching inner to outer and N to M in pidvec), an array distributed across rows can be created.
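
To make the index arithmetic concrete, here is a minimal sketch of how the pidvec construction above lays out in column-major order, for both the column and the row variant. It uses only Base Julia and does not start workers or create a SharedArray. The sizes M = 3 and N = 5, the ids 1:2, and the names colowner, rowowner, bycols and byrows are hypothetical and chosen only for illustration.

```julia
# Helper from the answer: spread n items over the b entries of v as evenly as possible.
function balancedfill(v, n, b)
    d, r = divrem(n, b)
    return v[[repeat(1:r, inner=d+1); repeat(r+1:b, inner=d)]]
end

M, N = 3, 5        # hypothetical sizes: M terms (rows), N documents (columns)
ids = 1:2          # hypothetical stand-in for 1:nprocs()

# Column split: one id per column, each repeated M times (inner),
# so every column of the reshaped grid holds a single id.
colowner = balancedfill(ids, N, length(ids))
bycols = reshape(repeat(colowner, inner=M), M, N)

# Row split: one id per row, the whole vector tiled N times (outer),
# so every row of the reshaped grid holds a single id.
rowowner = balancedfill(ids, M, length(ids))
byrows = reshape(repeat(rowowner, outer=N), M, N)
```

Inspecting bycols and byrows in the REPL shows which id each element is paired with. In bycols the id changes only from column to column, and in byrows only from row to row, which is the difference between the inner and outer variants described above.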
