Shared array usage in Julia

2019-04-06 20:30发布

I need to parallelise a certain task over a number of workers. To that purpose I need all workers to have access to a matrix that stores the data.

I thought that the data matrix could be implemented as a Shared Array in order to minimise data movement.

In order to get me started with Shared Arrays, I am trying the following very simple example which gives me, what I think is, unexpected behaviour:

julia -p 2

# the data matrix
D = SharedArray(Float64, 2, 3)

# initialise the data matrix with dummy values
for ii=1:length(D)
   D[ii] = rand()
end

# Define some kind of dummy computation involving the shared array 
f = x -> x + sum(D)

# call function on worker
@time fetch(@spawnat 2 f(1.0))

The last command gives me the following error:

 ERROR: On worker 2:
 UndefVarError: D not defined
 in anonymous at none:1
 in anonymous at multi.jl:1358
 in anonymous at multi.jl:904
 in run_work_thunk at multi.jl:645
 in run_work_thunk at multi.jl:654
 in anonymous at task.jl:58
 in remotecall_fetch at multi.jl:731
 in call_on_owner at multi.jl:777
 in fetch at multi.jl:795

I thought that the Shared Array D should be visible to all workers? I am clearly missing something basic. Thanks in advance.

2条回答
三岁会撩人
2楼-- · 2019-04-06 21:14

This works, without declaring D, through a closure within a function.

function dothis()
    D = SharedArray{Float64}(2, 3)

    # initialise the data matrix with dummy values
    for ii=1:length(D)
       D[ii] = ii #not rand() anymore
    end

    # Define some kind of dummy computation involving the shared array 
    f = x -> x + sum(D)

    # call function on worker
    @time fetch(@spawnat 2 f(1.0))
end

julia> dothis()
1.507047 seconds (206.04 k allocations: 11.071 MiB, 0.72% gc time)
22.0
julia> dothis()
0.012596 seconds (363 allocations: 19.527 KiB)
22.0

So though I have answered the OP's question, and the SharedArray is visible to all workers -- is this legitimate?

查看更多
来,给爷笑一个
3楼-- · 2019-04-06 21:36

Although the underlying data is shared to all workers, the declaration of D is not. You will still need to pass in the reference to D, so something like

f = (x,SA) -> x + sum(SA) @time fetch(@spawnat 2 f(1.0,D))

should work. You can change D on the main process and see that it is infact using the same data:

julia> # call function on worker
       @time fetch(@spawnat 2 f(1.0,D))
  0.325254 seconds (225.62 k allocations: 9.701 MB, 5.88% gc time)
4.405613684678047

julia> D[1] += 1
1.2005544517241717

julia> # call function on worker
       @time fetch(@spawnat 2 f(1.0,D))
  0.004548 seconds (637 allocations: 45.490 KB)
5.405613684678047
查看更多
登录 后发表回答