I need to parallelise a certain task over a number of workers. For that purpose, all workers need access to a matrix that stores the data.
I thought the data matrix could be implemented as a SharedArray in order to minimise data movement.
To get started with SharedArrays, I tried the following very simple example, which gives me what I think is unexpected behaviour:
julia -p 2
# the data matrix
D = SharedArray(Float64, 2, 3)
# initialise the data matrix with dummy values
for ii=1:length(D)
D[ii] = rand()
end
# Define some kind of dummy computation involving the shared array
f = x -> x + sum(D)
# call function on worker
@time fetch(@spawnat 2 f(1.0))
The last command gives me the following error:
ERROR: On worker 2:
UndefVarError: D not defined
in anonymous at none:1
in anonymous at multi.jl:1358
in anonymous at multi.jl:904
in run_work_thunk at multi.jl:645
in run_work_thunk at multi.jl:654
in anonymous at task.jl:58
in remotecall_fetch at multi.jl:731
in call_on_owner at multi.jl:777
in fetch at multi.jl:795
I thought that the Shared Array D should be visible to all workers? I am clearly missing something basic. Thanks in advance.
This works, without declaring D on the worker, through a closure within a function.
So although this answers the OP's question, and the SharedArray is indeed visible to all workers, is this approach legitimate?
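For reference, a minimal sketch of that closure approach, assuming the same D and two-worker session as in the question (the wrapper name remote_sum is mine):
# wrapping the remote call in a function makes D a local variable;
# the anonymous function captures that local and is serialised to the
# worker together with the SharedArray reference
function remote_sum(D)
    g = x -> x + sum(D)
    fetch(@spawnat 2 g(1.0))
end
remote_sum(D)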
Although the underlying data is shared with all workers, the variable D itself is not; it is only defined on the main process. You will still need to pass the reference to D in explicitly, so something like
f = (x, SA) -> x + sum(SA)
@time fetch(@spawnat 2 f(1.0, D))
should work. You can change D on the main process and see that the worker is in fact using the same data:
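A minimal sketch of that check, using the same session and the two-argument f from above (the actual values will differ from run to run):
D[1] = 0.0                          # modify the shared data on the main process
sum(D)                              # the local sum reflects the change
fetch(@spawnat 2 f(1.0, D)) - 1.0   # worker 2 computes the same sum from the shared data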