Is there a faster way to retrieve only a single element of a large published array with Dask, without retrieving the entire array?
In the example below, client.get_dataset('array1')[0] takes roughly the same time as client.get_dataset('array1').
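import distributed
client = distributed.Client()
data = [1]*10000000
payload = {'array1': data}
# Publish the list under the name 'array1' (the Client method is publish_dataset)
client.publish_dataset(**payload)
# Fetches the whole list from the scheduler, then indexes it locally
one_element = client.get_dataset('array1')[0]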
Note that anything you publish goes to the scheduler, not to the workers, so this is somewhat inefficient. Publish was intended to be used with Dask collections like dask.array.
Client 1
Client 2