I am trying to process some files using a Python function and would like to parallelize the task on a PBS cluster using Dask. On the cluster I can only launch one job, but I have access to 10 nodes with 24 cores each.
So my Dask PBSCluster looks like:
import dask
from dask_jobqueue import PBSCluster
cluster = PBSCluster(cores=240,
cluster.scale(1) # one worker
from dask.distributed import Client
client = Client(cluster)
After that, the Dask cluster shows 1 worker with 240 cores (not sure if that makes sense). When I run
result = dask.compute(*foo, scheduler='distributed')
and access the allocated nodes, only one of them is actually running the computation. I am not sure whether I am using the right PBS configuration.
The values you give to the Dask Jobqueue constructors describe a single job running on a single node. So here you are asking for one node with 240 cores, which probably doesn't make sense given that your nodes have 24 cores each.
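For illustration only, here is a rough sketch of what per-job values could look like if your site did allow several jobs. The `memory` value is a placeholder (the question does not state node memory), `start_cluster` is a hypothetical helper name, and any queue or walltime options your site needs are left out.

```python
NODES = 10            # from the question
CORES_PER_NODE = 24   # from the question

# Keyword arguments describing ONE job on ONE node, not the whole allocation.
per_job = {
    "cores": CORES_PER_NODE,
    "memory": "64GB",  # placeholder: node memory is not given in the question
}

# Total cores you would end up with after scaling to NODES jobs.
total_cores = NODES * per_job["cores"]


def start_cluster():
    """Hypothetical helper: submit NODES separate PBS jobs, one per node."""
    # Imported here so the sizing above can be checked without dask-jobqueue.
    from dask_jobqueue import PBSCluster
    from dask.distributed import Client

    cluster = PBSCluster(**per_job)
    cluster.scale(jobs=NODES)  # each job is its own PBS submission
    return Client(cluster)
```

Note that `cluster.scale(jobs=10)` submits ten independent PBS jobs, which is exactly what a one-job limit rules out.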
If you can only launch one job, then dask-jobqueue's model probably won't work for you. I recommend looking at dask-mpi as an alternative.
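A minimal sketch of the dask-mpi route, assuming your single PBS job can start MPI ranks across all 10 nodes. `process_file` is a hypothetical stand-in for your own function, `worker_ranks` is a helper I made up to show the rank layout, and the file paths are taken from the command line purely for illustration.

```python
import sys


def process_file(path):
    """Hypothetical stand-in for the per-file function from the question."""
    return len(path)


def worker_ranks(n_ranks):
    """Number of MPI ranks left over for Dask workers.

    dask-mpi's initialize() uses rank 0 for the scheduler and rank 1 for
    the client script; the remaining ranks become workers.
    """
    if n_ranks < 3:
        raise ValueError("need at least 3 ranks: scheduler, client, one worker")
    return n_ranks - 2


def main(paths):
    # Imported here so the helpers above stay importable without MPI installed.
    from dask_mpi import initialize
    from dask.distributed import Client

    initialize()       # starts the scheduler and workers on the other ranks
    client = Client()  # connects to the scheduler started by initialize()

    futures = client.map(process_file, paths)
    print(client.gather(futures))


# Only runs when file paths are given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Inside the one PBS job you would launch this with something like `mpirun -np 240 python process_files.py <files>` (the script name and exact launcher flags depend on your setup). With 240 ranks that leaves 238 single-threaded workers spread over the nodes, instead of one worker on one node.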