Dask Distributed - how do I run one task per worker, so that the task uses all cores available to the worker?

I am very new to the distributed Python library. I have 4 workers, and I have successfully launched some parallel runs using 14 cores (out of the 16 available) on each worker, resulting in 4 * 14 = 56 tasks running in parallel.

But how should I proceed if I want only one task at a time on each worker? That way, I would expect each task to use the 14 cores in parallel on its worker.
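For reference, a minimal sketch of my current setup; my_simulation and params are placeholders for my real function and inputs:

from dask.distributed import Client

client = Client('scheduler-address:8786')
# my_simulation and params stand in for my actual function and inputs;
# this currently runs 56 tasks at once (4 workers x 14 threads)
futures = client.map(my_simulation, params)
results = client.gather(futures)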

1 answer


Dask workers maintain a single pool of threads that they use to run tasks. Each task always consumes one thread from this pool; there is no way to tell a task to consume several threads from it.
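For context (not part of the question), the size of this pool is fixed when the worker starts, for example with the --nthreads flag:

dask-worker scheduler-address:8786 --nthreads 14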

However, there are other ways to control and limit concurrency within Dask workers. In your case, you can define worker resources. This lets you prevent many large tasks from running at the same time on the same worker.

In the following example, we define that each worker has one Foo resource and that each task requires one Foo to run. This stops any two such tasks from running at the same time on the same worker.



# Start each worker advertising one unit of an abstract "Foo" resource
dask-worker scheduler-address:8786 --resources Foo=1
dask-worker scheduler-address:8786 --resources Foo=1

...

from dask.distributed import Client

client = Client('scheduler-address:8786')
# Each task requires one Foo and each worker has only one,
# so at most one of these tasks runs per worker at any time
futures = client.map(my_expensive_function, ..., resources={'Foo': 1})
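Note that resources only keep tasks from running concurrently; they do not make a task multi-threaded by themselves. To actually use the 14 cores, the function must parallelize its own work. A minimal sketch, assuming the per-chunk work releases the GIL (as numpy does); process_chunk here is a hypothetical placeholder:

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Placeholder for real per-chunk work; numpy releases the GIL,
    # so threads can run it in parallel
    return np.asarray(chunk).sum()

def my_expensive_function(chunks):
    # Fan out over 14 threads inside the single Dask task
    with ThreadPoolExecutor(max_workers=14) as pool:
        return list(pool.map(process_chunk, chunks))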







