Can we limit the throughput of a Luigi Task?

Posted 2019-07-01 17:33

Question:

We have a Luigi Task that requests a piece of information from a third-party service. We are limited in the number of requests per minute we can make to that API.

Is there a way to specify, on a per-Task basis, how many tasks of this kind the scheduler may run per unit of time?

Answer 1:

We implemented our own rate limiting in the task. Our API limit was low enough that we could saturate it with a single thread. When we received a rate-limit response, we just backed off and retried.
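
A minimal sketch of that back-off-and-retry idea, not our actual code. The `RateLimited` exception and the `call_with_backoff` helper are hypothetical names invented for illustration; a real client would signal a rate-limit hit in its own way (for example an HTTP 429 response), and the delays would need tuning to the API's limit.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised when the API reports a rate-limit hit."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call request() and retry with exponential backoff on RateLimited."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the task fail
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

Inside the Task's run() method, the API call would be passed in as a zero-argument callable, so the Task only completes once the call has succeeded or the attempts are exhausted.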

One thing you can do is declare the API call as a resource. In the config you set how many units of the resource are available, and on the task you set, as a property, how many units it consumes. The scheduler will then run at most n of that task at a time. Note that this caps how many run concurrently, not how many run per minute.

In the config (e.g. luigi.cfg):

[resources]
api=1

In the code for the Task, inside the class body:

resources = {"api": 1}