Throttle and queue up API requests due to per seco

2019-01-21 11:12发布

问题:

I'm use mikeal/request to make API calls. One of the API's I use most frequently (the Shopify API). Recently put out a new call limit, I'm seeing errors like:

Exceeded 6.0 calls per second for api client. Slow your requests or contact support for higher limits.

I've already gotten an upgrade, but regardless of how much bandwidth I get I have to account for this. A large majority of the requests to the Shopify API are within async.map() functions, which loop asynchronous requests, and gather the bodies.

I'm looking for any help, perhaps a library that already exists, that would wrap around the request module and actually block, sleep, throttle, allocate, manage, the many simultaneous requests that are firing off asynchronously and limit them to say 6 requests at a time. I have no problem with working on such a project if it doesn't exist. I just don't know how to handle this kind of situation, and I'm hoping for some kind of standard.

I made a ticket with mikeal/request.

回答1:

I've run into the same issue with various APIs. AWS is famous for throttling as well.

A couple of approaches can be used. You mentioned async.map() function. Have you tried async.queue()? The queue method should allow you to set a solid limit (like 6) and anything over that amount will be placed in the queue.

Another helpful tool is oibackoff. That library will allow you to backoff your request if you get an error back from the server and try again.

It can be useful to wrap the two libraries to make sure both your bases are covered: async.queue to ensure you don't go over the limit, and oibackoff to ensure you get another shot at getting your request in if the server tells you there was an error.



回答2:

For an alternative solution, I used the node-rate-limiter to wrap the request function like this:

var request = require('request');
var RateLimiter = require('limiter').RateLimiter;

var limiter = new RateLimiter(1, 100); // at most 1 request every 100 ms
var throttledRequest = function() {
    var requestArgs = arguments;
    limiter.removeTokens(1, function() {
        request.apply(this, requestArgs);
    });
};


回答3:

The npm package simple-rate-limiter seems to be a very good solution to this problem.

Moreover, it is easier to use than node-rate-limiter and async.queue.

Here's a snippet that shows how to limit all requests to ten per second.

var limit = require("simple-rate-limiter");
var request = limit(require("request")).to(10).per(1000);


回答4:

In async module, this requested feature is closed as "wont fix"

  • Reason given in 2016 is "managing that kind of construct properly is a hard problem." See right side of here: https://github.com/caolan/async/issues/1314
  • Reason given in 2013 is "wouldn't scale to multiple processes" See: https://github.com/caolan/async/issues/37#issuecomment-14336237

There is a solution using leakybucket or token bucket model, it is implemented "limiter" npm module as RateLimiter.

RateLimiter, see example here: https://github.com/caolan/async/issues/1314#issuecomment-263715550

Another way is using PromiseThrottle, I used this, working example is below:

var PromiseThrottle = require('promise-throttle');
let RATE_PER_SECOND = 5; // 5 = 5 per second, 0.5 = 1 per every 2 seconds

var pto = new PromiseThrottle({
    requestsPerSecond: RATE_PER_SECOND, // up to 1 request per second
    promiseImplementation: Promise  // the Promise library you are using
});

let timeStart = Date.now();
var myPromiseFunction = function (arg) {
    return new Promise(function (resolve, reject) {
        console.log("myPromiseFunction: " + arg + ", " + (Date.now() - timeStart) / 1000);
        let response = arg;
        return resolve(response);
    });
};

let NUMBER_OF_REQUESTS = 15;
let promiseArray = [];
for (let i = 1; i <= NUMBER_OF_REQUESTS; i++) {
    promiseArray.push(
            pto
            .add(myPromiseFunction.bind(this, i)) // passing am argument using bind()
            );
}

Promise
        .all(promiseArray)
        .then(function (allResponsesArray) { // [1 .. 100]
            console.log("All results: " + allResponsesArray);
        });

Output:

myPromiseFunction: 1, 0.031
myPromiseFunction: 2, 0.201
myPromiseFunction: 3, 0.401
myPromiseFunction: 4, 0.602
myPromiseFunction: 5, 0.803
myPromiseFunction: 6, 1.003
myPromiseFunction: 7, 1.204
myPromiseFunction: 8, 1.404
myPromiseFunction: 9, 1.605
myPromiseFunction: 10, 1.806
myPromiseFunction: 11, 2.007
myPromiseFunction: 12, 2.208
myPromiseFunction: 13, 2.409
myPromiseFunction: 14, 2.61
myPromiseFunction: 15, 2.811
All results: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15

We can clearly see the rate from output, i.e. 5 calls for every second.



回答5:

Here's my solution use a library request-promise or axios and wrap the call in this promise.

var Promise = require("bluebird")

// http://stackoverflow.com/questions/28459812/way-to-provide-this-to-the-global-scope#28459875
// http://stackoverflow.com/questions/27561158/timed-promise-queue-throttle

module.exports = promiseDebounce

function promiseDebounce(fn, delay, count) {
  var working = 0, queue = [];
  function work() {
    if ((queue.length === 0) || (working === count)) return;
    working++;
    Promise.delay(delay).tap(function () { working--; }).then(work);
    var next = queue.shift();
    next[2](fn.apply(next[0], next[1]));
  }
  return function debounced() {
    var args = arguments;
    return new Promise(function(resolve){
      queue.push([this, args, resolve]);
      if (working < count) work();
    }.bind(this));
  }


回答6:

The other solutions were not up to my tastes. Researching further, I found promise-ratelimit which gives you an api that you can simply await:

var rate = 2000 // in milliseconds
var throttle = require('promise-ratelimit')(rate)

async function queryExampleApi () {
  await throttle()
  var response = await get('https://api.example.com/stuff')
  return response.body.things
}

The above example will ensure you only make queries to api.example.com every 2000ms at most. In other words, the very first request will not wait 2000ms.