“Proxying” a lot of HTTP requests with Node.js + E

Posted 2020-07-17 06:16

Question:

I'm writing a proxy in Node.js + Express 2. The proxy should:

  1. decrypt the POST payload and issue an HTTP request to the server based on the result;
  2. encrypt the reply from the server and send it back to the client.

The encryption-related part works fine. The problem I'm facing is timeouts. The proxy should process each request in less than 15 seconds, and most of them actually finish in under 500 ms.

The problem appears when I increase the number of parallel requests. Most requests complete fine, but some fail after 15 seconds plus a couple of milliseconds. ab -n5000 -c300 works fine, but with a concurrency of 500 some requests fail with a timeout.

I can only speculate, but it seems that the problem is the order of callback execution. Is it possible that the requests that came first hang until ETIMEDOUT because Node focuses on the latest ones, which are still being processed in under 500 ms?

P.S.: There is no problem with the remote server. I'm using request to interact with it.

Update

Here is how things work, with some code:

function queryRemote(req, res) {
  var options = {};  // built based on req object (URI, body, authorization, etc.)
  request(options, function(err, httpResponse, body) {
    return err ? send500(req, res)
               : res.end(encrypt(body));
  });
}

app.use(myBodyParser);  // reads hex string in payload
                        // and calls next() on 'end' event

app.post('/', [checkHeaders,   // check Content-Type and Authorization headers
               authUser,       // query DB and call next()
               parseRequest],  // decrypt payload, parse JSON, call next()
         function(req, res) {
  req.socket.setTimeout(TIMEOUT);
  queryRemote(req, res);
});

My problem is the following: when ab issues, say, 20 POSTs to /, the Express route handler gets called something like thousands of times. This doesn't always happen; sometimes exactly 20 requests are processed in a timely fashion.

Of course, ab is not the problem. I'm 100% sure that only 20 requests are sent by ab, yet the route handler gets called many more times than that.

I can't find the reason for this behaviour. Any advice?

Answer 1:

The timeouts were caused by using http.globalAgent, which by default can process up to 5 concurrent requests to one host:port (which isn't enough in my case).
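One way to lift that limit is sketched below. It assumes a cap of 500 upstream sockets, which is an arbitrary number chosen to match the ab concurrency from the question; the constant MAX_UPSTREAM_SOCKETS and the helper upstreamOptions are hypothetical names used only for illustration. The default of 5 applied to Node versions of that era; later releases default maxSockets to Infinity, so check the version in use before changing anything.

```javascript
const http = require('http');

// Hypothetical limit (an assumption): tune it to what the upstream server can handle.
const MAX_UPSTREAM_SOCKETS = 500;

// Option 1: raise the limit on the shared global agent.
http.globalAgent.maxSockets = MAX_UPSTREAM_SOCKETS;

// Option 2: give the proxy its own agent so its socket pool is
// independent of everything else that uses http.globalAgent.
const upstreamAgent = new http.Agent({ maxSockets: MAX_UPSTREAM_SOCKETS });

// Hypothetical helper: builds options for an upstream call that goes
// through the dedicated agent.
function upstreamOptions(host, port, path) {
  return { host: host, port: port, path: path, method: 'POST', agent: upstreamAgent };
}
```

The object returned by upstreamOptions can be passed to http.request. When using the request module instead, its documented pool option (for example pool: { maxSockets: 500 }) should serve the same purpose.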

Thousands of requests (instead of tens) were in fact sent by ab (confirmed with Wireshark under OS X; I cannot reproduce this under Ubuntu inside Parallels).



Answer 2:

You can have a look at the node-http-proxy module and how it handles connections. Make sure you don't buffer any data and that everything works by streaming.

You should also try to see where the time is spent for those long requests. Try instrumenting parts of your code with console.time and console.timeEnd to see which part takes the most time. If the time is mostly spent in JavaScript, you should try to profile it. Basically, you can use the V8 profiler by adding the --prof option to your node command. This produces a v8.log file, which can be processed with a V8 tool found in node-source-dir/deps/v8/tools. It only works if you have built the d8 shell via scons (scons d8). You can have a look at this article for further help in getting this working.
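The console.time / console.timeEnd instrumentation could look like the minimal sketch below. The wrapper timed and the label format are hypothetical, not part of any library. Each call gets its own label because, with many requests in flight at once, timers sharing one label would collide.

```javascript
let nextId = 0;

// Wrap a callback-style step so that every call logs its own duration.
// `name` identifies the step; a counter makes the label unique per call.
function timed(name, step) {
  return function (input, callback) {
    const label = name + '#' + (nextId++);
    console.time(label);
    step(input, function (err, result) {
      console.timeEnd(label);  // prints the label and the elapsed time
      callback(err, result);
    });
  };
}
```

For example, wrapping a step as timed('decrypt', decryptPayload), where decryptPayload stands for any function taking (input, callback), prints one line per call, so slow steps stand out in the log under load.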

You can also use node-webkit-agent, which uses the WebKit developer tools to show the profiler results. You can also have a look at my fork, which adds a bit of sugar.

If that doesn't work, you can try profiling with DTrace (this only works on illumos-based systems like SmartOS).