Node.js globals within modules

Posted 2019-04-12 08:10

In Node.js, I see that variables declared at module level (global within the module) get mixed up across requests: changes made by one request affect another. For example:

a.js

var a;
function printName(req, res) {
  //get param `name` from url;
  a = name;
  res.end('Hi '+a);
}
module.exports.printName = printName;

index.js

// Assume the createServer setup is done and the following function is passed to createServer as its callback
function requestListener(req, res) {
  var a = require('a');
  a.printName(req, res);
}

My assumption was that the printName function exported from module 'a' is executed every time a new request hits Node, and that it gets a different scope object each time.

So having something global inside a module shouldn't affect other requests.

But I see that isn't the case. Can anyone explain how Node handles a module's exported functions (specifically, how it handles the scope of the cached module exports object), and how to avoid sharing global variables across requests within a module?

Edit (we do an async task per request): Our live system receives rapid requests; each one queries Redis and then responds. We see the wrong response mapped to the wrong request: the reply of a Redis lookup, stored in a global variable in the module, gets mapped to a different request. We also have some default values stored as global variables that can be overridden based on request params, and those get mixed up as well.

3 answers
淡お忘
#2 · 2019-04-12 08:14

It really depends on when in the process you assign to name.

If there is an async call between assigning name and using it in the callback, you will have "race conditions" (similar to two threads changing the same object at the same time) even though Node.js is single-threaded.
This is because Node.js will start processing a new request while the async operation is running in the background.

For example, look at the following sequence:

request1 starts processing, sets name to 1
request1 calls an async function 
node.js frees the process, and handles the next request in queue.
request2 starts processing, sets name to 2
request2 calls an async function
node.js frees the process; the async function for request1 is done, so node.js calls its callback.
request1's callback runs; however, at this point name is already set to 2, not 1.
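
The sequence above can be reproduced in a few lines. This is a minimal sketch, not the asker's code: handle, fakeLookup, and the timer delays are hypothetical stand-ins, with setTimeout playing the role of the async lookup.

```javascript
// Module-level variable shared by every simulated request.
let name;

// Hypothetical async lookup; setTimeout stands in for a Redis call.
function fakeLookup(delayMs) {
  return new Promise((resolve) => setTimeout(resolve, delayMs));
}

// Simulated request handler: stores its input in the shared variable,
// waits for the async lookup, then builds the reply from the shared variable.
async function handle(requestName, delayMs) {
  name = requestName;          // "sets name" in the sequence above
  await fakeLookup(delayMs);   // control returns to the event loop here
  return 'Hi ' + name;         // reads whatever the last writer stored
}

// Start two overlapping requests; the first one finishes last.
function demo() {
  const first = handle('alice', 20);
  const second = handle('bob', 5);
  return Promise.all([first, second]);
}

demo().then((replies) => console.log(replies));
// prints [ 'Hi bob', 'Hi bob' ]
```

Both replies say 'Hi bob' because the second request overwrote name before the first request's lookup finished.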

Dealing with async functions in Node.js is very similar to multi-threaded programming: you must take care to encapsulate your data. In general you should try to avoid using global objects, and if you do use them, they should be either immutable or self-contained.

Global objects shouldn't be used to pass state between functions (which is what you are doing).

The solution to your problem is to move name out of the global scope and into an object. The suggested places are the request object, which is passed to almost all functions in the request processing pipeline (this is what connect.js, express.js and all the middleware do), or a session (see the connect.js session middleware), which would allow you to persist data between different requests from the same user.
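
As a rough illustration of keeping state on the request object, here is a minimal sketch. The req and res objects are plain stand-ins rather than real HTTP objects, req.query follows the Express convention, and fakeLookup, simulateRequest, and delayMs are hypothetical names introduced only for this example.

```javascript
// Hypothetical async lookup; setTimeout stands in for a Redis call.
function fakeLookup(delayMs, callback) {
  setTimeout(callback, delayMs);
}

// Handler that keeps its state on the request object, not in a module-level variable.
function printName(req, res) {
  req.name = req.query.name;             // per-request state lives on req
  fakeLookup(req.delayMs, function () {
    res.end('Hi ' + req.name);           // reads this request's own value
  });
}

// Drive the handler with fake req/res objects and collect the reply.
function simulateRequest(name, delayMs) {
  return new Promise(function (resolve) {
    const req = { query: { name: name }, delayMs: delayMs };
    const res = { end: resolve };
    printName(req, res);
  });
}

// Two overlapping requests; the first one finishes last.
Promise.all([simulateRequest('alice', 20), simulateRequest('bob', 5)])
  .then(function (replies) { console.log(replies); });
// prints [ 'Hi alice', 'Hi bob' ]
```

Each callback reads from its own req, so overlapping requests cannot overwrite each other's name.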

【Aperson】
#3 · 2019-04-12 08:20

The first step to understanding what is happening is understanding what's happening behind the scenes. From a language standpoint, there's nothing special about node modules. The 'magic' comes from how node loads files from disk when you require.

When you call require, node either synchronously reads from disk or returns the module's cached exports object. When reading files, it follows a set of somewhat complex rules to determine exactly which file is read, but once it has a path:

  1. Check if require.cache[moduleName] exists. If it does, return that and STOP.
  2. code = fs.readFileSync(path).
  3. Wrap (concatenate) code with the string (function (exports, require, module, __filename, __dirname) { ... });
  4. eval your wrapped code and invoke the anonymous wrapper function.

    var module = { exports: {} };
    eval(code)(module.exports, require, module, path, pathMinusFilename);
    
  5. Save module.exports as require.cache[moduleName].

The next time you require the same module, node simply returns the cached exports object. (This is a very good thing, because the initial loading process is slow and synchronous.)
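
The loading steps above can be imitated with a toy loader. This is a simplified sketch, not node's actual implementation: miniRequire, sources, cache, and loadCount are hypothetical names, the "files" live in memory instead of on disk, and new Function is used in place of eval.

```javascript
// In-memory "files" so the sketch needs no disk access (hypothetical path and contents).
const sources = {
  '/app/counter.js': 'var count = 0; exports.next = function () { return ++count; };',
};

const cache = {};      // plays the role of require.cache
let loadCount = 0;     // how many times a module body has been evaluated

function miniRequire(path) {
  // Step 1: return the cached exports if the module was already loaded.
  if (cache[path]) return cache[path];

  // Step 2: "read" the source (fs.readFileSync in the description above).
  const code = sources[path];
  if (code === undefined) throw new Error('Cannot find module ' + path);

  // Steps 3 and 4: wrap the code in a function and invoke it.
  const wrapper = new Function(
    'exports', 'require', 'module', '__filename', '__dirname', code);
  const mod = { exports: {} };
  const dirname = path.slice(0, path.lastIndexOf('/'));
  wrapper(mod.exports, miniRequire, mod, path, dirname);
  loadCount += 1;

  // Step 5: cache the exports for later calls.
  cache[path] = mod.exports;
  return mod.exports;
}

const first = miniRequire('/app/counter.js');
const second = miniRequire('/app/counter.js');
console.log(first === second);             // true: same cached object
console.log(first.next(), second.next());  // 1 2: both share one `count`
console.log(loadCount);                    // 1: the module body ran once
```

The shared count variable here behaves like the module-level a in the question: every caller of miniRequire gets the same closure.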

So now you should be able to see:

  • Top-level code in a module is only executed once.
  • Since it is actually executed in an anonymous function:
    • 'Global' variables aren't actually global (unless you explicitly assign to global or don't scope your variables with var)
    • This is how a module gets a local scope.
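
These points can be checked against a real require call. The sketch below writes a throwaway module to a temp directory; the file name, its contents, and the variable names scoped and leaked are made up for the demonstration, and the .cjs extension is used so the file is always treated as a CommonJS module.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway module to a temp directory (hypothetical name and contents).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'scope-demo-'));
const file = path.join(dir, 'mod.cjs');
fs.writeFileSync(file, [
  'var scoped = "module-local";   // declared with var: stays inside the wrapper',
  'leaked = "on global";          // no var: ends up on the global object',
  'exports.tag = {};              // created once, when the module first loads',
].join('\n'));

const a = require(file);
const b = require(file);

console.log(a === b);          // true: the second require returned the cached exports
console.log(typeof scoped);    // undefined: not visible outside the module
console.log(global.leaked);    // on global

// Remove the temp directory again.
fs.rmSync(dir, { recursive: true, force: true });
```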

In your example, you require module a for each request, but you're actually sharing the same module scope across all requests because of the module caching mechanism outlined above. Every call to printName shares the same a in its scope chain (even though printName itself gets a new scope on each invocation).

Now in the literal code you have in your question, this doesn't matter: you set a and then use it on the very next line. Control never leaves printName, so the fact that a is shared is irrelevant. My guess is your real code looks more like:

var a;
function printName(req, res) {
  //get param `name` from url;
  a = name;
  getSomethingFromRedis(function(result) {
      res.end('Hi '+a);
  });
}
module.exports.printName = printName;

Here we have a problem because control does leave printName. The callback eventually fires, but another request changed a in the meantime.

You probably want something more like this:

a.js

module.exports = function A() {
    var a;
    function printName(req, res) {
      //get param `name` from url;
      a = name;
      res.end('Hi '+a);
    }

    return {
        printName: printName
    };
}

index.js

var A = require('./a'); // relative path, assuming a.js sits next to index.js
function requestListener(req, res) {
  var a = A();
  a.printName(req, res);
}

This way, you get a fresh and independent scope inside of A for each request.
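
To see the factory hold up when control does leave printName, here is a minimal sketch that combines it with an async callback. The req and res objects are plain stand-ins, and fakeRedisLookup, simulateRequest, and delayMs are hypothetical names; setTimeout stands in for the Redis call.

```javascript
// Hypothetical async lookup; setTimeout stands in for a Redis call.
function fakeRedisLookup(delayMs, callback) {
  setTimeout(callback, delayMs);
}

// Factory in the style of a.js above: each call creates a fresh `a`.
function A() {
  var a;
  function printName(req, res) {
    a = req.name;                        // stand-in for reading `name` from the URL
    fakeRedisLookup(req.delayMs, function () {
      res.end('Hi ' + a);                // closes over this instance's own `a`
    });
  }
  return { printName: printName };
}

// Drive one simulated request through its own A() instance.
function simulateRequest(name, delayMs) {
  return new Promise(function (resolve) {
    var handler = A();                   // new scope per request
    handler.printName({ name: name, delayMs: delayMs }, { end: resolve });
  });
}

// Two overlapping requests; the first one finishes last.
Promise.all([simulateRequest('alice', 20), simulateRequest('bob', 5)])
  .then(function (replies) { console.log(replies); });
// prints [ 'Hi alice', 'Hi bob' ]
```

The trade-off is one extra closure per request, which is usually negligible next to the cost of the lookup itself.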

啃猪蹄的小仙女
#4 · 2019-04-12 08:31

Modules are designed to run once and then be cached. Combined with Node's asynchronous nature, that means another request's a = name can execute before this request's res.end('Hi '+a) runs, so the reply is built from the wrong value.

Ultimately it boils down to one simple fact of JavaScript: global vars are evil. I would not use a global unless it never gets overridden by requests.
