Question:
I'm trying to run untrusted JavaScript code on Linux with Node.js using the sandbox module, but it's broken. All I need is to let users write JavaScript programs that print out some text. No other I/O is allowed, and only plain JavaScript may be used, with no other Node modules. If that isn't really possible, what other language would you suggest for this kind of task? The minimal feature set I need is some math, regexes, string manipulation, and basic JSON functions. Scripts will run for, say, 5 seconds at most, after which the process would be killed. How can I achieve that?
Answer 1:
I recently created a library for sandboxing untrusted code that seems to fit these requirements. It executes the code in a restricted process under Node.js, and in a Worker inside a sandboxed iframe in a web browser:
https://github.com/asvd/jailed
You can export a chosen set of methods from the main application into the sandbox, which lets you provide any custom API and set of privileges (that feature was actually the reason I decided to write a library from scratch). The math, regexp, and string functionality you mention is provided by JavaScript itself; anything additional can be explicitly exported from outside (for example, a function for communicating with the main application).
Answer 2:
The basic idea of a sandbox is that a script needs predefined global variables to do anything, so if you deny it those globals by unsetting them, or replace them with controlled versions, it cannot escape. As long as you don't forget anything.
First, deny require() or replace it with something controlled. Don't forget about process and global (a.k.a. root). The difficult part is not forgetting anything, which is why it's good to rely on someone else having built a sandbox ;-)
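A minimal sketch of that idea, assuming Node's built-in vm module is used to create a fresh global object. The names runRestricted and print are made up for illustration. Note that the Node.js documentation states that vm is not a security mechanism, so this only shows the "deny the globals" principle and a time limit, not a safe sandbox:

```javascript
const vm = require('vm');

// Hypothetical helper: runs `source` with a global object that contains only
// `print`, and aborts the script after `timeoutMs` milliseconds.
function runRestricted(source, timeoutMs = 5000) {
  const output = [];
  const sandbox = {
    // The only I/O the script gets: collecting text.
    print: (text) => { output.push(String(text)); },
  };
  // require, process and global are not defined in the new context.
  // Caveat: host objects such as `print` can still be used to reach the
  // host realm, so this is NOT sufficient for hostile code.
  vm.runInNewContext(source, sandbox, { timeout: timeoutMs });
  return output;
}

console.log(runRestricted("print(typeof require + ' ' + typeof process)"));
console.log(runRestricted("print(JSON.stringify({ max: Math.max(1, 2) }))"));
```

Math, JSON, regexes and string methods are still available because they belong to the language and are created fresh for each context.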
Answer 3:
If you can afford the performance hit, you could run the JS in a throwaway virtual machine with the appropriate CPU and memory limits.
Of course, then you are trusting the security of the VM solution. By using it together with an ordinary JS sandbox, you'd have two layers of security.
For an additional layer, put the sandbox on a different physical machine than your main app.
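As a much lighter-weight relative of the same idea (not a virtual machine), the time and memory limits can be sketched with a child process. This is only an illustration under assumptions: runWithLimits is a made-up name, the 64 MB heap cap is arbitrary, and the child still has full access to Node's APIs, so it would have to be combined with a real sandbox or VM:

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: runs `source` in a separate Node.js process, kills it
// after `timeoutMs` milliseconds, and caps the V8 heap at 64 MB.
function runWithLimits(source, timeoutMs = 5000) {
  const result = spawnSync(
    process.execPath,
    ['--max-old-space-size=64', '-e', source],
    {
      timeout: timeoutMs,      // kill the child when the time is up
      killSignal: 'SIGKILL',   // cannot be caught or ignored by the child
      encoding: 'utf8',        // return stdout as a string
      maxBuffer: 64 * 1024,    // limit how much output is collected
    }
  );
  const timedOut = Boolean(result.error && result.error.code === 'ETIMEDOUT');
  return { stdout: result.stdout, timedOut };
}

console.log(runWithLimits("console.log('hello')"));
console.log(runWithLimits('while (true) {}', 1000).timedOut);
```

The second call shows the kill-after-timeout behaviour the question asks about: the endless loop is terminated once the limit is reached.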
Answer 4:
Docker is an awesome new kid on the block that uses LXC and cgroups to create sandboxes.
One implementation is an online gist service (similar to codepad.org) built with Docker and Go.
It demonstrates that untrusted code written in many programming languages, including Node.js, can be run inside Docker containers.
Answer 5:
I know it's pretty late to answer this question, but the tool below might be worth adding since it isn't mentioned in the other answers or comments.
I'm trying to implement a similar use case. After going through the resources on the web, https://www.npmjs.com/package/vm2 seems to handle the Node.js sandbox environment pretty well.
It pretty much covers the sandboxing features you'd want, such as restricting access to built-in or external modules and exchanging data with the sandbox.
Answer 6:
Ask yourself these questions:
- Are you one of the smartest people on the planet?
- Do you routinely turn down job offers from Google, Mozilla and Kaspersky Lab because they would bore you?
- Does the "untrusted code" come from people working at the same company as you or from criminals and bored computer kids all over the globe?
- Are you sure that node.js has no security holes that could leak through your sandbox?
- Can you write perfect, 100% error-free code?
- Do you know everything about JavaScript?
As you already know from your experiments with the sandbox module, writing your own sandbox isn't trivial. The main problem with sandboxes is that you must get everything right. One mistake will ruin your security completely, which is why browser developers fight a constant battle with crackers all over the globe.
That said, simple sandboxes are pretty easy to do. First, you'll need to write your own JavaScript interpreter, because you can't use the one from Node.js on account of eval() and require() (both would allow crackers to escape your sandbox).
The interpreter must make sure that the interpreted code cannot access anything besides the few global symbols that you provide. This means there can't be an eval() function, for example (or you must make sure that this function is only evaluated in the context of your own JavaScript interpreter).
Drawback of this approach: A lot of work and if you make a mistake in your interpreter, the crackers can leave the sandbox.
Another approach is to clean the code and run it with Node.js's eval(). You can clean existing code by running a bunch of regexps over it, for example replacing every match of /eval\s*[(]/g with an empty string, to remove malicious code parts.
Drawback of this approach: It's easy to make a mistake that will leave you vulnerable to an attack. For example, there might be a mismatch between what the regexp and what Node.js think of as "whitespace". Some obscure Unicode whitespace might be accepted by the interpreter but not by the regexp, which would allow an attacker to run eval().
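To make that drawback concrete, here is a small sketch. The stripEval function is a hypothetical filter built from the regexp above, and the two bypasses are just examples of the kind of gap that is easy to miss:

```javascript
// Hypothetical filter: strips anything that looks like a direct eval( call.
function stripEval(source) {
  return source.replace(/eval\s*[(]/g, '');
}

const direct = "eval('1 + 1')";
// A comment between the name and the parenthesis is not whitespace to \s.
const viaComment = "eval/* gap */('1 + 1')";
// The name can be assembled at run time, so it never appears in the source.
const viaLookup = "globalThis['ev' + 'al']('1 + 1')";

console.log(stripEval(direct));                    // the direct call is mangled
console.log(stripEval(viaComment) === viaComment); // true: passes through untouched
console.log(stripEval(viaLookup) === viaLookup);   // true: passes through untouched
```

Both untouched strings are still valid JavaScript that calls eval when executed, so the filter gives a false sense of safety.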
My suggestion: Write a small demo test case that shows how the sandbox module is broken and have it fixed. It will save you a lot of time and effort and if there is a bug in the sandbox, it won't be your fault (well, not entirely at least).
Answer 7:
I am facing a similar problem right now, and I'm reading only bad things about the sandbox module.
If you don't need anything specific to the Node environment, I think the best approach is to use a headless browser such as PhantomJS or Chimera as the sandbox environment.
Answer 8:
A late answer but maybe an interesting idea.
Static code analysis => AST manipulation => Code generation
- Static analysis parses the source code into an AST. The AST provides a common data structure that lets us traverse and modify the source code.
- Via AST manipulation, we can find all identifier references to sensitive variables in the outer scopes. If needed, we can re-declare and initialize them at the beginning of the function body so that they are overwritten. That way, all references from the inside to the outside are under control.
- Generating code from the AST is easy as well.
For instance, a function is as shown below:
function userCode() {
    a = 1;
    window.b = 1;
    eval('window.c()');
}
Static analysis based on a JS parser lets us insert variable declaration statements before the original function body:
function userCode() {
    var a, window = {}, eval = function () {}; // variable overwriting
    a = 1;
    window.b = 1;
    eval('window.c()');
}
That's it.
More overwritings should be considered, such as eval(), new Function() and other global objects or APIs. And warnings during parsing should be well organized and reported.
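The overwriting step can be sketched without a parser, under a big simplifying assumption: the list of names to shadow is supplied by hand instead of being discovered through AST analysis. The helper name withShadowedNames is made up for this illustration:

```javascript
// Hypothetical helper: prepends `var` declarations that shadow the given
// names, then compiles the result as a (non-strict) function body.
function withShadowedNames(body, names) {
  const declarations = names.length > 0 ? `var ${names.join(', ')};\n` : '';
  return new Function(declarations + body);
}

// Inside the function, `process` now refers to a local, undefined variable.
const shadowed = withShadowedNames('return typeof process;', ['process', 'eval']);
console.log(shadowed()); // "undefined"

// One path that shadowing alone does not cover: in non-strict code, `this`
// in a plain function call is the global object.
const leaky = withShadowedNames('return typeof this.process;', ['process']);
console.log(leaky()); // "object" under Node.js
```

The second call is an example of why more overwritings and checks have to be considered: shadowing identifiers does not block every route to the outer scope.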
Some related work in order:
- esprima, ECMAScript parsing infrastructure for multipurpose analysis.
- estraverse, ECMAScript JS AST traversal functions.
- escope, ECMAScript scope analyzer.
- escodegen, ECMAScript code generator.
My own implementation based on the above is function-sandbox.