I would really like to give users some scripting capabilities without giving them access to the more powerful features, like altering the DOM. That is, all input/output is tunneled through a given interface, like a kind of restricted JavaScript.
Example:
If the interface is checkanswer(func),
this is allowed:
checkanswer(function (x, y) {
  return x + y;
});
but these are not allowed:
alert(1)
document.write("hello world")
eval("alert()")
EDIT: What I had in mind was a simple language implemented in JavaScript, something like http://stevehanov.ca/blog/index.php?id=92
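To make the "simple language implemented in JavaScript" idea concrete, here is a minimal sketch of an interpreter for arithmetic expressions. It is only an illustration under stated assumptions: the names tokenize and evaluateExpression are hypothetical, the grammar (numbers, named variables, + - * / and parentheses, no unary minus) is invented for this example, and it is not taken from the linked post. The user's text is never passed to eval, so it can only reach the variables the host hands in.

```javascript
// Hypothetical helper: split the source into numbers, names, operators and parentheses.
// Anything else (dots, quotes, brackets, ...) is rejected immediately.
function tokenize(source) {
  var tokens = [];
  var rest = source.replace(/^\s+/, "");
  while (rest.length > 0) {
    var match = /^(\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*\/()])\s*/.exec(rest);
    if (!match) {
      throw new Error("Unexpected input: " + rest);
    }
    tokens.push(match[1]);
    rest = rest.slice(match[0].length);
  }
  return tokens;
}

// Hypothetical helper: recursive-descent evaluation over the host-supplied variables only.
function evaluateExpression(source, variables) {
  var tokens = tokenize(source);
  var position = 0;

  function parseSum() {
    var value = parseProduct();
    while (tokens[position] === "+" || tokens[position] === "-") {
      var operator = tokens[position++];
      var right = parseProduct();
      value = operator === "+" ? value + right : value - right;
    }
    return value;
  }

  function parseProduct() {
    var value = parseAtom();
    while (tokens[position] === "*" || tokens[position] === "/") {
      var operator = tokens[position++];
      var right = parseAtom();
      value = operator === "*" ? value * right : value / right;
    }
    return value;
  }

  function parseAtom() {
    var token = tokens[position++];
    if (token === undefined) throw new Error("Unexpected end of input");
    if (token === "(") {
      var value = parseSum();
      if (tokens[position++] !== ")") throw new Error("Expected )");
      return value;
    }
    if (/^\d/.test(token)) return parseFloat(token);
    // Own-property check so names like "constructor" are not looked up on the prototype.
    if (/^[A-Za-z_]/.test(token) && Object.prototype.hasOwnProperty.call(variables, token)) {
      return variables[token];
    }
    throw new Error("Unknown token: " + token);
  }

  var result = parseSum();
  if (position < tokens.length) throw new Error("Unexpected token: " + tokens[position]);
  return result;
}

console.log(evaluateExpression("x + y", { x: 5, y: 6 })); // 11
try {
  evaluateExpression("alert(1)", {});
} catch (error) {
  console.log(error.message); // Unknown token: alert
}
```

The trade-off is that every feature the user may need (function calls, comparisons, and so on) has to be added to the grammar by hand, which is exactly why it stays restricted.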
(Edit: This answer relates to your pre-edit question. I don't know of any scripting languages implemented in JavaScript, although I expect there are some. For instance, at one point someone wrote BASIC for JavaScript (I used to have a link, but it rotted). The remainder of this answer is therefore pretty academic, but I've left it for discussion, illustration, and even cautionary purposes. Also, I definitely agree with bobince's points: don't do this yourself, use the work of others, such as Caja.)
If you allow any scripting in user-generated content, be ready to enter an arms race: people will find holes in your protection mechanisms and exploit them, and you will have to respond to each exploit. I think I'd probably shy away from it, but you know your community and your options for dealing with abuse. So if you're prepared for that:
Because of the way that JavaScript does symbol resolution, it seems like it should be possible to evaluate a script in a context where window, document, ActiveXObject, XMLHttpRequest, and similar don't have their usual meanings:
// Define the scoper
var Scoper = (function() {
  var rv = {};
  rv.scope = function(codeString) {
    var window,
        document,
        ActiveXObject,
        XMLHttpRequest,
        alert,
        setTimeout,
        setInterval,
        clearTimeout,
        clearInterval,
        Function,
        arguments;
    // etc., etc., etc.
    // Just declaring `arguments` doesn't work (which makes
    // sense, actually), but overwriting it does
    arguments = undefined;
    // Execute the code; still probably pretty unsafe!
    eval(codeString);
  };
  return rv;
})();
// Usage:
Scoper.scope(codeString);
(Now that uses the evil eval, but I can't immediately think of a way to shadow the default objects cross-browser without using eval, and if you're receiving the code as text anyway...)
But it doesn't work, it's only a partial solution (more below). The logic there is that any attempt within the code in codeString to access window (for instance) will access the local variable window, not the global; and the same for the others. Unfortunately, because of the way symbols are resolved, any property of window can be accessed with or without the window. prefix (alert, for instance), so you have to list those too. This could be a long list, not least because as bobince points out, IE dumps any DOM element with a name or an ID onto window. So you'd probably have to put all of this in its own iframe so you can do an end-run around that problem and "only" have to deal with the standard stuff. Also note how I made the scope function a property of an object, and then you only call it through the property. That's so that this is set to the Scoper instance (otherwise, on a raw function call, this defaults to window!).
But, as bobince points out, there are just so many different ways to get at things. For instance, this code in codeString successfully breaks the jail above:
(new ('hello'.constructor.constructor)('alert("hello from global");'))()
Now, maybe you could update the jail to make that specific exploit not work (mucking about with the constructor properties on all, and I do mean all, of the built-in objects), but I tend to doubt it. And if you could, someone (like Bob) would just come up with a new exploit, like this one:
(function(){return this;})().alert("hello again from global!");
Hence the "arms race."
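To see why shadowing names is not enough, the first escape can be reproduced outside a browser. This is a minimal sketch, not the Scoper code itself: jail is a hypothetical stand-in that shadows only globalThis and Function, and globalThis is used in place of window so the snippet also runs under Node.

```javascript
// Hypothetical miniature jail: shadow two global names, then eval the untrusted code.
function jail(codeString) {
  var globalThis, Function; // shadowed, so both read as undefined inside eval
  return eval(codeString);
}

// Direct access sees the shadowing local variable.
var direct = jail("typeof globalThis");
console.log(direct); // undefined

// 'hello'.constructor is String and String.constructor is the real Function,
// reached without ever naming it. Functions built this way run in global scope,
// where nothing is shadowed.
var escaped = jail("(new ('hello'.constructor.constructor)('return typeof globalThis;'))()");
console.log(escaped); // object
```

The shadowed Function variable never comes into play, because the exploit reaches the constructor through a string literal's prototype chain instead of through a name lookup.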
The only really thorough way to do this would be to have a proper Javascript parser built into your site, parse their code and check for illegal accesses, and only then let the code run. It's a lot of work, but if your use-case justifies it...
T.J. Crowder makes an excellent point about the "arms race." It's going to be very tough to build a watertight sandbox.
It's possible to override certain functions quite easily, though.
Simple functions:
- JavaScript: Overriding alert()
And according to this question, even overriding things like document.write is as simple as
document.write = function(str) {};
If that works in the browsers you need to support (I assume it works in all of them), that may be the best solution.
Alternative options:
Sandboxing the script into an iframe on a different subdomain. It would be possible to manipulate its own DOM and emit alert()s and such, but the surrounding site would remain untouched. You may have to do this anyway, no matter which method(s) you choose.
Parsing the user's code using a white list of allowed functions. Awfully complex to do as well, because there are so many notations and variations to take care of.
There are several methods to monitor the DOM for changes, and I'm pretty sure it's possible to build a mechanism that reverts any changes immediately, quite similar to Windows's DLL management. But it's going to be awfully complex to build and very resource-intensive.
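The whitelist option can be sketched as a pre-check that runs before any user code is evaluated. This is a deliberately conservative illustration, not a vetted security boundary: findViolations, ALLOWED_WORDS and ALLOWED_CHARS are hypothetical names, and the allowed vocabulary is an assumption chosen to fit the checkanswer example. It rejects quotes, brackets and backslashes outright and requires every word, property names included, to be on the list.

```javascript
// Hypothetical allowlist: every word in the source, including property names, must appear here.
var ALLOWED_WORDS = ["function", "return", "x", "y", "Math", "abs", "min", "max"];

// Characters permitted anywhere in the source. Quotes, square brackets and backslashes
// are deliberately absent, so strings, computed property access and escapes are rejected.
var ALLOWED_CHARS = /^[A-Za-z0-9_\s+\-*\/%(){}.,;<>=!&|?:]*$/;

// Returns a list of human-readable problems; an empty list means the check passed.
function findViolations(source) {
  var violations = [];
  if (!ALLOWED_CHARS.test(source)) {
    violations.push("forbidden character");
  }
  var words = source.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) || [];
  for (var index = 0; index < words.length; index++) {
    if (ALLOWED_WORDS.indexOf(words[index]) === -1) {
      violations.push("forbidden word: " + words[index]);
    }
  }
  return violations;
}

console.log(findViolations("function (x, y) { return x + y; }")); // []
console.log(findViolations("alert(1)")); // [ 'forbidden word: alert' ]
console.log(findViolations("(function(){return this;})()").length > 0); // true
```

A check this blunt produces false positives (it flags the e in 1e5, for instance), and anything richer than trivial functions needs a real parser, which is where the complexity mentioned above comes from.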
Not really. JavaScript is an extremely dynamic language with many hidden or browser-specific features that can be used to break out of any kind of jail you can devise.
Don't try to take this on yourself. Consider using an existing ‘mini-JS-like-language’ project such as Caja.
Sounds like you need to process the user-entered data and replace invalid markup based on a whitelist or blacklist of allowed content.
You can do it the same way Facebook did. They preprocess all the JavaScript sources, adding a prefix to every name other than those of their own wrapper APIs.
Another way: use the Google Gears WorkerPool API.
See http://code.google.com/apis/gears/api_workerpool.html, which says:
A created worker does not have access to the DOM; objects like document and window exist only on the main page. This is a consequence of workers not sharing any execution state. However, workers do have access to all JavaScript built-in functions. Most Gears methods can also be used, through a global variable that is automatically defined: google.gears.factory. (One exception is the LocalServer file submitter, which requires the DOM.) For other functionality, created workers can ask the main page to carry out requests.
What about this pattern in order to implement a sandbox?
function safe(code,args)
{
  if (!args)
    args=[];
  return (function(){
    for (i in window)
      eval("var "+i+";");
    return function(){return eval(code);}.apply(0,args);
  })();
}
ff=function()
{
  return 3.14;
}
console.log(safe("this;"));//Number
console.log(safe("window;"));//undefined
console.log(safe("console;"));//undefined
console.log(safe("Math;"));//MathConstructor
console.log(safe("JSON;"));//JSON
console.log(safe("Element;"));//undefined
console.log(safe("document;"));//undefined
console.log(safe("Math.cos(arguments[0]);",[3.14]));//-0.9999987317275395
console.log(safe("arguments[0]();",[ff]));//3.14
That returns:
Number
undefined
undefined
MathConstructor
JSON
undefined
undefined
-0.9999987317275395
3.14
Can you please provide an exploit that defeats this solution? Just to understand and improve my knowledge, of course :)
Thanks!
This is now easily possible with sandboxed iframes:
var codeFunction = function(x, y) {
  alert("Malicious code!"); // blocked: the sandbox does not grant allow-modals
  return x + y;
};
var iframe = document.createElement("iframe");
iframe.sandbox = "allow-scripts";
iframe.style.display = "none";
// The frame reads its arguments from the input object and posts the result back
var html = `<script>
var customFunction = ${codeFunction.toString()};
window.onmessage = function(e) {
  parent.postMessage(customFunction(e.data.x, e.data.y), '*');
};
</script>`;
// Encode the markup so newlines and reserved characters survive URL parsing
iframe.src = "data:text/html," + encodeURIComponent(html);
document.body.appendChild(iframe);
iframe.onload = function() {
  iframe.contentWindow.postMessage({ // Input object
    x: 5,
    y: 6
  }, "*");
};
window.onmessage = function(e) {
  if (e.source !== iframe.contentWindow) return; // ignore unrelated messages
  console.log(e.data); // 11
  document.body.removeChild(iframe);
};