I would really like to provide the user some scripting capabilities, while not giving it access to the more powerful features, like altering the DOM. That is, all input/output is tunneled thru a given interface. Like a kind of restricted javacsript.
Example:
If the interface is checkanswer(func)
this are allowed:
checkanswer( function (x,y)={
return x+y;
}
but these are not allowed:
alert(1)
document.write("hello world")
eval("alert()")
EDIT: what I had in mind was a simple language that was implemented using javascript, something like http://stevehanov.ca/blog/index.php?id=92
Not really. JavaScript is an extremely dynamic language with many hidden or browser-specific features that can be used to break out of any kind of jail you can devise.
Don't try to take this on yourself. Consider using an existing ‘mini-JS-like-language’ project such as Caja.
What about this pattern in order to implement a sandbox?
That returns:
Can you please provide an exploit suitable to attack this solution ? Just to understand and improve my knowledge, of course :)
THANKS!
I got another way: use google gears WorkerPool api
See this
http://code.google.com/apis/gears/api_workerpool.html
Sounds like you need to process the user entered data and replace invalid mark-up based on a white list or black-list of allowed content.
(Edit This answer relates to your pre-edit question. Don't know of any script languages implemented using Javascript, although I expect there are some. For instance, at one point someone wrote BASIC for Javascript (used to have a link, but it rotted). The remainder of this answer is therefore pretty academic, but I've left it just for discussion, illustration, and even cautionary purposes. Also, I definitely agree with bobince's points — don't do this yourself, use the work of others, such as Caja.)
If you allow any scripting in user-generated content, be ready for the fact you'll be entering an arms race of people finding holes in your protection mechanisms and exploiting them, and you responding to those exploits. I think I'd probably shy away from it, but you know your community and your options for dealing with abuse. So if you're prepared for that:
Because of the way that Javascript does symbol resolution, it seems like it should be possible to evaluate a script in a context where
window
,document
,ActiveXObject
,XMLHttpRequest
, and similar don't have their usual meanings:(Now that uses the evil
eval
, but I can't immediately think of a way to shadow the default objects cross-browser without usingeval
, and if you're receiving the code as text anyway...)But it doesn't work, it's only a partial solution (more below). The logic there is that any attempt within the code in
codeString
to accesswindow
(for instance) will access the local variablewindow
, not the global; and the same for the others. Unfortunately, because of the way symbols are resolved, any property ofwindow
can be accessed with or without thewindow.
prefix (alert
, for instance), so you have to list those too. This could be a long list, not least because as bobince points out, IE dumps any DOM element with a name or an ID ontowindow
. So you'd probably have to put all of this in its own iframe so you can do an end-run around that problem and "only" have to deal with the standard stuff. Also note how I made thescope
function a property of an object, and then you only call it through the property. That's so thatthis
is set to theScoper
instance (otherwise, on a raw function call,this
defaults towindow
!).But, as bobince points out, there are just so many different ways to get at things. For instance, this code in
codeString
successfully breaks the jail above:Now, maybe you could update the jail to make that specific exploit not work (mucking about with the
constructor
properties on all — all — of the built-in objects), but I tend to doubt it. And if you could, someone (like Bob) would just come up with a new exploit, like this one:Hence the "arms race."
The only really thorough way to do this would be to have a proper Javascript parser built into your site, parse their code and check for illegal accesses, and only then let the code run. It's a lot of work, but if your use-case justifies it...
T.J. Crowder makes an excellent point about the "arms race." It's going to be very tough to build a watertight sandbox.
it's possible to override certain functions, though, quite easily.
Simple functions:
And according to this question, even overriding things like
document.write
is as simple asif that works in the browsers you need to support (I assume it works in all of them), that may be the best solution.
Alternative options:
Sandboxing the script into an IFrame on a different subdomain. It would be possible to manipulate its own DOM and emit alert()s and such, but the surrounding site would remain untouched. You may have to do this anyway, no matter which method(s) you choose
Parsing the user's code using a white list of allowed functions. Awfully complex to do as well, because there are so many notations and variations to take care of.
There are several methods to monitor the DOM for changes, and I'm pretty sure it's possible to build a mechanism that reverts any changes immediately, quite similar to Windows's DLL management. But it's going to be awfully complex to build and very resource-intensive.