I've been looking for ways to do this for a while but haven't quite been able to find the right way to do it.
The task: execute JavaScript from a Linux command line.
For example, have the binary (or whatever is going to interpret the JavaScript) load some .js files, then print the value of some variable.
A more concrete example: I would like to get the final version of this page after its JavaScript has been interpreted and executed: http://www.vureel.com/video/2809/American-Dad. If you look at the page with Firebug, you will see that this obscure JavaScript
<script language="JavaScript" type="text/javascript">/*<![CDATA[*/var a,s,n;function a8bcb4f34dfd6e81cfdb9c115d1671582(s){r="";for(i=0;i<s.length;i++){n=s.charCodeAt(i);if(n<128){n=n ... etc ...</script>
has turned into a nice embed code:
<embed height="390" width="642" flashvars="file=http://vureel-cdn-2.vureel.com/leechingisillegal/537c69afbcaf4c7cf416f30077bbe9d1/4a29621d/here/2809.flv ...etc .../>
This is just an example, but hopefully you see what I'm driving at.
Take a look at the Rhino engine (see Rhino on Wikipedia).
Here are some alternatives:
- Mozilla SpiderMonkey
- JSDB
- wxJavaScript
You may also want to take a look at Node.js.
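As a rough illustration of the "load some .js, then print a variable" part, here is a minimal sketch for Node.js using its built-in vm module. The script text, the variable names and the runScript helper are all made up for the example; with real files you would read the source with fs.readFileSync instead of using an inline string.

```javascript
// Minimal sketch: evaluate JavaScript source outside a browser with
// Node's built-in vm module, then read back a variable it defined.
const vm = require("vm");

// Stand-in for the contents of a .js file (hypothetical example code).
const source = `
  var greeting = "hello";
  function shout(s) { return s.toUpperCase() + "!"; }
  var result = shout(greeting);
`;

// Hypothetical helper: run the code in a fresh context and return the
// object that holds its globals. Top-level `var`s and function
// declarations become properties of that object.
function runScript(code) {
  const sandbox = {};
  vm.createContext(sandbox);
  vm.runInContext(code, sandbox);
  return sandbox;
}

const globals = runScript(source);
console.log(globals.result); // prints HELLO!
```

Save it as, say, run.js and start it with `node run.js`. Note that vm only gives the script its own set of globals; it is not meant as a security boundary, so don't feed it code you don't trust.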
You're sort of driving at two different points: 1) executing JavaScript outside the browser, and 2) viewing the results of JavaScript on a web page.
For the first problem, Mozilla Rhino is a JavaScript interpreter that runs on Java. You can use it to execute JavaScript from the command line.
For the second problem, look at the DOM tab in Firebug, where you can see the resulting document elements after the JavaScript has run.
Or you could enable script debugging, save a local copy of the page and insert a debugger; statement into it.
I think you want to do some scraping while executing JavaScript. env.js, described in http://ejohn.org/blog/bringing-the-browser-to-the-server/, might be helpful. I meant to try it on a tool of mine but couldn't for lack of time, and settled for site-specific scripts.
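To give an idea of what "executing the page's JavaScript while scraping" involves, here is a very reduced sketch for Node.js. It is not env.js and does not use its API: it only fakes document.write, which is nowhere near a full browser environment. The sample page, the regular expression and the renderInlineScripts helper are assumptions made up for this example.

```javascript
// Sketch: run a page's inline scripts with a fake document.write and
// splice whatever they write back into the HTML.
const vm = require("vm");

// Hypothetical page whose script builds markup from character codes,
// loosely mimicking the kind of obfuscation in the question.
const html = `<html><body>
<script type="text/javascript">
var codes = [60, 98, 62, 104, 105, 60, 47, 98, 62];
document.write(String.fromCharCode.apply(null, codes));
</script>
</body></html>`;

// Hypothetical helper: replace each inline <script> element with the
// text it passed to document.write. All scripts share one context, so
// globals defined by an earlier script stay visible to later ones.
function renderInlineScripts(page) {
  const sandbox = { document: { write: null } };
  vm.createContext(sandbox);
  return page.replace(/<script\b[^>]*>([\s\S]*?)<\/script>/gi, (match, code) => {
    const written = [];
    sandbox.document.write = (...chunks) => written.push(chunks.join(""));
    vm.runInContext(code, sandbox);
    return written.join("");
  });
}

console.log(renderInlineScripts(html));
```

Real pages touch far more of the DOM than document.write (and often load external scripts), which is exactly the gap that env.js or a headless browser is meant to fill. As before, only run this on scripts you trust.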
Take a look at http://phantomjs.org/
It's a headless web browser, so you get to construct the DOM and manipulate it like you would in a real browser. You can then export the result.
If you like Python, you can grab ghost.py from GitHub, which lets you create a headless WebKit browser and control it from within your Python script. I've used it interactively through the IPython Notebook and it worked pretty well out of the box. I extended it to work with BeautifulSoup, and it was nice.