I'm generating a typical Web 2.0 HTML page with PHP: it contains a lot of <script>
tags and JavaScript code that substantially changes the DOM after the load event.
Is there a way to get the final HTML code directly from PHP, without opening the page with any browser?
For example, let's say the HTML for the page is (it's just an example):
<html>
<head>
<script>...the jquery library code...</script>
<script>$(document).ready(function() { $("body").append("<p>Hi!</p>"); });</script>
</head>
<body>
</body>
</html>
This HTML is saved in the $html PHP variable. Now, I want to pass that variable to some function that will return $result = <html>....<body><p>Hi!</p></body></html>.
Is this possible?
EDIT: since many of you were perplexed by my request, I'll explain the reason. Unfortunately, everything user-facing was made in JavaScript, and this makes the website uncrawlable by search engines. So I wanted to send them the HTML as it looks after the ready event instead.
You have 2 problems:
To execute javascript you will need a javascript engine. There are currently 3 available for your use:
Once you have a JavaScript engine you will need to manage the DOM (Document Object Model). This allows you to parse HTML into objects like DOM nodes, text nodes, elements, etc. On top of that you will need to sync your DOM with the JavaScript engine and install a DOM library in the engine. Though there may be various ways to do this, I prefer to simply include / evaluate a standalone JavaScript DOM in the engine and pass the HTML to that.
Now that you have both a JavaScript Engine and DOM library, you can now evaluate most scripts without issue.
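To make the "engine plus DOM library" split concrete, here is a minimal sketch. Everything in it (createElement, createDocument, serialize, runScript) is a hypothetical stand-in written for this illustration, not a real DOM library, and it skips HTML parsing and escaping entirely. It only shows that a script can run once the engine is handed some object that behaves like a document.

```javascript
// Hypothetical, deliberately tiny DOM: just enough for appendChild and serialization.
function createElement(tagName) {
  return {
    tagName,
    textContent: "",
    children: [],
    appendChild(child) {
      this.children.push(child);
      return child;
    },
  };
}

function createDocument() {
  return { body: createElement("body"), createElement };
}

// Turn the object tree back into markup (no escaping, illustration only).
function serialize(node) {
  const inner = node.textContent + node.children.map(serialize).join("");
  return "<" + node.tagName + ">" + inner + "</" + node.tagName + ">";
}

// "Syncing" the DOM with the engine: the script only sees what is passed in.
function runScript(source, doc) {
  new Function("document", source)(doc);
}

const doc = createDocument();
runScript(
  'var p = document.createElement("p"); p.textContent = "Hi!"; document.body.appendChild(p);',
  doc
);
const html = serialize(doc.body);
console.log(html);
```

A real page script expects far more of the DOM than this (selectors, events, jQuery's feature detection), which is why a full standalone DOM implementation is needed in practice.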
Best Answer
Node.js, which comes as a standalone executable, bundles a JavaScript engine (V8), and DOM implementations written in JavaScript (jsdom, for example) can be loaded into it, so you get both pieces in one place. On top of that you can also use it as a web server. Perhaps this is a better solution to your problem; however, if PHP is a must, stick to what is mentioned above.
I doubt that there are good general-purpose server-side runtimes for browser JavaScript, in general or for PHP specifically. For complicated client scripts there is no such thing as a "final DOM state". Imagine that some DOM-updating method is scheduled with setTimeout. Do you want to wait for it? And if it reschedules some other update in the same way (for example, just to show the current time somewhere on the page), how long are you going to wait? And what if the page does some AJAX data downloading? Do you want to do actual server requests, emulate cookies, etc.? I think this is all too complicated to be implemented in a good way. Well, maybe Google has something like this in their crawler, but it is specialized for their particular needs.
The best solution that I could find is to use HtmlUnit http://htmlunit.sourceforge.net/ on the server to execute your HTML with the JavaScript and get back the final HTML that the user would see in the browser.
The library has good support for JavaScript and is headless so you should be able to run it on the server.
You would need to write a small Java wrapper that accepts input via the command line, passes it on to HtmlUnit for processing, and returns the result to you. You could then call this wrapper from PHP.
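Whatever ends up running the scripts, the setTimeout objection raised above still has to be answered with some cut-off. The following is a minimal sketch of that idea, assuming a made-up virtual clock (createVirtualTimers and runUntil are hypothetical names, not part of HtmlUnit or any other library): timers are queued instead of waited for, and the renderer stops at a fixed time budget.

```javascript
// Hypothetical virtual clock: timers are queued, never actually waited for.
function createVirtualTimers() {
  const queue = [];
  let now = 0;
  let seq = 0;
  return {
    setTimeout(fn, delay = 0) {
      queue.push({ at: now + delay, seq: seq++, fn });
    },
    // Run timers in due-time order until the queue is empty or the budget is used up.
    // Returns true if the page settled, false if timers were still pending.
    runUntil(budgetMs) {
      while (queue.length > 0) {
        queue.sort((a, b) => a.at - b.at || a.seq - b.seq);
        if (queue[0].at > budgetMs) return false;
        const next = queue.shift();
        now = next.at;
        next.fn();
      }
      return true;
    },
  };
}

const timers = createVirtualTimers();
const page = { greeting: "", ticks: 0 };

// One-shot update: finishes on its own.
timers.setTimeout(() => { page.greeting = "Hi!"; }, 100);

// Clock-style update: reschedules itself forever.
function tick() {
  page.ticks += 1;
  timers.setTimeout(tick, 1000);
}
timers.setTimeout(tick, 1000);

// Snapshot the page after a simulated 5 seconds, settled or not.
const settled = timers.runUntil(5000);
console.log(settled, page);
```

The one-shot update is captured, while the self-rescheduling one never settles, so the snapshot is simply whatever the page looked like when the budget ran out. Picking that budget is a judgment call, not something the page can tell you.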
This question is very similar to how to execute JavaScript in JavaScript, or PHP in PHP. The answer is that you can eval it. If PHP could eval JavaScript and JavaScript could eval PHP, we would not be having this discussion.
In order for JavaScript to eval PHP, it has to parse the PHP code into a structure that represents the script. JavaScript can easily do this with JavaScript object notation (not the JSON format but the actual representation), functionally breaking down the script.
Here's a naive example of JavaScript interpreting PHP (a more honest example would not be so contrived: it would parse the PHP into its own JSON-like representation or possibly bytecode, then interpret that representation on a JavaScript emulation of the PHP virtual machine, but nonetheless):
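The original snippet is not available here, so what follows is a stand-in sketch rather than the author's code. It assumes a made-up subset of PHP consisting only of echo statements with double-quoted literals; parseTinyPhp and interpret are hypothetical names. The point is just the two steps described above: parse into a structure, then interpret the structure.

```javascript
// Step 1: parse a tiny, made-up subset of PHP (only `echo "literal";`)
// into a plain object structure.
function parseTinyPhp(source) {
  const program = [];
  const statement = /echo\s+"([^"]*)"\s*;/g;
  let match;
  while ((match = statement.exec(source)) !== null) {
    program.push({ type: "echo", value: match[1] });
  }
  return program;
}

// Step 2: interpret that structure, collecting what PHP would have printed.
function interpret(program) {
  let output = "";
  for (const node of program) {
    if (node.type === "echo") {
      output += node.value;
    }
  }
  return output;
}

const program = parseTinyPhp('<?php echo "Hello, "; echo "world"; ?>');
const output = interpret(program);
console.log(JSON.stringify(program), output);
```

Anything beyond this toy subset (variables, escapes, control flow) needs a real tokenizer and parser rather than a regular expression.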
The problem is that PHP is not JavaScript, and suffers from the fact that its eval is much weaker than JavaScript's.
One way around this is to have PHP ask JavaScript to do the tokenizing: JavaScript can easily create a "JSONified" version of anything (as long as it is not native), so you could have PHP send a request to a Node.js server with the script you want to evaluate.
for instance:(PHP code)
JavaScript can easily eval it to a "function object" by doing:
As you can see, JS can easily parse it into an object and back into a string, with the minor nuisance that '(' and ')' must be added in order to eval() it without causing the error "Uncaught SyntaxError: Unexpected token (".
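The snippets this paragraph refers to are missing from this copy, so here is a small sketch of the round trip it describes. The source string is an arbitrary example chosen for the illustration.

```javascript
// Source text of a function, as it might arrive from PHP as a plain string.
const source = "function (a, b) { return a + b; }";

// Without parentheses the engine reads `function` as the start of a
// declaration, which needs a name, so eval() throws a SyntaxError.
let failedWithoutParens = false;
try {
  eval(source);
} catch (e) {
  failedWithoutParens = e instanceof SyntaxError;
}

// Wrapped in parentheses it is parsed as an expression and yields a function object.
const add = eval("(" + source + ")");

// And the function object can be turned back into source text.
const roundTripped = add.toString();

console.log(failedWithoutParens, add(2, 3), roundTripped);
```

The same caveat as with any eval applies: only do this with code you trust.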
Regardless, a similar thing can be done in PHP:
Knowing this, you have to have JavaScript convert the JavaScript function object into a PHP function object, or go the easy route of just translating.
The problem lies in the fact that JavaScript (ES6) is much more expressive than PHP (5.6; PHP 7 might be better, but it doesn't run on Windows 7 without Service Pack 1, so I can't run it on this computer). This in turn means that JavaScript has a lot of features that PHP does not have, for example:
This won't work on PHP 5.6 because it does not support self-executing functions. This means you need to do more work to translate it into:
There are also issues in that PHP doesn't really use prototypes the way javascript does, so it's very hard to do translation of that.
Anyway, ultimately php and javascript are VERY similar, so much so that you can essentially use one in another with exceptions.
e.g: (PHP)
/* can't describe as a function as far as I know, since not prototypical */
class console {
    static function log($text) {
        echo $text . "\n";
    }
}

call_user_func(function() {
    $myScopeVariable = "Hey, this isn't JavaScript!";
    console::log($myScopeVariable);
});
e.g. JavaScript:
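The JavaScript half of this comparison is missing from this copy. A plausible counterpart to the PHP above, written as a sketch and not recovered from the original, is a self-executing function; the lines array and log helper are additions for the illustration so the output can be inspected.

```javascript
// Collect what is logged so the result can be inspected afterwards.
const lines = [];
function log(text) {
  lines.push(text);
  console.log(text);
}

// Self-executing function: defined and called in a single expression,
// the counterpart of call_user_func(function() { ... }) in the PHP version.
(function () {
  var myScopeVariable = "Hey, this isn't PHP!";
  log(myScopeVariable);
})();

// The variable stays private to the function's scope.
const leaked = typeof myScopeVariable !== "undefined";
```

Here console.log already exists, so no class has to be declared first, which is the asymmetry the PHP version works around.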
Conclusion
You can translate between PHP and JavaScript, but it is much easier to translate PHP to JavaScript than JavaScript to PHP, because JavaScript is more expressive natively, whereas PHP has to create classes to represent many JavaScript constructs (funnily enough, PHP can preprocess PHP to fix all of these problems).
Fortunately, PHP can now make sense of JSON natively, so after JavaScript evaluates itself, it can read the resulting structure (most things in JavaScript are objects or functions), including the source code, and put those objects into JSON-encoded form. After that, you can make PHP parse the JSON to recover the code via a neutral form.
e.g.
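The example that followed here is missing, so this is a sketch of the JavaScript side only. The "$function" key and the encode/decode helpers are conventions invented for this illustration, not an existing format.

```javascript
// Encode: functions are replaced by an object carrying their source text.
function encode(value) {
  return JSON.stringify(value, (key, v) =>
    typeof v === "function" ? { $function: v.toString() } : v
  );
}

// Decode: objects of that shape are turned back into function objects.
// Closures are not preserved, and eval must only see trusted input.
function decode(json) {
  return JSON.parse(json, (key, v) =>
    v && typeof v === "object" && typeof v.$function === "string"
      ? eval("(" + v.$function + ")")
      : v
  );
}

const original = {
  name: "adder",
  run: function (a, b) { return a + b; },
};

const wire = encode(original);   // plain JSON text, safe to send to PHP
const restored = decode(wire);   // what another JavaScript process could rebuild

console.log(wire, restored.run(2, 3));
```

On the PHP side, json_decode would give back the structure with the function source as an ordinary string; turning that string into runnable PHP is the translation problem discussed above.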
Essentially, communicating via a "Common LISP", so to speak. Of course this is going to be very expensive and not native, but it's fine for demonstrating an example. Ideally we would have a native module encapsulating scripting of all kinds that could easily translate Ruby to PHP to Perl to Python to JavaScript (and then compile the result to C for the heck of it). JavaScript comes close to this by being able to eval itself as well as print its own code. If all languages could do both of these things, it would be much easier to accomplish, but sadly JavaScript is only "almost there" (there is no un-eval function; you can easily invent it, but it's not there yet).
As for updating the DOM: PHP can do it as easily as JavaScript can. The problem is that neither JavaScript nor PHP has any idea what the DOM is; it's just that in the browser, the DOM is conveniently hooked up as the "window" object. You simply act as though the window is there, and as the PHP gets evaluated to JavaScript, it will gain access to the DOM again. To make use of the DOM, however, the code has to be "callback oriented", since it won't get the DOM until it is evaluated. That's not bad: you just don't do anything until the evaluation is complete, and then perform the whole action at once after the DOM is available.
The code would look something like:
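The original code is not included in this copy. As a sketch of the "callback oriented" style described above (deferUntilDom and fakeWindow are hypothetical names, and fakeWindow is only a stand-in so the example runs outside a browser):

```javascript
// Describe the DOM work now; run it later, once a window exists.
function deferUntilDom(action) {
  return function run(window) {
    return action(window.document);
  };
}

// Nothing touches the DOM at this point: this only builds the deferred action.
const appendGreeting = deferUntilDom(function (document) {
  const p = document.createElement("p");
  p.textContent = "Hi!";
  document.body.appendChild(p);
  return document.body;
});

// Hypothetical stand-in for a browser window, only for this demonstration.
const fakeWindow = {
  document: {
    body: {
      children: [],
      appendChild(node) {
        this.children.push(node);
        return node;
      },
    },
    createElement(tagName) {
      return { tagName, textContent: "" };
    },
  },
};

// The whole action is performed at once, after the DOM is available.
const body = appendGreeting(fakeWindow);
console.log(body.children);
```

In a browser the same deferred action would be called with the real window object instead of the stand-in.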
Although the proper way to do this is to have the function evaluate to a promise (promises are universal... as soon as you implement them in all languages...). After that it just becomes a question of juggling promises/intents that are essentially language independent (to be specific, the intent is language independent; once the intent is translated, it will require dependencies that may or may not be provided to actually perform the sequence from start to finish).
Hopefully someday we will see a future where JavaScript can evaluate PHP and PHP can evaluate JavaScript seamlessly, at the very least to complete the confusion circle allowing us to write client side php and server side javascript(we're half way there!)
some ending thoughts
PHP, Perl, Lisp, and other lambda calculus synonyms need their own built-in variant of JSON. It's basically eval and uneval, but simpler in that it doesn't take care of the more exciting data structures like functions (which JavaScript can uneval somewhat using toString, and Perl can using Data::Dumper with Data::Dumper::Deparse set to 1).
Every lambda calculus synonym language (PHP, Perl, Lisp, ...: any language where the statement (function(a){return function(b){return a + b;}})(2)(3) makes sense) should be able to encode a string of valid code into a common abstract representation that can be decoded by any other lambda calculus synonym language. (Naively, even assembly can do this with stack digging, so it is somewhat of a lambda calculus synonym language too, and could also have its own variant of JSON.)
There are new servers that run JavaScript server-side and are able to manipulate the DOM, but they have nothing to do with PHP.
http://jaxer.org/
To evaluate JavaScript code using PHP, have a look at the V8 JavaScript engine extension, which you may compile into your PHP binary:
V8 is Google's open source JavaScript implementation.