HTML rendering on server with Node.js

Posted 2020-07-13 07:39

Question:

Suppose I've got a web page that contains nothing but a reference to a JavaScript file. When a browser loads the page it runs the JavaScript, which does the actual rendering. The JavaScript is large, complex, and makes a lot of XHR calls.

Now I need to make this page searchable, i.e. render the page on the server.

I tried to load the page in PhantomJS, but it was slow and sometimes did not render the whole page. So I'm wondering if there is an alternative.

Ideally I need a node.js script to

  • load a web page by URL
  • run the page javascript and then
  • serialize the DOM created by the javascript to HTML.

P.S. I can assume that the javascript is based on React.js

Answer 1:

Essentially, you need to set up a Node.js server that, for every request, responds with the result of rendering the React component as a plain string. The key is React.renderToString. An example in combination with react-router:

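import express from "express";
import React from "react";
import Router from "react-router";
import routes from "../shared/routes";

const app = express();

// set up Jade
app.set('views', './views');
app.set('view engine', 'jade');

// Render every URL on the server. React.renderToString and Router.run come from
// React 0.13 and react-router 0.13; later releases moved rendering to
// ReactDOMServer.renderToString. The JSX below needs a transpiler such as Babel.
app.get('/*', function (req, res) {
  // Match the requested URL against the shared routes to get the root component
  Router.run(routes, req.url, Handler => {
    const content = React.renderToString(<Handler />);
    // The index template must output `content` unescaped (`!= content` in Jade)
    res.render('index', { content: content });
  });
});

const server = app.listen(3000, function () {
  const host = server.address().address;
  const port = server.address().port;

  console.log('Example app listening at http://%s:%s', host, port);
});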

React-router is helpful for loading components based on the URL, but it is not strictly necessary. If you are new to the React ecosystem, I suggest taking a look at this starter kit for isomorphic React applications. What you are trying to do is called isomorphic JavaScript.



Answer 2:

If you are going to hack things, why don't you do this:

  1. Use a light-weight node proxy module.
  2. Inject a small javascript file into the page that is served to the client. You can use harmon for that (https://github.com/No9/harmon).
  3. In that javascript file wait until the page is loaded, then post the rendered HTML back to your server.
  4. On the server, check if you already have that page. If you don't, then store it.
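
Steps 3 and 4 could look roughly like the sketch below on the server side. The `/snapshot` endpoint, the `path` query parameter, and the function names are made up for illustration, and the in-memory Map stands in for whatever storage you actually use.

```javascript
const http = require("http");

// Step 4: keep the first snapshot seen for a page and ignore later ones.
// Returns true when the HTML was stored, false when the page was already known.
function storeSnapshot(store, pagePath, html) {
  if (store.has(pagePath)) return false;
  store.set(pagePath, html);
  return true;
}

// Hypothetical endpoint for step 3: the injected client script would POST the
// rendered HTML to /snapshot?path=/some/page once the page has finished loading.
function createSnapshotServer(store) {
  return http.createServer((req, res) => {
    const url = new URL(req.url, "http://localhost");
    const pagePath = url.searchParams.get("path");
    if (req.method !== "POST" || url.pathname !== "/snapshot") {
      res.writeHead(404);
      res.end();
      return;
    }
    if (!pagePath) {
      res.writeHead(400);
      res.end();
      return;
    }
    let body = "";
    req.setEncoding("utf8");
    req.on("data", (chunk) => { body += chunk; });
    req.on("end", () => {
      // 201 when a new snapshot was stored, 200 when we already had one
      res.writeHead(storeSnapshot(store, pagePath, body) ? 201 : 200);
      res.end();
    });
  });
}

// Usage (not started here): createSnapshotServer(new Map()).listen(3001);
```

On the client side, the injected script could read `document.documentElement.outerHTML` after the app has rendered and POST it to this endpoint.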

You can make a decision about when and how you serve the "frozen" versions of pages versus the dynamic ones.
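
One possible policy, purely as an assumption on my part: serve the frozen copy to crawlers when one exists, and the dynamic app to everyone else. The User-Agent pattern and the function name below are illustrative only.

```javascript
// Illustrative, far from exhaustive: real crawler detection needs more care.
const CRAWLER_PATTERN = /bot|crawler|spider/i;

// Decide what to serve for a request. `store` is a Map of page path -> frozen
// HTML (hypothetical; any key-value storage would do).
function pickResponse(userAgent, pagePath, store) {
  const frozen = store.get(pagePath);
  if (frozen !== undefined && CRAWLER_PATTERN.test(userAgent || "")) {
    return { kind: "frozen", html: frozen };
  }
  // No snapshot yet, or a regular browser: fall through to the dynamic page.
  return { kind: "dynamic" };
}
```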

Note that this makes the stored copies of your React pages static rather than dynamic, but they are searchable. Maybe you want a searchable archive alongside the dynamically rendered, app-like pages; this approach would allow that. It also off-loads the rendering to clients.

There may be issues around logins and confidential information, if for example this were a GMail-type app.

But I didn't read anything in your question that suggests it.



Answer 3:

I think that PhantomJS and good caching are your best hope by far, outside of doing a proper server-renderable architecture (which would be the actual right thing to do). Trying to emulate a browser in node is a fool's errand. You will never finish it, and you will keep running into "oops, I forgot about that one other thing".
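
To give an idea of what "good caching" might mean here, a minimal sketch follows. It assumes a hypothetical async `render(url)` function that does the slow work (for example by running PhantomJS) and resolves to an HTML string; the cache itself is plain JavaScript and the names are mine.

```javascript
// Cache rendered HTML per URL so the slow headless-browser render runs rarely.
// `render` is a hypothetical (url) => Promise<string>; `now` is injectable so
// expiry can be exercised without waiting.
function createRenderCache(render, ttlMs, now = Date.now) {
  const entries = new Map(); // url -> { promise, expires }

  return function getHtml(url) {
    const hit = entries.get(url);
    if (hit && hit.expires > now()) {
      // Fresh entry, or a render still in flight: share the same promise
      return hit.promise;
    }
    const entry = { promise: null, expires: now() + ttlMs };
    entry.promise = new Promise((resolve) => resolve(render(url))).catch((err) => {
      // Do not cache failures: drop the entry so the next request retries
      if (entries.get(url) === entry) entries.delete(url);
      throw err;
    });
    entries.set(url, entry);
    return entry.promise;
  };
}
```

Concurrent requests for the same URL share one render, which matters when a single PhantomJS run takes seconds.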

Many of your peers in industry are faced with this same problem. Don't cobble together some bespoke solution. Either make node rendering first-class by explicitly calling ReactDOMServer.renderToString() and factoring out the browser-side code (XHRs etc.), or use a fully capable headless browser like PhantomJS.
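
As a rough illustration of "factoring out the browser-side code": keep rendering a pure function of data, and inject the data loading. In the sketch below a plain template function stands in for a React component passed to renderToString, so it runs without React; all names are hypothetical.

```javascript
// Escape text before placing it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Pure render step: data in, HTML string out. No XHR, no DOM access.
function renderItems(items) {
  const rows = items.map((item) => `<li>${escapeHtml(item)}</li>`).join("");
  return `<ul>${rows}</ul>`;
}

// Data loading is injected. In the browser `loadItems` could wrap an XHR; on
// the server it could query the backend directly.
async function renderPage(loadItems) {
  const items = await loadItems();
  return `<div id="app">${renderItems(items)}</div>`;
}
```

The same split applies with React: load the data first, then pass it as props to the component you hand to renderToString.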