There are a lot of cool tools for making powerful "single-page" JavaScript websites nowadays. In my opinion, this is done right by letting the server act as an API (and nothing more) and letting the client handle all of the HTML generation stuff. The problem with this "pattern" is the lack of search engine support. I can think of two solutions:
- When the user enters the website, let the server render the page exactly as the client would upon navigation. So if I go to
http://example.com/my_path
directly the server would render the same thing as the client would if I go to/my_path
through pushState. - Let the server provide a special website only for the search engine bots. If a normal user visits
http://example.com/my_path
the server should give him a JavaScript heavy version of the website. But if the Google bot visits, the server should give it some minimal HTML with the content I want Google to index.
The first solution is discussed further here. I have been working on a website doing this and it's not a very nice experience. It's not DRY and in my case I had to use two different template engines for the client and the server.
I think I have seen the second solution for some good ol' Flash websites. I like this approach much more than the first one and with the right tool on the server it could be done quite painlessly.
So what I'm really wondering is the following:
- Can you think of any better solution?
- What are the disadvantages with the second solution? If Google in some way finds out that I'm not serving the exact same content for the Google bot as a regular user, would I then be punished in the search results?
So, it seem that the main concern is being DRY
<a href="/someotherpage">mylink</a>
, the server rewrites url to your application file, loads it in phantom.js and the resulting html is sent to the bot, and so on...<a>
tags. In this case handling 404 is easier since you can simply check for the existence of the static file with a name that contains url path.Here are a couple of examples using phantom.js for seo:
http://backbonetutorials.com/seo-for-single-page-apps/
http://thedigitalself.com/blog/seo-and-javascript-with-phantomjs-server-side-rendering
Interesting. I have been searching around for viable solutions but it seems to be quite problematic.
I was actually leaning more towards your 2nd approach:
Here's my take on solving the problem. Although it is not confirmed to work, it might provide some insight or idea's for other developers.
Assume you're using a JS framework that supports "push state" functionality, and your backend framework is Ruby on Rails. You have a simple blog site and you would like search engines to index all your article
index
andshow
pages.Let's say you have your routes set up like this:
Ensure that every server-side controller renders the same template that your client-side framework requires to run (html/css/javascript/etc). If none of the controllers are matched in the request (in this example we only have a RESTful set of actions for the
ArticlesController
), then just match anything else and just render the template and let the client-side framework handle the routing. The only difference between hitting a controller and hitting the wildcard matcher would be the ability to render content based on the URL that was requested to JavaScript-disabled devices.From what I understand it is a bad idea to render content that isn't visible to browsers. So when Google indexes it, people go through Google to visit a given page and there isn't any content, then you're probably going to be penalised. What comes to mind is that you render content in a
div
node that youdisplay: none
in CSS.However, I'm pretty sure it doesn't matter if you simply do this:
And then using JavaScript, which doesn't get run when a JavaScript-disabled device opens the page:
This way, for Google, and for anyone with JavaScript-disabled devices, they would see the raw/static content. So the content is physically there and is visible to anyone with JavaScript-disabled devices.
But, when a user visits the same page and actually has JavaScript enabled, the
#no-js
node will be removed so it doesn't clutter up your application. Then your client-side framework will handle the request through it's router and display what a user should see when JavaScript is enabled.I think this might be a valid and fairly easy technique to use. Although that might depend on the complexity of your website/application.
Though, please correct me if it isn't. Just thought I'd share my thoughts.
I think you need this: http://code.google.com/web/ajaxcrawling/
You can also install a special backend that "renders" your page by running javascript on the server, and then serves that to google.
Combine both things and you have a solution without programming things twice. (As long as your app is fully controllable via anchor fragments.)
Use NodeJS on the serverside, browserify your clientside code and route each http-request's(except for static http resources) uri through a serverside client to provide the first 'bootsnap'(a snapshot of the page it's state). Use something like jsdom to handle jquery dom-ops on the server. After the bootsnap returned, setup the websocket connection. Probably best to differentiate between a websocket client and a serverside client by making some kind of a wrapper connection on the clientside(serverside client can directly communicate with the server). I've been working on something like this: https://github.com/jvanveen/rnet/
To take a slightly different angle, your second solution would be the correct one in terms of accessibility...you would be providing alternative content to users who cannot use javascript (those with screen readers, etc.).
This would automatically add the benefits of SEO and, in my opinion, would not be seen as a 'naughty' technique by Google.
If you're using Rails, try poirot. It's a gem that makes it dead simple to reuse mustache or handlebars templates client and server side.
Create a file in your views like
_some_thingy.html.mustache
.Render server side:
Put the template your head for client side use:
Rendre client side: