Does html5mode(true) affect Google search crawlers?

Posted 2019-02-10 05:20

Question:

I'm reading this specification, which is an agreement between web servers and search engine crawlers that allows dynamically created content to be visible to crawlers. It states that in order for a crawler to index an HTML5 application, one must implement routing using #! in URLs. In Angular, html5mode(true) gets rid of this hashed part of the URL. I'm wondering whether this is going to prevent crawlers from indexing my website.

Answer 1:

Short answer - No, html5mode will not mess up your indexing, but read on.


Important note: Both Google and Bing can crawl AJAX-based content without HTML snapshots.

I know, the documentation you link to says otherwise, but about a year or two ago they officially announced that they handle AJAX content without the need for HTML snapshots, as long as you use pushstates. A lot of the documentation is old and unfortunately has not been updated.

SEO using pushstates

The requirement for AJAX crawling to work out of the box is that you change your URL using pushstates (the HTML5 History API, history.pushState). This is just what html5mode in Angular does (and also what a lot of other frameworks do). When pushstates are in use, the crawlers will wait for AJAX calls to finish and for JavaScript to update the page before they index it. You can even update things like the page title or meta tags in your router and they will be indexed properly. In essence you don't need to do anything; there is no difference between server-side and client-side rendered sites in this case.
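For reference, turning this on in AngularJS looks roughly like the sketch below. The module name app and the function name enableHtml5Mode are placeholders of mine; $locationProvider.html5Mode(true) is the actual switch. By default AngularJS also expects a <base href="/"> tag in the head when html5mode is on, and your server has to return the app's index page for deep links.

```javascript
// Config function: switches $location from hash URLs to History API (pushState) URLs.
function enableHtml5Mode($locationProvider) {
  $locationProvider.html5Mode(true);
}
// Explicit injection annotation so the function survives minification.
enableHtml5Mode.$inject = ['$locationProvider'];

// Register the config block only when AngularJS is actually loaded.
// 'app' is a hypothetical module name; use your own.
if (typeof angular !== 'undefined') {
  angular.module('app', []).config(enableHtml5Mode);
}
```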

To be clear, a lot of SEO analysis tools (such as Moz) will spit out warnings on pages using pushstates. That's because those tools (and their reps, if you talk to them) are, at the time of writing, not up to date, so ignore them.

Finally, make sure you are not using the fragment meta tag described below when doing this. If you have that tag, the crawlers will think that you want to use the non-pushstates method and things might get messed up.

SEO without pushstates

There is very little reason not to use pushstates with Angular, but if you don't, you need to follow the guidelines linked to in the question. In short, you create snapshots of the HTML on your server and make your URL fragments start with "#!" instead of "#". For pages whose URL has no fragment, you opt in with the fragment meta tag instead:

<meta name="fragment" content="!" />

When a crawler finds a page like this, it removes the fragment part of the URL and instead requests the URL with the parameter _escaped_fragment_. You can then serve your snapshotted page in response, giving the crawler a normal static page to index.
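To make the rewrite concrete, here is a rough sketch of the URL a crawler following that scheme would request for a given page URL. toEscapedFragmentUrl is a made-up helper name and example.com is a placeholder. Note that encodeURIComponent escapes more characters than the scheme strictly requires (it also encodes "/" and "="), which should be harmless because the server decodes the value again, but it means the output is not byte-for-byte what a real crawler sends.

```javascript
// Maps a page URL to the "_escaped_fragment_" URL a crawler would request instead.
function toEscapedFragmentUrl(pageUrl) {
  const url = new URL(pageUrl);
  // Only "#!" fragments take part in the scheme. A page that opts in via the
  // fragment meta tag and has no "#!" fragment gets an empty parameter value.
  const fragment = url.hash.startsWith('#!') ? url.hash.slice(2) : '';
  // Append to an existing query string if there is one.
  const separator = url.search ? '&' : '?';
  return url.origin + url.pathname + url.search + separator +
    '_escaped_fragment_=' + encodeURIComponent(fragment);
}

console.log(toEscapedFragmentUrl('http://example.com/#!/products/42'));
// prints "http://example.com/?_escaped_fragment_=%2Fproducts%2F42"
```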

Note that the fragment meta-tag should only be used if you want to trigger this behaviour. If you are using pushstates and want the page to index that way, don't use this tag.

Also, when using snapshots in Angular you can have html5mode on. In html5mode the fragment is hidden, but it still technically exists and will still trigger the same behaviour, assuming the fragment meta tag is set.
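On the server side, both cases can be handled by the same check. A minimal sketch, assuming snapshots are looked up by route; snapshotRouteFor is a hypothetical helper, not part of Angular or any server framework, and how you store and serve the snapshot itself is up to you.

```javascript
// Returns the app route a crawler is asking a snapshot for,
// or null when the request is a normal (non-crawler) request.
function snapshotRouteFor(requestUrl) {
  // The base is only there so that relative request paths can be parsed.
  const url = new URL(requestUrl, 'http://localhost');
  if (!url.searchParams.has('_escaped_fragment_')) {
    return null;
  }
  const fragment = url.searchParams.get('_escaped_fragment_');
  // "#!" URLs: the route travels inside the parameter.
  // html5mode plus meta tag: the parameter is empty and the route is the path.
  return fragment !== '' ? fragment : url.pathname;
}
```

With this, a request for /?_escaped_fragment_=/products/42 and a request for /products/42?_escaped_fragment_= both resolve to the route /products/42, while a plain request for /products/42 returns null and gets the regular app.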

A warning - Facebook crawler

While both Google and Bing will crawl your AJAX pages without problems (if you are using pushstates), Facebook will not. Facebook does not understand AJAX content and still requires special solutions, like HTML snapshots served specifically to the Facebook bot (user agent facebookexternalhit/1.1).
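A minimal sketch of that user-agent check, assuming your server can inspect the User-Agent header before deciding what to send. isFacebookCrawler is a hypothetical helper name; the only thing taken from above is the facebookexternalhit token.

```javascript
// True when the User-Agent header looks like Facebook's link crawler.
function isFacebookCrawler(userAgent) {
  // Match on the product token only, so version changes (1.0, 1.1, ...) still match.
  return /facebookexternalhit/i.test(userAgent || '');
}
```

When this returns true you would respond with the pre-rendered snapshot instead of the JavaScript app.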


Edit - I should probably mention that I have deployed sites with all of these setups: with html5mode, the fragment meta tag and snapshots, and also without any snapshots, relying only on pushstate crawling. It all works fine, except for pushstates and Facebook as noted above.



Answer 2:

To allow indexing of your AJAX application, you have to add a special meta tag to the head section of your document:

<meta name="fragment" content="!" />

Source: https://docs.angularjs.org/guide/$location#crawling-your-app

Look for "Crawling your app" towards the bottom of that page.