
AJAX-generated content, crawling and blacklisting

Published 2019-08-18 07:45

Question:

My website uses AJAX.

I have a user list page that lists users in an AJAX table (with paging, extra information, and so on).

The URL of this page is: /user-list

The user list is built by AJAX. When a visitor clicks on a user, they are redirected to a page whose URL is: /member/memberName

So AJAX is used here to generate content, not to manage navigation (with the # character).

I want to detect bots so that all pages get indexed.

So, for normal visitors I want to display an AJAX table with paging and nice AJAX effects (more info...), and when I detect a bot I want to display all users (without paging) with a link to each member page, like this:

<a href="/member/john">John</a><a href="/member/bob">Bob</a>...
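A minimal sketch of that technique, assuming hypothetical helpers (renderFlatList, renderAjaxTable, userListBody) and an illustrative user-agent list; this is not code from the real site:

const BOT_PATTERN = /googlebot|bingbot|slurp|duckduckbot/i; // illustrative, not a complete list

function isCrawler(userAgent: string | undefined): boolean {
  return BOT_PATTERN.test(userAgent ?? "");
}

function renderFlatList(users: string[]): string {
  // No paging: one plain link per member so the bot can reach every /member/<name> page.
  return users.map(name => `<a href="/member/${name}">${name}</a>`).join("");
}

function renderAjaxTable(): string {
  // Placeholder markup; the real paged table is built by the AJAX script.
  return `<div id="user-table"></div><script src="/js/user-table.js"></script>`;
}

// In the /user-list handler, pick one or the other based on the User-Agent header:
function userListBody(userAgent: string | undefined, users: string[]): string {
  return isCrawler(userAgent) ? renderFlatList(users) : renderAjaxTable();
}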

Do you think I could be blacklisted with this technique? If so, could you please suggest an alternative that keeps these clean URLs and does not require redeveloping the user list page without AJAX?

Answer 1:

Google supports a specification to make AJAX crawlable:

http://code.google.com/web/ajaxcrawling/docs/specification.html
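In short, the scheme maps "#!" URLs to an _escaped_fragment_ query parameter: a crawler that sees /user-list#!page=2 requests /user-list?_escaped_fragment_=page=2 instead, and expects a static HTML snapshot back. A rough sketch of the server-side check, with hypothetical render helpers standing in for the real page code:

function escapedFragmentOf(requestUrl: string): string | null {
  // Non-null only when the request comes from a crawler following the scheme.
  return new URL(requestUrl, "http://localhost").searchParams.get("_escaped_fragment_");
}

function renderSnapshot(state: string): string {
  // Serve the same markup the AJAX code would have produced for this state.
  return `<html><body>User list snapshot for "${state}"</body></html>`;
}

function renderAjaxShell(): string {
  // The normal AJAX page for human visitors.
  return `<div id="user-table"></div><script src="/js/user-table.js"></script>`;
}

function handleUserList(requestUrl: string): string {
  const fragment = escapedFragmentOf(requestUrl);
  return fragment !== null ? renderSnapshot(fragment) : renderAjaxShell();
}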

I did an experiment and it works:

http://seo-website-designer.com/SEO-Ajax-Google-Solution

As this is a Google specification, you won't get penalised (unless you abuse it).

That said, only Google supports it at the moment (AFAIK).

Also, I believe following the concept of Progressive Enhancement is a better approach: create a working HTML website first, then use JavaScript to enhance it.
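For example, something along these lines, assuming the server already renders /user-list as plain HTML with ordinary pagination links inside a #user-table element (all selectors here are illustrative). Without JavaScript, or for a bot, the links simply work; with JavaScript, clicks are intercepted and the table is refreshed in place:

document.querySelectorAll<HTMLAnchorElement>("#user-table a.page-link").forEach(link => {
  link.addEventListener("click", async event => {
    event.preventDefault();                  // the href stays as the crawlable fallback
    const response = await fetch(link.href); // fetch the same server-rendered page
    const html = await response.text();
    const table = document.querySelector("#user-table");
    if (table) table.innerHTML = html;       // naive swap; a real app would extract just the fragment
  });
});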



Answer 2:

Maybe use regular <a href=""></a> URLs with an onclick to trigger your AJAX scripting? Like:

<a href="/some/url" onclick="YourFancyFunction();return false;">Some URL</a>

I don't think Google would punish you for this: you primarily use JavaScript, but you provide a fallback for their bot, so your site doesn't get any less accessible.

EDIT
OK, I misunderstood. Then my guess would be that you basically have two options:

1. Write a separate part of your site where bots end up, or
2. Rewrite your current site to always serve a 'full' page, with an option to request only, say, the content div (see the sketch below). Then your JavaScript can fetch just the content, but bots will always get a complete page.