Ajax Url accessed by Google

2019-05-15 04:09发布

问题:

We’re having an issue with Googlebot trying to access a URL on an Ajax function and failing due to some URL encode issue. First of all we’re bit confused why googlebot is trying to access a URL inside a JS function on a JS script.

JS code:

 ajaxFunction(siteid) {
   $.get(location.protocol + '//' + location.hostname + '/ajax/?ajaxscript=detail&siteid='+ siteid, function() { ... });
}

Above function is in a JS script included on our web page which gets called when a link/button is clicked. Googlebot somehow trying to go to the URL generated by the above function directly and getting errors due to “?” character being URL encoded so the siteid value not getting passed.

Example URL that google is trying to access:

 http://www.google.com/url?sa=t&rct=j&q=duo%2Bboots&source=web&cd=4&ved=0CDQQFjAD&url=http%3A%2F%2Fwww.MYSITE.com%2Fajax%2F%253Fajaxscript%3Ddetail%26siteid%3D1 

Do you have any idea why googlebot is trying directly access the URL generated by the JS function and is it possible for googlebot to access ajax based functions and URLs directly? Basically the primary problem is that the ? is getting converted to %2F which is therefore not passing the required data to my script, and this is getting logged as an error in our server error log.

回答1:

Google is getting very curious about these JavaScript redirects, he knows these urls with a full page rendering (including JS), Google Toolbar data or Chrome data.

I always use a prefix to all my AJAX request, e.g. http://domain.com/_ajax/xxxxx, then I forbid all bots to crawl urls starting with /_ajax/ with robots.txt

You could also add a "noindex,nofollow" to the X-Robots-Tag HTTP header.



回答2:

Matt Cutts said a while ago that "Googlebot keeps getting smarter", see also this blog entry and there is even a blogpost on SEOmoz back in 2008.

Googlebot tries to do what your users do and see content so far unreachable. Failing to do so included.

If it's not possible for you to change the arguments before you might be able to parse the request on the serverside with the double encoding in mind?



标签: ajax url seo