I guess what I mean is: if I make a site that uses AJAX to load some content that I also want search engines to find, and I make the page work without JavaScript (say, when JavaScript isn't available, a link goes to site.com?c=somecontent rather than calling $("#content").load("somecontent.html")), will the search engine follow the non-JavaScript link and be able to index the site well?
I suppose this would only work if visitors with JavaScript enabled who follow a search-engine link to the ?c=somecontent URL can still use the site normally, right?
Is this a really challenging endeavor or can it be done relatively easily if the site is structured properly?
In addition to the previous answers: you can check whether a request came in via AJAX and serve the results accordingly (e.g. JSON data for an AJAX request, a full HTML page otherwise). In PHP the check looks something like this.
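A minimal sketch, relying on the X-Requested-With header that jQuery and most JS libraries send with their AJAX requests; get_content() and render_full_page() are placeholders for your own code:

    <?php
    // jQuery and most JS libraries add this header to AJAX requests.
    $isAjax = isset($_SERVER['HTTP_X_REQUESTED_WITH'])
        && strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) === 'xmlhttprequest';

    if ($isAjax) {
        // AJAX caller: return just the data it needs.
        header('Content-Type: application/json');
        echo json_encode(array('content' => get_content($_GET['c'])));
    } else {
        // Bot or non-JS visitor: render the complete HTML page.
        render_full_page($_GET['c']);
    }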
There will surely be an equivalent for whatever server-side technology you're using (Ruby/Python/ASP).
If your link looks something like this:
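For example, something along these lines, reusing the URLs from the question (the onclick handler is only illustrative):

    <a href="?c=somecontent"
       onclick="$('#content').load('somecontent.html'); return false;">
        Some content
    </a>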
It will gracefully degrade, and the search engines will be happy.
One issue you do have is that when someone with JavaScript enabled navigates your site, the location in the address bar won't change. This can be overcome by having your onclick function set window.location.hash, updating the hash part of the URL based on the page the visitor is on. Then, on loading each full page, a script should check whether the hash matches the current page and, if not, switch to the page indicated by the hash using the AJAX call.

The best way to do this is to build the site without any AJAX and apply SEO practices. Then, once the non-AJAX version works, add a class such as "dynamic-link" to the links you want to behave as AJAX links and handle the click events on those links. That way, when bots crawl the site, they still have something to fall back on.
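A minimal jQuery sketch combining the two ideas above; the dynamic-link class, the data-page attributes and the loadPage() helper are assumptions made for illustration:

    // Load a content fragment via AJAX and record it in the URL hash.
    function loadPage(name) {
        $("#content").load(name + ".html");
        window.location.hash = name;
    }

    $(function () {
        // Only links explicitly marked as dynamic are hijacked; bots and
        // non-JS visitors just follow the plain href.
        $("a.dynamic-link").on("click", function (e) {
            e.preventDefault();
            loadPage($(this).data("page"));
        });

        // On a full page load, if the hash asks for a different page,
        // switch to it with the same AJAX call.
        var requested = window.location.hash.replace("#", "");
        if (requested && requested !== $("body").data("page")) {
            loadPage(requested);
        }
    });

Here each link would carry something like href="?c=news" data-page="news", and the body tag a data-page attribute naming the page that was actually rendered.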
Just to add on: if you've noticed the way Facebook works, the AJAX kicks in when JavaScript is enabled, and when JavaScript is disabled (e.g. for robots) the links still work.
Do something like this in your links (e.g. paging):
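Something along these lines (loadPage() is a hypothetical click handler, like the one sketched further up):

    <a href="?page=2" onclick="loadPage(2); return false;">2</a>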
So the key is to design the site so that AJAX is used in modern browsers, yet the pages still work with JavaScript disabled.
Take a look at the Single Page Interface Manifesto; the hard part is serving "HTML snapshots" to the crawler bot.
Loading a bare fragment like somecontent.html directly can be problematic, as you'll end up with that little fragment of a page getting into the index on its own.
If you can get away with it, it's a lot easier to "fake" ajax with plain old DHTML.
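A rough sketch of that approach (the section ids and the showSection() helper are invented for illustration): put all of the content into the page itself and just show or hide it.

    <div id="section-about"> ... all the "about" content, right in the page ... </div>
    <div id="section-news"> ... all the "news" content ... </div>

    <script>
    // Hide every section except the requested one. Without JavaScript,
    // nothing runs and all of the content simply stays visible.
    function showSection(name) {
        $("div[id^='section-']").hide();
        $("#section-" + name).show();
    }
    </script>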
Now all of your content is visible to search engines, all at the same (and correct) URI, and for users with JavaScript enabled it acts like the AJAX solution, only faster once past the initial page load, which is a bit slower.
If you've got a ton of content you'll have to think of something smarter, but as long as all your content weighs relatively little (and don't forget to make sure your server gzip-encodes it), this setup is preferable. Otherwise, with hundreds of kilobytes (or more) of text content, you'll have to get craftier; your initial thought is pretty much right on.
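Something along these lines, mirroring the question's own snippet (/some/page.html is just a placeholder path):

    // Pull a full content fragment into the page container on demand.
    $("#content").load("/some/page.html");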
That will generally do what you think it will. The problem, of course, is that /some/page.html will end up in search indices on its own. You could try doing tricky server-side things (like checking the Referer header and redirecting back to the "main" page), but I'm not convinced that isn't a Pandora's box.
Hopefully someone will come along with another answer that addresses this. If I think of something, I'll edit this one.