I have a blog (you can find it from my profile), and both it and Google's crawl results for it are fresh.
The results alarmed me. Apparently the two most common words on my site are "rss" and "feed", because I use link text like "Comments RSS" and "Post Feed". These two words appear in every post, while other words are much rarer.
Is there a way to keep these links out of Google's parsing? I don't want technical links getting indexed; I only want content, titles, and descriptions to be indexed. I am looking for something other than replacing the text with images.
I found some old discussions on Google, from back in 2007 (I think many things could have changed in three years, hopefully this too).
This question is not about robots.txt and how to make Google ignore whole pages. It is about making it ignore small parts of a page, or transforming those parts so that they are visible to humans but invisible to robots.
The only control you have over indexing robots is the robots.txt file. See this documentation, which Google links from its page explaining the file's usage.
You can basically prohibit certain links and URLs, but not specific keywords.
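For instance, assuming a typical WordPress-style blog where feeds live under /feed/ paths (the paths below are an assumption about your permalink structure), you could block the feed URLs themselves, though this does not remove the "RSS"/"Feed" anchor text from your pages:

```
User-agent: *
Disallow: /feed/
Disallow: /comments/feed/
Disallow: /*/feed/
```

Googlebot supports the `*` wildcard in Disallow paths, so the last rule would also cover per-post feeds.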
You would have to manually detect Googlebot from the request's user agent and serve it slightly different content than you serve your normal users.
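A minimal sketch of that approach, assuming a Python/Flask stack (the template names are hypothetical, and as a later answer warns, this kind of cloaking can get you penalized):

```python
from flask import Flask, request, render_template

app = Flask(__name__)

@app.route("/post/<slug>")
def post(slug):
    ua = request.headers.get("User-Agent", "")
    if "Googlebot" in ua:
        # Hypothetical template with the "Comments RSS" / "Post Feed"
        # links stripped out. Serving bots different content than
        # humans is cloaking, which Google penalizes if detected.
        return render_template("post_bare.html", slug=slug)
    return render_template("post.html", slug=slug)
```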
There is a simple way to tell Google not to index parts of your documents: using googleon and googleoff comments.
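A minimal sketch of the markup, following the documented comment syntax (the paragraph text is just a placeholder):

```
<p>This is normal (X)HTML content that will be indexed by Google.</p>

<!--googleoff: index-->
<p>This (X)HTML content will NOT be indexed by Google.</p>
<!--googleon: index-->
```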
In this example, the second paragraph will not be indexed by Google. Notice the "index" parameter, which may be set to any of the following:

- index: content surrounded by "googleoff: index" will not be indexed by Google
- anchor: anchor text for any links within a "googleoff: anchor" area will not be associated with the target page
- snippet: content surrounded by "googleoff: snippet" will not be used to create snippets for search results
- all: content surrounded by "googleoff: all" is treated with all of the above attributes

(source)
No, there really isn't anything like that. There are various server-side techniques, but if Google catches you serving its bot different text than you show your visitors, it will penalize you.
I work on a site with top-3 Google rankings for thousands of school names in the US, and we do a lot of work to protect our SEO. There are a few tricks you could try (all of which are probably a waste of time; keep reading).
That said, crawlers are smart, and yours is not the only site full of permalink and RSS links. They care about context, and they look for terms and phrases in your headings and body text. They know how to determine that your blog is about technology and not RSS. I highly doubt those links have any negative effect on your SEO. What problem are you actually trying to solve?
If you want to build SEO, figure out what value you provide to readers and write about that. Say interesting things that will lead others to link to your blog, and crawlers will understand that you're an information source that people value. Think more about what your readers see and understand, and less about what you think a crawler sees.
Google's crawlers are smart, but the people who program them are smarter. Humans can always tell what is meaningful on a page, and they spend time on blogs with genuinely good and unique content. It all comes down to common sense: how people find your blog and how much time they spend there. Google measures search results in much the same way. Your page ranking also increases as daily visits grow and as the site's content improves and is updated regularly. This page has the word "Answer" repeated multiple times; that doesn't mean it won't get indexed. What matters is how useful it is to everyone. I hope this gives you some ideas.