Is there a way to make robots ignore certain text?

2019-01-17 14:52发布

I have my blog (you can see it if you want, from my profile), and it's fresh, as well as google robots parsing results are.

The results were alarming to me. Apparently the most common 2 words on my site are "rss" and "feed", because I use text for links like "Comments RSS", "Post Feed", etc. These 2 words will be present in every post, while other words will be more rare.

Is there a way to make these links disappear from Google's parsing? I don't want technical links getting indexed. I only want content, titles, descriptions to get indexed. I am looking for something other than replacing this text with images.

I found some old discussions on Google, back from 2007 (I think in 3 years many things could have changed, hopefully this too)

This question is not about robots.txt and how to make Google ignore pages. It is about making it ignore small parts of the page, or transforming the parts in such a way that it will be seen by humans and invisible to robots.

8条回答
forever°为你锁心
2楼-- · 2019-01-17 15:53

Firstly think about the issue. If Google think "RSS" is the main keyword that may suggest the rest of your content is a bit shallow and needs expanding. Perhaps this should be the focus of your attention.If the rest of your content is rich I wouldn't worry about the issue as a search engine should know what the page is about from title and headings. Just make sure RSS etc is not in a heading or bold or strong tag.

Secondly as you rightly mention, you probably don't want use images as they are not assessable to screen readers without alt text and if they have alt text or supporting text then you add the keyword back in. However aria live may help you get around this issue, but I'm not an expert on accessibility.

Options:

  • Use JavaScript to write that bit of content (maybe ajax it in after load). Search engines like Google can execute JavaScript but I would guess it wont value any JS written content very highly.
  • Re-word the content or remove duplicates of it, one prominent RSS feed link may be better than several smaller ones dotted around the page.
  • Use the css content attribute with pseudo :before or :after to add your content. I'm not sure if bots will index words in content attributes in CSS and know that contents value in relation to each page but it seems unlikely. Putting words like RSS in the CSS basically says it's a style thing not an HTML thing, therefore even if engines to index it they wont add much/any value to it. For example, the HTML and CSS could be:

    <a href="/my-feed.rss" class="add-text"></a>
    
    .add-text:after { content:'View my RSS feed'; }
    

Note the above will not work in older versions of IE, so you may need some IE version comments if you care about that.

查看更多
Melony?
3楼-- · 2019-01-17 15:56

Other than black-hat server-side methods, there is nothing you can do. You may want to look at why you have those words so often and remove some of them from the site.

It used to be that you could use JS to "hide" things from googlebot, but you can't now that it parses JS. ( http://www.webmasterworld.com/google/4159807.htm )

查看更多
登录 后发表回答