Do search engines respect the HTTP header field “Content-Location”?

Posted 2020-06-01 10:15

I was wondering whether search engines respect the HTTP header field Content-Location.

This could be useful, for example, when you want to remove the session ID argument from the URL:

GET /foo/bar?sid=0123456789 HTTP/1.1
Host: example.com
…

HTTP/1.1 200 OK
Content-Location: http://example.com/foo/bar
…

Clarification:
I don’t want to redirect the request, as removing the session ID would lead to a completely different request and thus probably also a different response. I just want to state that the enclosed response is also available under its “main URL”.

Maybe my example was not a good representation of the intent of my question. So please take a look at "What is the purpose of the HTTP header field “Content-Location”?".

5 answers
地球回转人心会变
#2 · 2020-06-01 10:54

Try the "Location:" header instead.

The star\"
#3 · 2020-06-01 10:55

In 2009 Google started looking at URIs qualified as rel=canonical in the response body.

It looks like, since 2011, links formatted as per RFC 5988 are also parsed from the Link header field. This is also clearly mentioned in the Webmaster Tools FAQ as a valid option.

I guess this is the most up-to-date way of giving search engines some extra hypermedia breadcrumbs to follow, and it lets you keep them out of the response body when you don't actually need to serve them as content.
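
To make the Link header option concrete, here is a minimal Python sketch that derives a canonical URL by dropping the session parameter and formats it as an RFC 5988 style header value. The helper names `canonical_url` and `canonical_link_header` are made up for illustration, and treating `sid` as the only parameter to strip is an assumption taken from the question's example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, session_param="sid"):
    """Return url without the session parameter (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep every query pair except the session ID; "sid" is an assumption.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != session_param
    ]
    # Drop the fragment as well, since it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_header(url, session_param="sid"):
    """Build the value of a Link header field carrying rel="canonical"."""
    return '<{}>; rel="canonical"'.format(canonical_url(url, session_param))


print("Link: " + canonical_link_header("http://example.com/foo/bar?sid=0123456789"))
```

The resulting value would be sent alongside the normal 200 response, so no redirect is involved.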

We Are One
#4 · 2020-06-01 11:02

Most decent crawlers do follow Content-Location. So, yes, search engines respect the Content-Location header, although that is no guarantee that the URL with the sid parameter will stay off the results page.

Juvenile、少年°
#5 · 2020-06-01 11:07

I think Google just announced the answer to my question: the canonical link relation for declaring the canonical URL.

Maile Ohye from Google wrote:

MickeyC said...
You should have used the Content-Location header instead, as per:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
"14.14 Content-Location"

@MikeyC: Yes, from a theoretical standpoint that makes sense and we certainly considered it. A few points, however, led us to choose <link rel="canonical">:

  1. Our data showed that the "Content-Location" header is configured improperly on many web sites. Sometimes webmasters provide long, ugly URLs that aren’t even duplicates -- it's probably unintentional. They're likely unaware that their webserver is even sending the Content-Location header.

    It would've been extremely time consuming to contact site owners to clean up the Content-Location issues throughout the web. We realized that if we started with a clean slate, we could provide the functionality more quickly. With Microsoft and Yahoo! on-board to support this format, webmasters need to only learn one syntax.

  2. Often webmasters have difficulty configuring their web server headers, but can more easily change their HTML. rel="canonical" seemed like a friendly attribute.

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html?showComment=1234714860000#c8376597054104610625
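
For the HTML variant discussed in that comment, a small Python sketch of how a page template might emit the element could look like this. The function name `canonical_link_tag` is hypothetical, and the URL is the one from the question's example.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Render a <link rel="canonical"> element for the document head (hypothetical helper)."""
    # Escape the URL so characters such as & and " stay valid inside the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url, quote=True))


print(canonical_link_tag("http://example.com/foo/bar"))
```

The element belongs in the head of the page served under the session URL and points at the main URL.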

Deceive 欺骗
#6 · 2020-06-01 11:20

In addition to using 'Location' rather than 'Content-Location', use the proper HTTP status code in your response, depending on your reason for the redirect. Search engines tend to favor a permanent redirect (301) over a temporary one (302).
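
As a rough illustration of the 301 versus 302 choice, the following Python sketch picks the status line from a flag and pairs it with a Location header. The function `redirect_response` is a made-up helper, not part of any framework, and note that the asker explicitly did not want to redirect, so this only applies where a redirect is acceptable.

```python
from http import HTTPStatus


def redirect_response(target_url, permanent):
    """Return (status_line, headers) for a redirect (hypothetical helper)."""
    # 301 signals that the move is permanent; 302 marks it as temporary.
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    status_line = "HTTP/1.1 {} {}".format(status.value, status.phrase)
    return status_line, {"Location": target_url}


status_line, headers = redirect_response("http://example.com/foo/bar", permanent=True)
print(status_line)
print("Location: " + headers["Location"])
```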
