What is the HTTP Host header?

Posted 2019-01-16 08:34

Question:

Given that the TCP connection is already established when the HTTP request is sent, the IP address and port are implicitly known -- a TCP connection is an IP + Port. So, why do we need the Host header? Is this only needed for the case where there are multiple hosts mapped to the IP address implied in the TCP connection?

Answer 1:

The Host header tells the web server which virtual host to use (if virtual hosts are set up). A single virtual host can also answer to several aliases (domains and wildcard domains). In that case, your web app can still read the header itself if it should behave differently depending on the domain addressed. Host-based dispatch always has a fallback: in your web server you can (and, if I'm not mistaken, must) set up one vhost as the default host. This default vhost is used whenever the Host header does not match any of the configured virtual hosts.
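As a rough illustration of that lookup, here is a minimal sketch in Python. It is not how any particular web server implements it; the `VHOSTS` table, the site names, the example domains and the `select_vhost` helper are all made up for this example.

```python
from fnmatch import fnmatchcase

# Hypothetical vhost table: site name -> host name patterns (aliases, wildcards).
VHOSTS = {
    "shop": ["shop.example.com", "www.shop.example.com"],
    "blog": ["blog.example.com", "*.blog.example.com"],
}
DEFAULT_VHOST = "shop"  # assumed default, used when nothing else matches


def select_vhost(host_header):
    """Return the name of the vhost that should handle the request."""
    if not host_header:
        return DEFAULT_VHOST
    # Host names are case-insensitive, so compare in lower case.
    hostname = host_header.strip().lower()
    # Drop an optional ":port" suffix (IPv6 literals are left alone in this sketch).
    if not hostname.startswith("["):
        hostname = hostname.split(":", 1)[0]
    for name, patterns in VHOSTS.items():
        if any(fnmatchcase(hostname, pattern) for pattern in patterns):
            return name
    # No configured vhost matched: fall back to the default.
    return DEFAULT_VHOST
```

With this table, `select_vhost("blog.example.com")` picks the hypothetical "blog" site, while an unknown name such as `unknown.example.org` falls through to the default.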

That means you've got it right, although saying "multiple hosts" may be somewhat misleading: the host (the addressed machine) is the same. What actually resolves to that IP address are different domain names (including subdomains), which are also referred to as hostnames (but not hosts!).

Although not part of the question, a fun fact: this design led to problems with SSL in the early days, because the web server has to deliver the certificate that matches the domain the client addressed. To pick the right certificate, the server would have to know the addressed hostname in advance. But because the client sent that information only over the encrypted channel (that is, after the certificate had already been sent), the server had to assume you were browsing the default host. The result: one SSL-secured domain per IP address/port combination.

This has been overcome with Server Name Indication (SNI). However, SNI gives up some privacy again: the server name is now transferred in plain text, so any man-in-the-middle can see which hostname you are trying to connect to.

Although the webserver would know the hostname from Server Name Indication, the host-header is not obsolete, because the Server Name Indication information is only used within the TLS handshake. With an unsecured connection, there is no Server Name Indication at all, so the host-header is still valid (and necessary).
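To make the two places where the hostname appears concrete, here is a small sketch using Python's standard `socket` and `ssl` modules. The helper names `build_request` and `fetch_status_line` are invented for this example, and the default port 443 and the timeout are assumptions.

```python
import socket
import ssl


def build_request(hostname, path="/"):
    """Build a minimal HTTP/1.1 GET request that carries the Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_status_line(hostname, path="/", port=443, timeout=5.0):
    """Open a TLS connection and return the first line of the response."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw:
        # server_hostname is what the client sends as SNI during the handshake.
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            # The Host header travels separately, inside the encrypted stream.
            tls.sendall(build_request(hostname, path))
            return tls.makefile("rb").readline().decode("iso-8859-1").strip()
```

The same name is passed twice on purpose: once to the TLS layer for certificate selection, and once in the HTTP request for vhost selection.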

Another fun fact: most web servers (if not all of them) reject an HTTP/1.1 request that does not contain exactly one Host header, even where it could in principle be omitted because only the default vhost is configured. That means the minimum required information in an HTTP/1.1 (GET) request is the first line, containing METHOD, RESOURCE and PROTOCOL VERSION, plus the Host header, like this:

GET /someresource.html HTTP/1.1
Host: www.example.com

You may want to read the MDN documentation on the Host header for more information, which says:

A Host header field must be sent in all HTTP/1.1 request messages. A 400 (Bad Request) status code will be sent to any HTTP/1.1 request message that lacks a Host header field or contains more than one.
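The rule quoted above can be sketched as a small check. This is only an illustration of the rule as stated, not code from any real server; `check_host_header` is a hypothetical helper, and returning 200 stands in for "continue processing".

```python
def check_host_header(raw_request: bytes) -> int:
    """Return 400 if an HTTP/1.1 request lacks a Host header or has several, else 200."""
    # Only the head of the message (request line + headers) matters here.
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    request_line, header_lines = lines[0], lines[1:]
    # The protocol version is the last token of the request line.
    version = request_line.rsplit(" ", 1)[-1]
    # Header names are case-insensitive, so "host", "Host" and "HOST" all count.
    host_count = sum(
        1 for line in header_lines
        if line.split(":", 1)[0].lower() == "host"
    )
    if version == "HTTP/1.1" and host_count != 1:
        return 400
    return 200
```

Applied to the two-line request shown earlier, the check passes; dropping the Host line or repeating it yields 400.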

As mentioned by Darrel Miller, the complete specification can be found in RFC 7230.



Answer 2:

I would always recommend going to the authoritative source when trying to understand the meaning and purpose of HTTP headers.

The "Host" header field in a request provides the host and port
information from the target URI, enabling the origin server to
distinguish among resources while servicing requests for multiple
host names on a single IP address.

https://tools.ietf.org/html/rfc7230#section-5.4
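Since the header carries "the host and port information", a server-side sketch of pulling the two apart might look like this. The helper name `split_host_header` is hypothetical, and the default port of 80 assumes a plain HTTP connection.

```python
from urllib.parse import urlsplit


def split_host_header(value, default_port=80):
    """Split a Host header value into (hostname, port)."""
    # Prefixing "//" makes urlsplit treat the value as a network location,
    # which also handles bracketed IPv6 literals such as "[::1]:8443".
    parts = urlsplit("//" + value.strip())
    port = parts.port if parts.port is not None else default_port
    # urlsplit lower-cases the hostname, matching case-insensitive comparison.
    return parts.hostname, port
```

For example, `split_host_header("www.example.com:8080")` yields the host name and the explicit port, while a value without a port falls back to `default_port`.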



Answer 3:

In HTTP 1.1, a host header is a third piece of information that you can use, in addition to the IP address and port number, to uniquely identify a Web domain or, as Microsoft calls it, an application server. For example, the host header name for the URL www.ideva.com is www.ideva.com. Browsers that support HTML 3.0 or later support HTTP 1.1. The browser takes the host name you entered in its location field and includes it as the host header in the request it sends to the server. If the request does not specify a host header name, the root Web domain acts as the default Web server.