Consider a web server from which three virtual hosts are served:
- mysite.com
- myothersite.com
- imnotcreative.com
Now assume that the server receives the following raw request message (code formatting removes the terminating \r\n sequences):
GET / HTTP/1.1
Host: nothostedhere.com
I haven't seen any guidance in RFC 2616 (perhaps I missed it?) on how to respond to a request for a host name that does not exist on the current server. Apache, for example, will simply use the first virtual host defined in its configuration as the "primary host" and pretend the client requested that host. Obviously this is more robust than returning a 400 Bad Request response and guarantees the client always sees some representation.
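As a rough illustration of that fallback behaviour (not Apache's actual code), a minimal Python sketch might look like the following. The names `VHOSTS` and `select_vhost` are invented for this example, and the host list is just the three sites above in configuration order.

```python
# Hypothetical list of configured virtual hosts, in configuration order.
VHOSTS = ["mysite.com", "myothersite.com", "imnotcreative.com"]


def select_vhost(host_header, vhosts=VHOSTS):
    """Return the vhost matching the Host header, else the first one defined."""
    # Host names are case-insensitive; drop any ":port" suffix before comparing.
    # (IPv6 literals are not handled in this sketch.)
    name = host_header.strip().lower().split(":", 1)[0]
    if name in vhosts:
        return name
    # Unknown host: fall back to the first ("primary") virtual host.
    return vhosts[0]


print(select_vhost("nothostedhere.com"))   # mysite.com
print(select_vhost("MyOtherSite.com:80"))  # myothersite.com
```

With this approach the client never sees an error for an unknown host, which is exactly the "robustness" side of the trade-off.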
So my question is ...
Can anyone provide reasons aside from the "robustness vs. correctness" argument to dissuade me from responding with a 400 (or other error code) should the client request a non-existent host when employing the HTTP/1.1 protocol?
Note that all HTTP/1.1 requests MUST specify a Host: header as per RFC 2616. For HTTP/1.0 requests the only real option is to serve the "primary" host result. This question specifically addresses HTTP/1.1 protocol requests.
400 is not really the semantically correct response code in this scenario.
10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax.
This is not what has happened. The request is syntactically valid, and by the time your server has reached the routing phase (when you are inspecting the value of the header) this will already have been determined.
I would say the correct response code here is 403:
10.4.4 403 Forbidden
The server understood the request, but is refusing to fulfill it.
This describes what has happened more accurately. The server is refusing to fulfill the request because it is unable to, and a more verbose error message can be provided in the message entity.
There is also an argument that 404 would be acceptable/correct, since a suitable document with which to satisfy the request could not be found, but personally I think that this is not the correct option, because 404 states:
10.4.5 404 Not Found
The server has not found anything matching the Request-URI
This explicitly mentions a problem with the Request-URI, and at this early stage of the routing phase you are probably not interested in the URI, since the server first needs to allocate the request to a host before it can determine whether it has a suitable document to handle the URI path.
In HTTP/1.1, Host: headers are mandatory. If a client states that it is using version 1.1 and does not supply a Host: header then 400 is definitely the correct response code. If the client states that it is using version 1.0 then it is not required to supply a Host: header and this should be handled gracefully - and this scenario amounts to the same situation as an unrecognised domain.
Really you have two options in this event: route the request to a default virtual host container, or respond with an error. As outlined above, if you are going to respond with an error, I believe the error should be 403.
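To make that decision flow concrete, here is a minimal Python sketch of the routing step as I have described it. It is only an illustration under stated assumptions: `KNOWN_HOSTS`, `route`, and `default_host` are hypothetical names, and the request is assumed to have already passed syntax checks.

```python
from typing import Optional, Tuple

# Hypothetical set of hosts served by this machine.
KNOWN_HOSTS = {"mysite.com", "myothersite.com", "imnotcreative.com"}


def route(http_version: str, host_header: Optional[str],
          default_host: Optional[str] = None) -> Tuple[Optional[int], Optional[str]]:
    """Return (error_status, vhost); error_status is None when routing succeeds."""
    if host_header is None:
        if http_version == "HTTP/1.1":
            # HTTP/1.1 requires a Host header, so its absence is a client error.
            return 400, None
        # HTTP/1.0 may omit Host; treat it like an unrecognised domain.
        name = None
    else:
        # Case-insensitive comparison, ignoring any ":port" suffix.
        name = host_header.strip().lower().split(":", 1)[0]

    if name in KNOWN_HOSTS:
        return None, name
    if default_host is not None:
        # Option 1: hand the request to a default virtual host container.
        return None, default_host
    # Option 2: refuse the request with 403.
    return 403, None
```

For the request in the question, `route("HTTP/1.1", "nothostedhere.com")` gives `(403, None)`, while passing `default_host="mysite.com"` gives `(None, "mysite.com")` instead.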
I'd say this largely depends on what type(s) of clients you expect to consume your service and the type of service you offer.
For a general website:
It's pretty safe to assume that requests are triggered from a user's browser, in which case I'd be more forgiving of a missing or incorrect Host: header. I'd even go so far as to say that the way Apache handles the case (i.e. falling back to the first appropriate VHost) is perfectly fine. After all, you don't want to scare your customers away.
For an API/RPC type of service:
That's a totally different case. You SHOULD expect whatever/whoever consumes your service to adhere to your specifications. So, if these require a consumer to pass a valid Host: header and the consumer fails to do so, you SHOULD return a reasonable error response (400 Bad Request seems fine to me).