Some HTTP methods, such as POST, require a body to be sent after the headers and the double CRLF.
Others, such as GET, do not have a body, and for them the double CRLF marks the end of the request.
But what about the others: PUT, DELETE, ...? How do I know which ones require a body?
How should a generic HTTP client react to an unknown HTTP method? Reject it? Require a body by default, or not require a body by default?
A pointer to the relevant spec would be appreciated.
Edit: I'll explain my question in more detail, as requested in the comments.
I'm designing a generic HTTP client that a programmer can use to send arbitrary HTTP requests to any server.
The client could be used like this (pseudo-code):
HttpClient.request(method, url [, data]);
The data is optional, and can be raw data (string), or an associative array of key/value pairs.
The library would url-encode the data if it's an array, then either append the data to the URL for a GET request, or send it in the message body for a POST request.
I'm therefore trying to determine whether this HttpClient must/should/must not/should not include a message-body in the request, given the HTTP method chosen by the developer.
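To make the intended behaviour concrete, here is a minimal Python sketch of the dispatch described above. The name `prepare_request` and the rule "GET puts the data in the query string, every other method puts it in the body" are assumptions made for illustration only; they come from neither the HTTP spec nor an existing library.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def prepare_request(method, url, data=None):
    """Return (url, body) for a request; hypothetical helper, not a real API."""
    if data is None:
        return url, None
    # Associative arrays are url-encoded; raw strings are used as-is.
    encoded = urlencode(data) if isinstance(data, dict) else data
    if method.upper() == "GET":
        # Append to the query string, preserving any existing query.
        parts = urlsplit(url)
        query = f"{parts.query}&{encoded}" if parts.query else encoded
        return urlunsplit(parts._replace(query=query)), None
    # Assumption: every other method carries the data in the message body.
    return url, encoded.encode("utf-8")
```

The open question is precisely what the last branch should do for methods other than POST.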
EDIT: compiled list:
- an entity-body is only present when a message-body is present (section 7.2)
- the presence of a message-body is signaled by the inclusion of a Content-Length or Transfer-Encoding header (section 4.3)
- a message-body must not be included when the specification of the request method does not allow sending an entity-body (section 4.3)
- an entity-body is explicitly forbidden in TRACE requests only; all other request types are unrestricted (section 9, and 9.8 specifically)
For responses, this has been defined:
- whether a message-body is included depends on both request method and response status (section 4.3)
- a message-body is explicitly forbidden in responses to HEAD requests (section 9, and 9.4 specifically)
- a message-body is explicitly forbidden in 1xx (informational), 204 (no content), and 304 (not modified) responses (section 4.3)
- all other responses include a message-body, though it may be of zero length (section 4.3)
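The response rules in the list above can be sketched as a small Python predicate. The function name `response_has_body` is hypothetical, and the sketch covers only the cases listed here, not any rules added by later specifications.

```python
def response_has_body(request_method, status_code):
    """Apply the response rules listed above (hypothetical helper)."""
    # Responses to HEAD never carry a message-body.
    if request_method.upper() == "HEAD":
        return False
    # 1xx, 204 and 304 responses never carry a message-body.
    if 100 <= status_code < 200 or status_code in (204, 304):
        return False
    # All other responses include a message-body, possibly of zero length.
    return True
```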
This (RFC 7231) or this version (from the IETF, and more in-depth) is what you want. According to the RFC:
For PUT:
The PUT method requests that the enclosed entity be stored under the
supplied Request-URI. If the Request-URI refers to an already existing
resource, the enclosed entity SHOULD be considered as a modified
version of the one residing on the origin server. If the Request-URI
does not point to an existing resource, and that URI is capable of
being defined as a new resource by the requesting user agent, the
origin server can create the resource with that URI. If a new resource
is created, the origin server MUST inform the user agent via the 201
(Created) response. If an existing resource is modified, either the
200 (OK) or 204 (No Content) response codes SHOULD be sent to indicate
successful completion of the request. If the resource could not be
created or modified with the Request-URI, an appropriate error
response SHOULD be given that reflects the nature of the problem. The
recipient of the entity MUST NOT ignore any Content-* (e.g.
Content-Range) headers that it does not understand or implement and
MUST return a 501 (Not Implemented) response in such cases.
And for DELETE:
The DELETE method requests that the origin server delete the resource
identified by the Request-URI. This method MAY be overridden by human
intervention (or other means) on the origin server. The client cannot
be guaranteed that the operation has been carried out, even if the
status code returned from the origin server indicates that the action
has been completed successfully. However, the server SHOULD NOT
indicate success unless, at the time the response is given, it intends
to delete the resource or move it to an inaccessible location.
A successful response SHOULD be 200 (OK) if the response includes an
entity describing the status, 202 (Accepted) if the action has not yet
been enacted, or 204 (No Content) if the action has been enacted but
the response does not include an entity.
If the request passes through a cache and the Request-URI identifies
one or more currently cached entities, those entries SHOULD be treated
as stale. Responses to this method are not cacheable.
From your comments I gather you're writing an HTTP client library (why, aren't there enough?) and you want to allow for a generic request(method, url[, data]) method. You want to know for which methods the data is either required or forbidden.
Just assume the user of your library knows what they're doing. If I want to send a body with a GET request I can, because the spec doesn't forbid that. So why should your library?
Furthermore, the HTTP spec is open on this point; an extension to HTTP (like WebDAV) can specify new methods (verbs) that do or don't allow, or even require, a message body.
I think your effort would be better spent on more important parts.
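As a rough illustration of that approach, the Python sketch below serializes a request and includes a body exactly when the caller passes one, whatever the method. The function name `serialize_request` and its parameters are made up for this example; a real client would also handle additional headers, chunked encoding and so on.

```python
def serialize_request(method, host, path, body=None):
    """Build raw HTTP/1.1 request bytes; hypothetical helper for illustration."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body is not None:
        # The body's presence is signaled by Content-Length, whatever the method.
        lines.append(f"Content-Length: {len(body)}")
    # The double CRLF ends the headers; the body (if any) follows.
    head = "\r\n".join(lines).encode("ascii") + b"\r\n\r\n"
    return head + (body or b"")
```

With this design an extension method such as PROPFIND needs no special case: the caller decides whether to pass `body`.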
I'm going to answer this:
- From the perspective of request-bodies, not response-bodies, as that is what is asked, and of most interest.
- In terms of when the body is required, and when it is forbidden.
No request is required to include a body, although the absence of a body might be interpreted as an empty body or one of zero length.
RFC2616 4.3 states:
4.3 Message Body
The rules for when a message-body is allowed in a message differ for
requests and responses.
...
A message-body MUST NOT be included in a request if the specification of the request method (section 5.1.1) does not allow sending an entity-body in requests.
Going through the methods in 5.1.1 (excluding any extension-methods) you will find:
9.8 TRACE
...
A TRACE request MUST NOT include an entity.
So technically any of the other request methods:
OPTIONS
GET
HEAD
POST
PUT
DELETE
CONNECT
... could include a body. Back to 4.3:
if the request method
does not include defined semantics for an entity-body, then the
message-body SHOULD be ignored when handling the request.
So, in response to an unexpected entity-body for a particular method or resource, it is safe to ignore it and respond, including the response code, as if the body had not been sent.
Reference: RFC2616 Hypertext Transfer Protocol -- HTTP/1.1
Edit: RFC2616 is well and truly obsolete, refer to RFC7230 for the latest specification.
For arbitrary methods, or valid methods that you don't want to support on the server side, HTTP status code 405 should be sent back to the caller.
As per http://en.wikipedia.org/wiki/List_of_HTTP_status_codes:
405 Method Not Allowed: A request was made of a resource using a
request method not supported by that resource; for example, using
GET on a form which requires data to be presented via POST, or using
PUT on a read-only resource.
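A minimal server-side sketch of that check, in Python. The helper name `check_method` and the choice of 200 for the success case are assumptions for illustration. The Allow header is included because a 405 response is expected to list the methods the resource does support.

```python
def check_method(method, allowed_methods):
    """Return (status, headers) for a request method; hypothetical helper."""
    if method in allowed_methods:
        return 200, {}
    # A 405 response carries an Allow header listing the supported methods.
    return 405, {"Allow": ", ".join(sorted(allowed_methods))}
```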
You may want to read the current HTTP spec draft's section about the message body length: http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-22.html#message.body.length