Why does an HTTP PUT request have to contain a representation of a 'whole' state and can't just be a partial?
I understand that this is the existing definition of PUT - this question is about the reason(s) why it would be defined that way.
That is:
What is gained by preventing partial PUTs?
Why was preventing idempotent partial updates considered an acceptable loss?
PUT means what the HTTP spec defines it to mean. Clients and servers cannot change that meaning. If clients or servers use PUT in a way that contradicts its definition, at least the following thing might happen:
PUT is by definition idempotent. That means a client (or an intermediary!) can repeat a PUT any number of times and be sure that the effect will be the same. Suppose an intermediary receives a PUT request from a client, and a network problem occurs when it forwards the request to the server. The intermediary knows, by definition, that it can retry the PUT until it succeeds. If the server uses PUT in a non-idempotent way, these potentially repeated calls will have an undesired effect.
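To make the retry scenario concrete, here is a minimal sketch in Python. Everything in it (`FlakyLink`, `send_with_retry`, the two handlers) is a hypothetical stand-in for an intermediary and a server, not real HTTP machinery. It simulates a response that is lost after the server has already applied the request.

```python
import copy


class FlakyLink:
    """Hypothetical link: the server applies the request, but the first response is lost."""

    def __init__(self, handler, lost_responses=1):
        self.handler = handler
        self.lost = lost_responses

    def send(self, body):
        result = self.handler(body)  # the server has already processed the request
        if self.lost > 0:
            self.lost -= 1
            raise TimeoutError("response lost")
        return result


def send_with_retry(link, body, attempts=3):
    """Retry blindly, as an intermediary may do for an idempotent method."""
    for _ in range(attempts):
        try:
            return link.send(body)
        except TimeoutError:
            continue
    raise TimeoutError("gave up")


def put_replace(store):
    """Conforming PUT: the body replaces the stored document."""
    def handler(body):
        store["doc"] = copy.deepcopy(body)
        return 200
    return handler


def put_append(store):
    """Misused 'PUT': the body is appended to the stored document."""
    def handler(body):
        store["doc"]["items"].extend(body["items"])
        return 200
    return handler


good = {"doc": {"items": ["a"]}}
send_with_retry(FlakyLink(put_replace(good)), {"items": ["a", "b"]})

bad = {"doc": {"items": ["a"]}}
send_with_retry(FlakyLink(put_append(bad)), {"items": ["b"]})

print(good["doc"])  # {'items': ['a', 'b']}
print(bad["doc"])   # {'items': ['a', 'b', 'b']}
```

The replacing handler ends in the same state however often the request arrives. The appending handler applies the change twice, which is the undesired effect described above.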
If you want to do a partial update, use PATCH or use POST on a sub-resource and return 303 See Other to the 'main' resource, e.g.
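A minimal sketch of the POST-on-a-sub-resource approach, written as a plain Python function rather than a real web framework. The URL layout (`/account/{id}/owner/address`), the `accounts` data, and the function name are all assumptions made for illustration.

```python
# Hypothetical in-memory state for one "main" resource.
accounts = {
    "445": {"owner": {"address": {"street": "1 Old Rd", "city": "Leeds"}}},
}


def post_owner_address(account_id, form):
    """Handle a hypothetical POST /account/{id}/owner/address.

    Only the submitted fields change; the response redirects the client
    to the main resource with 303 See Other.
    """
    accounts[account_id]["owner"]["address"].update(form)
    return 303, {"Location": f"/account/{account_id}"}


status, headers = post_owner_address("445", {"city": "Manchester"})
print(status, headers["Location"])  # 303 /account/445
```

After following the redirect, the client fetches the full, updated representation from the main resource with a GET.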
EDIT: On the general question why partial updates cannot be idempotent:
A partial update cannot be idempotent in general, because whether it is idempotent depends on the media type's semantics. In other words, you might be able to specify a format that allows for idempotent patches, but PATCH cannot be guaranteed to be idempotent in every case. Since the semantics of a method cannot be a function of the media type (for orthogonality reasons), PATCH needs to be defined as non-idempotent. And PUT, being defined as idempotent, cannot be used for partial updates.
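The dependence on the patch format can be sketched with two made-up formats. Both functions below are hypothetical illustrations, not any standardized patch media type: one sets fields to given values, the other carries an "increment" operation.

```python
def apply_merge_patch(doc, patch):
    """Merge-style patch (hypothetical format): set the listed fields to the given values."""
    return {**doc, **patch}


def apply_op_patch(doc, ops):
    """Operation-style patch (hypothetical format): each op is ("increment", field, amount)."""
    result = dict(doc)
    for op, field, amount in ops:
        if op != "increment":
            raise ValueError(f"unsupported op: {op}")
        result[field] = result[field] + amount
    return result


doc = {"name": "widget", "stock": 5}

# Applying the merge-style patch twice gives the same result as applying it once.
merged_once = apply_merge_patch(doc, {"stock": 7})
merged_twice = apply_merge_patch(merged_once, {"stock": 7})

# Applying the operation-style patch twice does not.
inc = [("increment", "stock", 2)]
inc_once = apply_op_patch(doc, inc)
inc_twice = apply_op_patch(inc_once, inc)

print(merged_once == merged_twice)            # True
print(inc_once["stock"], inc_twice["stock"])  # 7 9
```

The method is the same in both cases; only the body format decides whether a repeat is harmless.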
With a full document update, it's obvious, without knowing any details of the particular API or what its limitations on the document structure are, what the resulting document will be after the update.
If a certain method was known to never be a partial content update, and an API someone provided only supported that method, then it would always be clear what someone using the API would have to do to change a document to have a given set of valid contents.
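As a rough illustration of this point, the sketch below applies the same partial body under two plausible but different merge conventions, and then a full replacement. The function names and the `None` conventions are assumptions for the example; a real API could pick either convention or something else entirely.

```python
def merge_none_deletes(doc, partial):
    """One possible convention: a None value removes the field."""
    out = dict(doc)
    for key, value in partial.items():
        if value is None:
            out.pop(key, None)
        else:
            out[key] = value
    return out


def merge_none_stored(doc, partial):
    """Another possible convention: a None value is stored as-is."""
    return {**doc, **partial}


def full_replace(doc, body):
    """Full update: the result is the body, whatever the previous state was."""
    return dict(body)


current = {"title": "Draft", "note": "check figures"}
partial = {"note": None}

print(merge_none_deletes(current, partial))  # {'title': 'Draft'}
print(merge_none_stored(current, partial))   # {'title': 'Draft', 'note': None}
print(full_replace(current, {"title": "Draft"}))  # {'title': 'Draft'}
```

With the partial body, the client has to know which convention the API uses to predict the result. With the full body, the result is simply what was sent.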
Short answer: ACIDity of the PUT operation and the state of the updated entity.
Long answer:
RFC 2616, Section 9.5: the POST method requests that the enclosed entity be accepted as a new subordinate of the resource identified by the Request-URI. Section 9.6: the PUT method requests that the enclosed entity be stored under the supplied Request-URI.
Since the semantics of every POST is to create a new entity instance on the server, POST constitutes an ACID operation. But repeating the same POST twice with the same entity in the body might still produce a different outcome, for example if the server has run out of storage for the new instance that needs to be created. Thus, POST is not idempotent.
PUT, on the other hand, has the semantics of updating an existing entity. Even if a partial update is idempotent, there is no guarantee that it is also ACID and results in a consistent and valid entity state. Thus, to ensure ACIDity, PUT semantics require the full entity to be sent. Even if idempotency was not a goal for the HTTP protocol authors, it would arise as a side effect of the attempt to enforce ACID.
Of course, if the HTTP server has close knowledge of the semantic of the entities, it can allow partial PUTs, since it can ensure through server-side logic the consistency of the entity. This however requires tight coupling between the data and the server.
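A small sketch of what such server-side logic might look like, assuming a made-up "booking" entity with one invariant (`start_day` must not be after `end_day`). The names and the invariant are hypothetical; the point is only that the server must know the entity's rules in order to check the merged result before committing it.

```python
def validate_booking(booking):
    """Domain rule known only to this particular server (hypothetical)."""
    if booking["start_day"] > booking["end_day"]:
        raise ValueError("start_day must not be after end_day")


def partial_update(entity, changes, validate):
    """Merge the changes, then validate the whole candidate before accepting it."""
    candidate = {**entity, **changes}
    validate(candidate)  # raises if the merged entity would be inconsistent
    return candidate


booking = {"start_day": 3, "end_day": 5}

accepted = partial_update(booking, {"end_day": 9}, validate_booking)

try:
    partial_update(booking, {"start_day": 7}, validate_booking)
    rejected = False
except ValueError:
    rejected = True

print(accepted)  # {'start_day': 3, 'end_day': 9}
print(rejected)  # True
```

The validator is specific to one entity type, which is the tight coupling between data and server mentioned above.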
My guess is that partial updates would have translated into inconsistent "views" of the state when multiple concurrent clients access it. As far as I can tell, there is no "partial document" semantics in REST, and the benefit of adding one probably wasn't worth the complexity of dealing with it in the context of concurrency.
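One way to picture the concern, as a deterministic sketch without real threads or HTTP: two hypothetical clients read the same address document, and each changes one field. The data and helper names are invented for illustration.

```python
def apply_partial(doc, changes):
    """Partial update: only the sent fields change."""
    return {**doc, **changes}


def apply_full(doc, body):
    """Full update: the sent document replaces the stored one."""
    return dict(body)


initial = {"street": "1 Old Rd", "city": "Leeds"}

# Both clients start from the same view of the document.
intended_a = {**initial, "street": "2 New St"}  # client A expects this whole document
intended_b = {**initial, "city": "York"}        # client B expects this whole document

# Partial updates, A then B: the stored result matches neither client's expectation.
partial_result = apply_partial(apply_partial(initial, {"street": "2 New St"}), {"city": "York"})

# Full updates, A then B: the last writer wins, and the result is a document
# that one of the clients actually sent.
full_result = apply_full(apply_full(initial, intended_a), intended_b)

print(partial_result)             # {'street': '2 New St', 'city': 'York'}
print(full_result == intended_b)  # True
```

Full replacement still loses client A's change, so it does not solve concurrency by itself; it only guarantees that the stored document is one that some client saw as a whole.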
If the document is big, nothing prevents you from building multiple independent documents and having an overarching document that ties them together. Furthermore, once all the bits and pieces have been collected, a new document can be collated on the server.
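A minimal sketch of that workaround, using an in-memory dictionary as a stand-in for the server. The paths (`/report/1/...`), the `parts` field, and the `collate` helper are assumptions made up for this example.

```python
store = {}


def put(path, body):
    """Full replacement of the document stored at the given path."""
    store[path] = dict(body)


def collate(path):
    """Assemble one combined document from the parts the overarching document lists."""
    return {part.rsplit("/", 1)[1]: store[part] for part in store[path]["parts"]}


# Independent documents, each small enough to be PUT as a whole.
put("/report/1/summary", {"text": "first draft"})
put("/report/1/details", {"rows": 120})

# The overarching document only ties the parts together.
put("/report/1", {"parts": ["/report/1/summary", "/report/1/details"]})

# Changing one part is a full PUT of that part alone.
put("/report/1/summary", {"text": "final"})

print(collate("/report/1"))  # {'summary': {'text': 'final'}, 'details': {'rows': 120}}
```

Each request is still a complete representation of the resource it targets, so PUT keeps its defined meaning while the payloads stay small.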
So, considering that one can work around this "limitation", I can understand why this feature didn't make the cut.