What proven design patterns exist for batch operations on resources within a REST style web service?
I'm trying to strike a balance between ideals and reality in terms of performance and stability. We've got an API right now where all operations either retrieve from a list resource (e.g. GET /user) or act on a single instance (PUT /user/1, DELETE /user/22, etc.).
There are some cases where you want to update a single field of a whole set of objects. It seems very wasteful to send the entire representation for each object back and forth to update the one field.
In an RPC style API, you could have a method:
/mail.do?method=markAsRead&messageIds=1,2,3,4... etc.
What's the REST equivalent here? Or is it OK to compromise now and then? Does it ruin the design to add a few specific operations where they really improve performance? The client in all cases right now is a web browser (a JavaScript application on the client side).
Your language, "It seems very wasteful...", indicates to me an attempt at premature optimization. Unless it can be shown that sending the entire representation of objects is a major performance hit (meaning a delay users find unacceptable, say more than 150 ms), there's no point in creating new non-standard API behaviour. Remember, the simpler the API, the easier it is to use.
For deletes, send only the identifiers of the objects, as the server doesn't need to know anything about the state of an object before the delete occurs.
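A minimal sketch of such a delete request, written in Python purely for illustration. The `/emails` path, the body shape, and the `build_bulk_delete` helper are assumptions rather than part of any existing API; the point is only that the payload carries identifiers and nothing else.

```python
import json

def build_bulk_delete(ids):
    """Return (method, path, body) for a hypothetical bulk delete of emails."""
    # Only the ids are sent; the server needs no other state to delete a record.
    body = json.dumps([{"id": i} for i in ids])
    return "DELETE", "/emails", body

method, path, body = build_bulk_delete([1, 2])
print(method, path, body)  # DELETE /emails [{"id": 1}, {"id": 2}]
```

HTTP does not define semantics for a body on DELETE, so some servers or proxies may ignore or reject it. If that is a concern, the same identifiers could travel in the query string instead.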
If an application does run into performance issues with the bulk update of objects, consider breaking each object up into multiple smaller objects. That way the JSON payload is a fraction of the size.
As an example, when sending a request to update the "read" and "archived" statuses of two separate emails, you would have to send the full representation of both emails, unchanged fields included.
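A rough sketch of what that full-representation update could look like. The field names come from this discussion, but the values, the `/emails` path, and the `build_full_update` helper are invented for illustration.

```python
import json

# Hypothetical stored emails; every value here is made up.
EMAILS = [
    {"id": 1, "to": "ann@example.com", "from": "bob@example.com",
     "subject": "Lunch?", "text": "Are you free at noon?",
     "read": False, "archived": False, "importance": 1, "labels": ["personal"]},
    {"id": 2, "to": "ann@example.com", "from": "ops@example.com",
     "subject": "Build failed", "text": "See the attached log.",
     "read": False, "archived": False, "importance": 2, "labels": ["work"]},
]

def build_full_update(records, **changes):
    """Return (method, path, body) with every record sent in full, changes applied."""
    # Copies each record so the originals are not mutated.
    updated = [{**record, **changes} for record in records]
    return "PUT", "/emails", json.dumps(updated)

method, path, body = build_full_update(EMAILS, read=True, archived=True)
print(method, path)
print(body)
```

Only two fields per email actually change, yet the recipient, sender, subject, and text all travel with the request.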
I would split out the mutable components of the email (read, archived, importance, labels) into a separate object as the others (to, from, subject, text) would never be updated.
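One way to picture the split, as a sketch only. The `/email-statuses` resource name, the sample record, and both helper functions are assumptions made for this example.

```python
import json

# Fields the discussion treats as mutable; everything else stays with the email itself.
MUTABLE_FIELDS = ("read", "archived", "importance", "labels")

def split_email(record):
    """Split one email into (immutable content, mutable status), both keyed by id."""
    status = {"id": record["id"], **{name: record[name] for name in MUTABLE_FIELDS}}
    content = {name: value for name, value in record.items() if name not in MUTABLE_FIELDS}
    return content, status

def build_status_update(statuses):
    """Return (method, path, body) for updating only the status objects."""
    return "PUT", "/email-statuses", json.dumps(statuses)

email = {"id": 15, "to": "ann@example.com", "from": "bob@example.com",
         "subject": "Lunch?", "text": "Are you free at noon?",
         "read": False, "archived": False, "importance": 2, "labels": ["work"]}

content, status = split_email(email)
status.update(read=True, archived=True)
method, path, body = build_status_update([status])
print(method, path, body)
print(len(body), "characters, versus", len(json.dumps([email])), "for the full email")
```

The status object can still be sent whole with a plain PUT, which keeps the API conventional while shrinking the payload.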
Another approach is to use PATCH, which explicitly indicates which properties you intend to update and that all others should be left untouched.
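A small sketch of that idea, with a client-side request builder and a server-side merge. The `/emails` path, the in-memory `store`, and the function names are hypothetical; a real service would add validation and error handling.

```python
import json

def build_patch(ids, changes):
    """Return (method, path, body) listing only the fields to change for each id."""
    return "PATCH", "/emails", json.dumps([{"id": i, **changes} for i in ids])

def apply_patch(store, patch_body):
    """Server-side sketch: merge the supplied fields into each stored record."""
    for item in json.loads(patch_body):
        fields = {name: value for name, value in item.items() if name != "id"}
        # Fields absent from the patch are left exactly as they were.
        store[item["id"]].update(fields)
    return store

store = {
    1: {"id": 1, "subject": "Lunch?", "read": False, "archived": False},
    2: {"id": 2, "subject": "Build failed", "read": False, "archived": False},
}

method, path, body = build_patch([1, 2], {"read": True, "archived": True})
apply_patch(store, body)
print(method, path, body)
print(store[1])
```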
People state that PATCH should be implemented by providing an array of changes, each containing an action (CRUD), a path (URL), and the value change. This may be considered a standard implementation, but in the context of an entire REST API it is a non-intuitive one-off. Also, sending a partial representation, as described above, is how GitHub has implemented PATCH.
To sum it up, it is possible to adhere to RESTful principles with batch actions and still have acceptable performance.
A simple RESTful pattern for batches is to make use of a collection resource, for example to delete several messages at once.
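As an illustration, the ids can be passed as repeated query parameters on the collection. The `/mail` path and the `id` parameter name are assumptions for this sketch.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def build_collection_delete(ids):
    """Return (method, url) deleting several messages through the collection resource."""
    # doseq=True expands the list into one id=... pair per element.
    query = urlencode({"id": list(ids)}, doseq=True)
    return "DELETE", f"/mail?{query}"

method, url = build_collection_delete([0, 1, 2])
print(method, url)  # DELETE /mail?id=0&id=1&id=2

# A server could recover the ids from the query string like this.
requested = [int(value) for value in parse_qs(urlsplit(url).query)["id"]]
print(requested)  # [0, 1, 2]
```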
It's a little more complicated to batch update partial resources, or resource attributes. That is, update each markedAsRead attribute. Basically, instead of treating the attribute as part of each resource, you treat it as a bucket into which to put resources. One example was already posted. I adjusted it a little.
Basically, you are updating the list of mail marked as read.
You can also use this for assigning several items to the same category.
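Both bucket-style updates can be sketched with one helper. The query parameter names, the `{"ids": [...]}` body, and `build_bucket_update` are illustrative assumptions, not an established convention.

```python
import json
from urllib.parse import urlencode

def build_bucket_update(attribute, value, ids):
    """Return (method, url, body) that puts the given ids into the attribute's bucket."""
    # The attribute selects the bucket; the body lists the resources to place in it.
    query = urlencode({attribute: value})
    return "POST", f"/mail?{query}", json.dumps({"ids": list(ids)})

print(*build_bucket_update("markedAsRead", "true", [0, 1, 2]))
# POST /mail?markedAsRead=true {"ids": [0, 1, 2]}
print(*build_bucket_update("category", "junk", [0, 1, 2]))
# POST /mail?category=junk {"ids": [0, 1, 2]}
```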
It's obviously much more complicated to do iTunes-style batch partial updates (e.g., artist+albumTitle but not trackTitle). The bucket analogy starts to break down.
In the long run, it's much easier to update a single partial resource, or resource attributes. Just make use of a subresource.
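A subresource update might look like the following sketch, where the attribute gets its own URL under the message. The path layout and the helper name are assumptions.

```python
import json

def build_attribute_update(mail_id, attribute, value):
    """Return (method, path, body) addressing one attribute of one message."""
    # The body is just the new value, since the URL already names the attribute.
    return "POST", f"/mail/{mail_id}/{attribute}", json.dumps(value)

print(*build_attribute_update(0, "markedAsRead", True))  # POST /mail/0/markedAsRead true
```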
Alternatively, you could use parameterized resources. This is less common in REST patterns, but is allowed in the URI and HTTP specs. A semicolon divides horizontally related parameters within a resource.
Update several attributes, several resources:
Update several resources, just one attribute:
Update several attributes, just one resource:
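The three cases above can be sketched with a single helper that joins ids and attribute names with semicolons. The exact path layout, the JSON body, and both function names are assumptions for illustration; the thread does not prescribe a wire format.

```python
import json

def build_parameterized_update(ids, attributes):
    """Return (method, path, body) for updating `attributes` on every id in `ids`."""
    id_segment = ";".join(str(i) for i in ids)
    attr_segment = ";".join(attributes)  # joins the attribute names (dict keys)
    return "POST", f"/mail/{id_segment}/{attr_segment}", json.dumps(attributes)

def parse_parameterized_path(path):
    """Server-side sketch: recover the ids and attribute names from the path."""
    _, _collection, id_segment, attr_segment = path.split("/")
    return [int(i) for i in id_segment.split(";")], attr_segment.split(";")

# Several attributes, several resources.
print(*build_parameterized_update([0, 1, 2], {"markedAsRead": True, "category": "junk"}))
# Several resources, just one attribute.
print(*build_parameterized_update([0, 1, 2], {"markedAsRead": True}))
# Several attributes, just one resource.
print(*build_parameterized_update([0], {"markedAsRead": True, "category": "junk"}))

print(parse_parameterized_path("/mail/0;1;2/markedAsRead;category"))
```

The single-resource, single-attribute case collapses to the plain subresource URL, so one routing rule could cover all four shapes.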
The RESTful creativity abounds.