Patterns for handling batch operations in REST web

2019-01-05 07:37发布

What proven design patterns exist for batch operations on resources within a REST style web service?

I'm trying to be strike a balance between ideals and reality in terms of performance and stability. We've got an API right now where all operations either retrieve from a list resource (ie: GET /user) or on a single instance (PUT /user/1, DELETE /user/22, etc).

There are some cases where you want to update a single field of a whole set of objects. It seems very wasteful to send the entire representation for each object back and forth to update the one field.

In an RPC style API, you could have a method:

/mail.do?method=markAsRead&messageIds=1,2,3,4... etc. 

What's the REST equivalent here? Or is it ok to compromise now and then. Does it ruin the design to add in a few specific operations where it really improves the performance, etc? The client in all cases right now is a Web Browser (javascript application on the client side).

8条回答
一夜七次
2楼-- · 2019-01-05 08:07

Your language, "It seems very wasteful...", to me indicates an attempt at premature optimization. Unless it can be shown that sending the entire representation of objects is a major performance hit (we're talking unacceptable to users as > 150ms) then there's no point in attempting to create a new non-standard API behaviour. Remember, the simpler the API the easier it is to use.

For deletes send the following as the server doesn't need to know anything about the state of the object before the delete occurs.

DELETE /emails
POSTDATA: [{id:1},{id:2}]

The next thought is that if an application is running into performance issues regarding the bulk update of objects then consideration into breaking each object up into multiple objects should be given. That way the JSON payload is a fraction of the size.

As an example when sending a response to update the "read" and "archived" statuses of two separate emails you would have to send the following:

PUT /emails
POSTDATA: [
            {
              id:1,
              to:"someone@bratwurst.com",
              from:"someguy@frommyville.com",
              subject:"Try this recipe!",
              text:"1LB Pork Sausage, 1 Onion, 1T Black Pepper, 1t Salt, 1t Mustard Powder",
              read:true,
              archived:true,
              importance:2,
              labels:["Someone","Mustard"]
            },
            {
              id:2,
              to:"someone@bratwurst.com",
              from:"someguy@frommyville.com",
              subject:"Try this recipe (With Fix)",
              text:"1LB Pork Sausage, 1 Onion, 1T Black Pepper, 1t Salt, 1T Mustard Powder, 1t Garlic Powder",
              read:true,
              archived:false,
              importance:1,
              labels:["Someone","Mustard"]
            }
            ]

I would split out the mutable components of the email (read, archived, importance, labels) into a separate object as the others (to, from, subject, text) would never be updated.

PUT /email-statuses
POSTDATA: [
            {id:15,read:true,archived:true,importance:2,labels:["Someone","Mustard"]},
            {id:27,read:true,archived:false,importance:1,labels:["Someone","Mustard"]}
          ]

Another approach to take is to leverage the use of a PATCH. To explicitly indicate which properties you are intending to update and that all others should be ignored.

PATCH /emails
POSTDATA: [
            {
              id:1,
              read:true,
              archived:true
            },
            {
              id:2,
              read:true,
              archived:false
            }
          ]

People state that PATCH should be implemented by providing an array of changes containing: action (CRUD), path (URL), and value change. This may be considered a standard implementation but if you look at the entirety of a REST API it is a non-intuitive one-off. Also, the above implementation is how GitHub has implemented PATCH.

To sum it up, it is possible to adhere to RESTful principles with batch actions and still have acceptable performance.

查看更多
3楼-- · 2019-01-05 08:12

A simple RESTful pattern for batches is to make use of a collection resource. For example, to delete several messages at once.

DELETE /mail?&id=0&id=1&id=2

It's a little more complicated to batch update partial resources, or resource attributes. That is, update each markedAsRead attribute. Basically, instead of treating the attribute as part of each resource, you treat it as a bucket into which to put resources. One example was already posted. I adjusted it a little.

POST /mail?markAsRead=true
POSTDATA: ids=[0,1,2]

Basically, you are updating the list of mail marked as read.

You can also use this for assigning several items to the same category.

POST /mail?category=junk
POSTDATA: ids=[0,1,2]

It's obviously much more complicated to do iTunes-style batch partial updates (e.g., artist+albumTitle but not trackTitle). The bucket analogy starts to break down.

POST /mail?markAsRead=true&category=junk
POSTDATA: ids=[0,1,2]

In the long run, it's much easier to update a single partial resource, or resource attributes. Just make use of a subresource.

POST /mail/0/markAsRead
POSTDATA: true

Alternatively, you could use parameterized resources. This is less common in REST patterns, but is allowed in the URI and HTTP specs. A semicolon divides horizontally related parameters within a resource.

Update several attributes, several resources:

POST /mail/0;1;2/markAsRead;category
POSTDATA: markAsRead=true,category=junk

Update several resources, just one attribute:

POST /mail/0;1;2/markAsRead
POSTDATA: true

Update several attributes, just one resource:

POST /mail/0/markAsRead;category
POSTDATA: markAsRead=true,category=junk

The RESTful creativity abounds.

查看更多
登录 后发表回答