Patterns for handling batch operations in REST web

2019-01-05 07:37发布

What proven design patterns exist for batch operations on resources within a REST style web service?

I'm trying to be strike a balance between ideals and reality in terms of performance and stability. We've got an API right now where all operations either retrieve from a list resource (ie: GET /user) or on a single instance (PUT /user/1, DELETE /user/22, etc).

There are some cases where you want to update a single field of a whole set of objects. It seems very wasteful to send the entire representation for each object back and forth to update the one field.

In an RPC style API, you could have a method:

/mail.do?method=markAsRead&messageIds=1,2,3,4... etc. 

What's the REST equivalent here? Or is it ok to compromise now and then. Does it ruin the design to add in a few specific operations where it really improves the performance, etc? The client in all cases right now is a Web Browser (javascript application on the client side).

8条回答
萌系小妹纸
2楼-- · 2019-01-05 07:56

Great post. I've been searching for a solution for a few days. I came up with a solution of using passing a query string with a bunch IDs separated by commas, like:

DELETE /my/uri/to/delete?id=1,2,3,4,5

...then passing that to a WHERE IN clause in my SQL. It works great, but wonder what others think of this approach.

查看更多
爷的心禁止访问
3楼-- · 2019-01-05 07:57

Not at all -- I think the REST equivalent is (or at least one solution is) almost exactly that -- a specialized interface designed accommodate an operation required by the client.

I'm reminded of a pattern mentioned in Crane and Pascarello's book Ajax in Action (an excellent book, by the way -- highly recommended) in which they illustrate implementing a CommandQueue sort of object whose job it is to queue up requests into batches and then post them to the server periodically.

The object, if I remember correctly, essentially just held an array of "commands" -- e.g., to extend your example, each one a record containing a "markAsRead" command, a "messageId" and maybe a reference to a callback/handler function -- and then according to some schedule, or on some user action, the command object would be serialized and posted to the server, and the client would handle the consequent post-processing.

I don't happen to have the details handy, but it sounds like a command queue of this sort would be one way to handle your problem; it'd reduce the overall chattiness substantially, and it'd abstract the server-side interface in a way you might find more flexible down the road.


Update: Aha! I've found a snip from that very book online, complete with code samples (although I still suggest picking up the actual book!). Have a look here, beginning with section 5.5.3:

This is easy to code but can result in a lot of very small bits of traffic to the server, which is inefficient and potentially confusing. If we want to control our traffic, we can capture these updates and queue them locally and then send them to the server in batches at our leisure. A simple update queue implemented in JavaScript is shown in listing 5.13. [...]

The queue maintains two arrays. queued is a numerically indexed array, to which new updates are appended. sent is an associative array, containing those updates that have been sent to the server but that are awaiting a reply.

Here are two pertinent functions -- one responsible for adding commands to the queue (addCommand), and one responsible for serializing and then sending them to the server (fireRequest):

CommandQueue.prototype.addCommand = function(command)
{ 
    if (this.isCommand(command))
    {
        this.queue.append(command,true);
    }
}

CommandQueue.prototype.fireRequest = function()
{
    if (this.queued.length == 0)
    { 
        return; 
    }

    var data="data=";

    for (var i = 0; i < this.queued.length; i++)
    { 
        var cmd = this.queued[i]; 
        if (this.isCommand(cmd))
        {
            data += cmd.toRequestString(); 
            this.sent[cmd.id] = cmd;

            // ... and then send the contents of data in a POST request
        }
    }
}

That ought to get you going. Good luck!

查看更多
\"骚年 ilove
4楼-- · 2019-01-05 08:00

I would be tempted in an operation like the one in your example to write a range parser.

It's not a lot of bother to make a parser that can read "messageIds=1-3,7-9,11,12-15". It would certainly increase efficiency for blanket operations covering all messages and is more scalable.

查看更多
成全新的幸福
5楼-- · 2019-01-05 08:03

From my point of view I think Facebook has the best implementation.

A single HTTP request is made with a batch parameter and one for a token.

In batch a json is sent. which contains a collection of "requests". Each request has a method property (get / post / put / delete / etc ...), and a relative_url property (uri of the endpoint), additionally the post and put methods allow a "body" property where the fields to be updated are sent .

more info at: Facebook batch API

查看更多
乱世女痞
6楼-- · 2019-01-05 08:04

While I think @Alex is along the right path, conceptually I think it should be the reverse of what is suggested.

The URL is in effect "the resources we are targeting" hence:

    [GET] mail/1

means get the record from mail with id 1 and

    [PATCH] mail/1 data: mail[markAsRead]=true

means patch the mail record with id 1. The querystring is a "filter", filtering the data returned from the URL.

    [GET] mail?markAsRead=true

So here we are requesting all the mail already marked as read. So to [PATCH] to this path would be saying "patch the records already marked as true"... which isn't what we are trying to achieve.

So a batch method, following this thinking should be:

    [PATCH] mail/?id=1,2,3 <the records we are targeting> data: mail[markAsRead]=true

of course I'm not saying this is true REST (which doesnt permit batch record manipulation), rather it follows the logic already existing and in use by REST.

查看更多
仙女界的扛把子
7楼-- · 2019-01-05 08:05

The google drive API has a really interesting system to solve this problem (see here).

What they do is basically grouping different requests in one Content-Type: multipart/mixed request, with each individual complete request separated by some defined delimiter. Headers and query parameter of the batch request are inherited to the individual requests (i.e. Authorization: Bearer some_token) unless they are overridden in the individual request.


Example: (taken from their docs)

Request:

POST https://www.googleapis.com/batch

Accept-Encoding: gzip
User-Agent: Google-HTTP-Java-Client/1.20.0 (gzip)
Content-Type: multipart/mixed; boundary=END_OF_PART
Content-Length: 963

--END_OF_PART
Content-Length: 337
Content-Type: application/http
content-id: 1
content-transfer-encoding: binary


POST https://www.googleapis.com/drive/v3/files/fileId/permissions?fields=id
Authorization: Bearer authorization_token
Content-Length: 70
Content-Type: application/json; charset=UTF-8


{
  "emailAddress":"example@appsrocks.com",
  "role":"writer",
  "type":"user"
}
--END_OF_PART
Content-Length: 353
Content-Type: application/http
content-id: 2
content-transfer-encoding: binary


POST https://www.googleapis.com/drive/v3/files/fileId/permissions?fields=id&sendNotificationEmail=false
Authorization: Bearer authorization_token
Content-Length: 58
Content-Type: application/json; charset=UTF-8


{
  "domain":"appsrocks.com",
   "role":"reader",
   "type":"domain"
}
--END_OF_PART--

Response:

HTTP/1.1 200 OK
Alt-Svc: quic=":443"; p="1"; ma=604800
Server: GSE
Alternate-Protocol: 443:quic,p=1
X-Frame-Options: SAMEORIGIN
Content-Encoding: gzip
X-XSS-Protection: 1; mode=block
Content-Type: multipart/mixed; boundary=batch_6VIxXCQbJoQ_AATxy_GgFUk
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
Date: Fri, 13 Nov 2015 19:28:59 GMT
Cache-Control: private, max-age=0
Vary: X-Origin
Vary: Origin
Expires: Fri, 13 Nov 2015 19:28:59 GMT

--batch_6VIxXCQbJoQ_AATxy_GgFUk
Content-Type: application/http
Content-ID: response-1


HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Date: Fri, 13 Nov 2015 19:28:59 GMT
Expires: Fri, 13 Nov 2015 19:28:59 GMT
Cache-Control: private, max-age=0
Content-Length: 35


{
 "id": "12218244892818058021i"
}


--batch_6VIxXCQbJoQ_AATxy_GgFUk
Content-Type: application/http
Content-ID: response-2


HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Date: Fri, 13 Nov 2015 19:28:59 GMT
Expires: Fri, 13 Nov 2015 19:28:59 GMT
Cache-Control: private, max-age=0
Content-Length: 35


{
 "id": "04109509152946699072k"
}


--batch_6VIxXCQbJoQ_AATxy_GgFUk--
查看更多
登录 后发表回答