I am writing a REST API for a service that will accept user-contributed data. I would like to keep all operations completely asynchronous, including PUT, POST, DELETE and perhaps even GET requests. My idea is to receive the request, process it enough to ensure it is valid, and then return an HTTP 202 Accepted response along with a URL where the data will eventually be available and a token so that subsequent requests can be matched to processed data. If the request is invalid then I will send an HTTP 400.
The client will then be responsible for checking the URL I provided at some time in the future, passing along the token. If the data is available I return a normal 200 or 201, but if I am still processing the request I will send another 202 indicating the processing hasn't completed. In case of errors processing the data I will send a 4xx or 5xx status as necessary.
The reason I want to do this is so I can dump all valid requests into a request pool and have workers pull from the queue and process requests as they are available. Since I don't know the pool size or number of workers available I can't be certain that I can get to requests fast enough to satisfy the 30 second limit of Google App Engine.
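To make the proposed flow concrete, here is a minimal in-memory sketch of it in Python. Everything in it is an assumption for illustration: the names submit, work_once and poll, the /results/{token} URL shape, and the use of a plain dict and queue.Queue in place of a real datastore and task queue.

```python
import queue
import uuid

# Hypothetical in-memory stand-ins for a real task queue and datastore.
pending = queue.Queue()
results = {}

def submit(payload):
    """Validate the request, enqueue it, and answer immediately."""
    if not isinstance(payload, dict) or "data" not in payload:
        return 400, {}, {"error": "missing 'data'"}
    token = uuid.uuid4().hex
    results[token] = None          # None marks "accepted, not processed yet"
    pending.put((token, payload))
    return 202, {"Location": f"/results/{token}"}, {"token": token}

def work_once():
    """What a single worker does: pull one request and store its result."""
    token, payload = pending.get_nowait()
    results[token] = {"echo": payload["data"]}

def poll(token):
    """Answer a follow-up request for a previously issued token."""
    if token not in results:
        return 404, None
    if results[token] is None:
        return 202, None           # still processing
    return 200, results[token]
```

The request handler only validates and enqueues, so its response time does not depend on how many workers are available.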
My question is: am I perverting REST by processing requests in this manner? Browsers, for instance, seem to require immediate responses to requests. For my HTML pages I plan to respond with the structured page and then use AJAX to process the data requests.
I'm mostly interested in any opinions or experience in processing data using REST in this manner.
This is a really old question, but I would like to offer up a slightly different view of this, which I do not claim to be correct, just my view.
From the client perspective
Let's start off with the initial HTTP request. First and foremost, the request should be POST. You are sending a message to the server to create a resource. GET and PUT are not valid in this case because:
From the service perspective
So now you are sending a POST to the server to process a request. The server has really 3 possible return values (not including the 4xx and 5xx errors):
When the service has completed the request successfully, it will create the resource at the location that was returned to the client.
Now this is where I start seeing things a little differently from the response above.
If the service fails to complete the request, it should still create a resource at the location that was returned to the client. This resource should indicate the reason for the failure. It is much more flexible to have a resource provide failure information than to try to shoe-horn it into the HTTP protocol.
If the service gets the request for this resource before it is completed, it should return a "404 Not Found". The reason I believe that it should be a "404 Not Found" is because it really does not exist. The HTTP specifications do not say that "404 Not Found" can only be used for when a resource is never going to exist, just that it doesn't exist there now. This type of response to an asynchronous polling flow is completely correct in my opinion.
There is also the scenario where a resource is only supposed to be there for a fixed time. For example, it may be data based on a source that is refreshed nightly. What should happen in these cases is that the resource is removed, but the service keeps an indicator so that it knows to return a "410 Gone" status code. This tells the client that the resource was here, but is no longer available (i.e. it may have expired). The typical action from the client would be to resubmit the request.
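The service-side rules above can be sketched as a small lookup function. This is only an illustration under my own assumptions: the State enum, the store dict and the get_resource name are hypothetical, and serving the failure resource with a 200 is my reading of "it should still create a resource".

```python
from enum import Enum

class State(Enum):
    PENDING = "pending"    # accepted, nothing created yet
    DONE = "done"          # resource exists and describes success
    FAILED = "failed"      # resource exists and describes the failure
    EXPIRED = "expired"    # resource existed but was removed

def get_resource(store, resource_id):
    """Return (status_code, body) for a GET on the polled location."""
    entry = store.get(resource_id)
    if entry is None or entry["state"] is State.PENDING:
        return 404, None   # does not exist (yet)
    if entry["state"] is State.EXPIRED:
        return 410, None   # was here, no longer available
    # Success and failure are both ordinary resources served with 200;
    # the body tells the client which one it got.
    return 200, {"status": entry["state"].value, "detail": entry["detail"]}
```

Keeping an EXPIRED marker instead of deleting the entry outright is what lets the service tell "gone" apart from "never existed".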
From the client perspective again
When the client gets the response for its initial POST, it takes the "Location" and makes a request to the service at that URL using a GET (again, not POST). The service will generally respond with these values:
The one thing that needs to be pointed out is that the resource that is returned is generally in a format that can define success and failure responses. The client should be able to determine from this resource if there was an error, what it was, and be able to respond accordingly.
Also, the service developer may make it so that the service expires and deletes the error resource after a short period of time.
So that's my thoughts on this question. It's very late to the party, but hopefully future readers may see a slightly different view to a commonly asked question.
FWIW, Microsoft Flow uses a pattern like this.
First call returns 202 w/ Location header. Followup calls return either:
1. If still processing --> 202 w/ a location header. The loc header can be different, which provides a way to pass state between calls (and potentially make the server stateless!).
2. If done --> 200.
Details at: https://github.com/jeffhollan/LogicAppsAsyncResponseSample
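A client for this pattern might look roughly like the sketch below. It is not taken from the linked sample: poll_until_done and the fetch callable are hypothetical names, and fetch stands in for whatever HTTP client you use, returning (status_code, headers, body).

```python
import time

def poll_until_done(fetch, location, max_attempts=10, delay=0.0):
    """Follow 202 + Location responses until a non-202 status arrives.

    `fetch` is a hypothetical callable taking a URL and returning
    (status_code, headers, body); swap in a real HTTP client as needed.
    """
    for _ in range(max_attempts):
        status, headers, body = fetch(location)
        if status != 202:
            return status, body
        # The server may hand back a different Location on each call.
        location = headers.get("Location", location)
        time.sleep(delay)
    raise TimeoutError("operation still pending after max_attempts polls")
```

Because the client always follows the most recent Location value, the server is free to encode its progress in that URL rather than keeping per-client state.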
Adding my two cents to an old question. My idea is similar to systempuntoout and Avi Flax's suggestions.
I agree that an HTTP 202 response is appropriate for the initial request, with a redirect to another resource via a Location header.
I think the Location URL should probably include the token you reference, to conform to common expectations of a Location redirect. For example, Location: /queue?token={unique_token} or Location: /task/{unique_token}.
I also think the resource used to check the status of the process should return an HTTP 200 response when the action of "checking the status" is successful (not an HTTP 202, because that implies the current request was "accepted").
However, I think that once the new entity has been created, "checking the status" should return an HTTP 303 (See Other) response with a Location header for the new entity. This is more appropriate than sending an HTTP 201, because nothing was created by the GET request just performed to check status.
I also think the resource used to check the status should return error codes appropriately. Whenever "checking the status" is performed successfully, an appropriate success code should be returned. Errors can be handled at the application level (by checking the response body).
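As a rough sketch of the status resource described in this answer, here is one way it could behave in Python. The check_status name, the tasks dict and its keys are hypothetical, and returning 404 for an unknown token is my own assumption rather than something stated above.

```python
def check_status(tasks, token):
    """Return (status_code, headers, body) for GET /task/{token}."""
    task = tasks.get(token)
    if task is None:
        return 404, {}, {"error": "unknown token"}
    if task.get("entity_url"):
        # Processing finished: point the client at the new entity.
        return 303, {"Location": task["entity_url"]}, None
    # The status check itself succeeded, so answer 200 and describe
    # the state (including application-level errors) in the body.
    return 200, {}, {"state": task["state"], "error": task.get("error")}
```

Most HTTP clients follow a 303 with a GET on the new Location, so the caller ends up at the created entity without extra logic.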
I think that your solution is fine, the Http status 202 is the proper response to use in this specific case indicating that the request has been accepted for processing, but the processing has not been completed.
What I would slightly change in your workflow are the Http status of the subsequent requests.
As you said, the 202 response should return a Location header specifying the URL that client should use to monitor the status of its previous request.
Calling this Check-the-status-of-my-process URL, instead of returning a 202 in case of process pending, I would return:
200 OK when the requested process is still pending. The Response should describe the pending status of the process.
201 Created when the processing has been completed. The Response in case of GET/PUT/POST should contain the Location to the requested/created/updated resource.