REST - entity (hash) version in URL vs etags for c

Posted 2019-05-08 15:50

I am designing a REST API and have lately been thinking about how to make the most of caching for dynamic content (after the response that I got on this topic), while respecting the principles of HTTP (and thus REST).

Obviously the canonical solution (at least in my understanding) is to use ETags, but this will not decrease the number of requests in any way, only the size of the responses.
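To illustrate why ETags save bytes but not round trips, here is a minimal sketch of a conditional GET. The function names are hypothetical, the ETag is assumed to be a truncated content hash, and the If-None-Match handling is simplified (no weak validators, no "*"). The client still sends a request every time; it just gets an empty 304 back when nothing changed.

```python
import hashlib
from typing import Optional

def make_etag(body: bytes) -> str:
    # Strong ETag derived from the content; the quotes are part of the ETag syntax.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_get(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, payload) for a GET with an optional If-None-Match header."""
    etag = make_etag(body)
    # no-cache: the client may store the response but must revalidate before reuse.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            # Unchanged: the request still happened, but no body is sent.
            return 304, headers, b""
    return 200, headers, body
```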

I was thinking of embedding a version in the URL (it will be produced by the server, based on the actual content, be it a serial number or some hash). I will explain the scheme and the user scenario and how I think it will help, and then ask my questions.

Setup

GET /entity/{id}/

returns a temporary redirect to /entity/{id}/{current_version} with no-cache headers.

GET /entity/{latest_version}/

returns an OK response that can be cached forever.

GET /entity/{old_version}/

returns 410 Gone (I don't want to actually keep old versions).

GET /entity/?[query]

is some search that returns a list of links to current versions of result entities. No cache.
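The routes above can be sketched as a single dispatch function. This is only an illustration under assumptions: the in-memory ENTITIES store and the handler name are hypothetical, the version is assumed to be a truncated content hash, versioned URLs are assumed to have the form /entity/{id}/{version}/, and "cache forever" is approximated by a one-year max-age with the immutable directive.

```python
import hashlib

# Hypothetical in-memory store: entity id -> current body. Old versions are not kept.
ENTITIES = {"42": b'{"name": "first draft"}'}

def version_of(body: bytes) -> str:
    # Assumed versioning rule: a short hash of the current content.
    return hashlib.sha256(body).hexdigest()[:12]

def handle_get(path: str):
    """Return (status, headers) for GET /entity/{id}/ or /entity/{id}/{version}/."""
    parts = [p for p in path.split("/") if p]
    if len(parts) not in (2, 3) or parts[0] != "entity" or parts[1] not in ENTITIES:
        return 404, {}
    entity_id = parts[1]
    current = version_of(ENTITIES[entity_id])
    if len(parts) == 2:
        # Unversioned URL: always redirect to the current version, never cached.
        return 302, {"Location": f"/entity/{entity_id}/{current}/",
                     "Cache-Control": "no-cache"}
    if parts[2] == current:
        # Current version: the content behind this URL never changes.
        return 200, {"Cache-Control": "private, max-age=31536000, immutable"}
    # Anything else is treated as an old version, since versions are not stored.
    return 410, {}
```

Because old versions are not stored, this sketch cannot tell an outdated version from one that never existed; both get 410.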

Use scenario and how I think it would help

The user application (AJAX) will always start with some kind of query, and then it has to pull the descriptions of the entities. Since the result set for a single client is not expected to change much between visits, it seems a good idea to use the above scheme: the client pulls fresh results from the query every time, but if most of the entities have not changed since the last visit, they will already be cached in the browser. If this hypothesis is true, it will lead to a significant decrease in the number of requests, as well as in total size.
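The hypothesis can be made concrete with a toy simulation. It assumes the browser cache behaves like a dictionary keyed by URL and that every versioned URL is cacheable forever; the URLs and the visit function are made up for illustration.

```python
def visit(result_urls, cache, fetch):
    """Simulate one visit: the query always hits the server, entities only on a cache miss."""
    requests = 1  # the uncached search query itself
    for url in result_urls:
        if url not in cache:
            cache[url] = fetch(url)
            requests += 1
    return requests

cache = {}
fetch = lambda url: f"body of {url}"

# First visit: nothing is cached yet.
first = visit(["/entity/1/aaa/", "/entity/2/bbb/", "/entity/3/ccc/"], cache, fetch)
# Second visit: only entity 2 changed, so only its new versioned URL is fetched.
second = visit(["/entity/1/aaa/", "/entity/2/bbb2/", "/entity/3/ccc/"], cache, fetch)
print(first, second)  # 4 2
```

With ETags the second visit would still cost four requests (the query plus three revalidations), only with smaller responses.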

Using ETags would result in a much simpler URI scheme, but probably a more complicated and heavier server-side implementation.

Notes and questions

1 I know somebody will propose that /entity/{id}/ should be a collection that returns a list of versions, but versions are not actually stored, useful or desired. It is more of a synonym for the latest one. My question here is whether somebody sees any problem with that, besides general principles. This is a protected API, I do not care about SEO in this case, and it is transparent for the client. Actually, as the API will be more or less hyperlinked, clients are not normally expected to call /entity/{id}/ directly, but to use whatever links the results return. It can be used, for example, for context-free links.

2 I have some doubts about 410 Gone for old versions. On one hand, this version is not available anymore and clients should not be accessing it anyway. On the other hand, if a client asks for it after all (for whatever reason), it may make sense to return a permanent redirect to /entity/{id}/ (probably better than a temporary redirect to the current version).

3 Speaking of redirects: 301 is cemented for permanent redirects, but is 302 the best choice for temporary ones? Browser support is the most important factor (it will be AJAX).

4 Of course, the main issue is the use of URLs instead of ETags for caching (relying on the browser caches). If somebody has real experience with this under high load (relative to the server's capabilities, cough), I would appreciate it if you shared it.

Additional notes

After some more research, there is an issue with versioned resources: the propagation of updates to linked resources. There are two options:

  1. Link a specific version of the resource. This means that the server-side logic will be heavy and cumbersome, as updates have to be propagated to the linking resources through reverse links;

  2. Link the /latest/ version. This means that even if the concrete versions of both the resource and the linked resource are cached locally, clients (browsers) will have to make a request to /latest/ in order to 'check' the latest version of a linked resource. Of course it is a small request (only a redirect), and if the resource didn't change, the target location is already cached. One problem may be that resources are often pulled through such links (as opposed to query results, which point to a particular version). Another (much worse) problem is that an old version of one resource ends up linking to the newest version of another, which can cause data inconsistency (e.g. somebody edited a document and also changed a linked attachment: the client will have the old version of the document and the new version of the attachment).

Both options are unsatisfactory. In this light, caching of dynamic data is possible only for 'leaf' level resources: ones that do not link to any other, but just have direct attribute values.
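The cost of the first option can be seen in a small sketch. It assumes, purely for illustration, that a resource's version is a hash over its own content plus the versions of everything it links to (the function name is made up). Changing a linked attachment then changes the version of the document that links to it, even though the document's own content is untouched, which is exactly the propagation the server would have to perform.

```python
import hashlib

def linked_version(own_content: str, linked_versions: list) -> str:
    """Version = hash of the resource's own content plus the versions it links to."""
    h = hashlib.sha256(own_content.encode())
    for v in linked_versions:
        h.update(v.encode())
    return h.hexdigest()[:12]

attachment_v1 = linked_version("attachment, first upload", [])
attachment_v2 = linked_version("attachment, second upload", [])

# Same document text, but it links to a specific attachment version.
doc_before = linked_version("document text", [attachment_v1])
doc_after = linked_version("document text", [attachment_v2])

print(doc_before != doc_after)  # True: the attachment edit invalidates the document URL too
```

A leaf resource has an empty list of links, so its version depends only on its own content and nothing needs to be propagated.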

Final notes

After research and discussions, versioned resources are not the brightest idea as a general architecture. After measurement and given the opportunity, something can be retrofitted onto a canonical API for 'plain' resources. I would accept Roysvork's comment ('It is my opinion that the reason this is difficult is that it is not really a very good idea.') as the solution, if it were a separate answer :)

Tags: rest caching