Best way to cache RESTful API results of GET calls

2019-01-31 13:17发布

I'm thinking about the best way to create a cache layer in front or as first layer for GET requests to my RESTful API (written in Ruby).

Not every request can be cached, because even for some GET requests the API has to validate the requesting user / application. That means I need to configure which request is cacheable and how long each cached answer is valid. For a few cases I need a very short expiration time of e.g. 15s and below. And I should be able to let cache entries expire by the API application even if the expiration date is not reached yet.

I already thought about many possible solutions, my two best ideas:

  • first layer of the API (even before the routing), cache logic by myself (to have all configuration options in my hand), answers and expiration date stored to Memcached

  • a webserver proxy (high configurable), perhaps something like Squid but I never used a proxy for a case like this before and I'm absolutely not sure about it

I also thought about a cache solution like Varnish, I used Varnish for "usual" web applications and it's impressive but the configuration is kind of special. But I would use it if it's the fastest solution.

An other thought was to cache to the Solr Index, which I'm already using in the data layer to not query the database for most requests.

If someone has a hint or good sources to read about this topic, let me know.

5条回答
家丑人穷心不美
2楼-- · 2019-01-31 13:23

Redis Cache is best option. check here.

It is open source. Advanced key-value cache and store.

查看更多
3楼-- · 2019-01-31 13:27

Firstly, build your RESTful API to be RESTful. That means authenticated users can also get cached content as to keep all state in the URL it needs to contain the auth details. Of course the hit rate will be lower here, but it is cacheable.

With a good deal of logged in users it will be very beneficial to have some sort of model cache behind a full page cache as many models are still shared even if some aren't (in a good OOP structure).

Then for a full page cache you are best of to keep all the requests off the web server and especially away from the dynamic processing in the next step (in your case Ruby). The fastest way to cache full pages from a normal web server is always a caching proxy in front of the web servers.

Varnish is in my opinion as good and easy as it gets, but some prefer Squid indeed.

查看更多
\"骚年 ilove
4楼-- · 2019-01-31 13:35

Since REST is an HTTP thing, it could be that the best way of caching requests is to use HTTP caching.

Look into using ETags on your responses, checking the ETag in requests to reply with '304 Not Modified' and having Rack::Cache to serve cached data if the ETags are the same. This works great for cache-control 'public' content.

Rack::Cache is best configured to use memcache for its storage needs.

I wrote a blog post last week about the interesting way that Rack::Cache uses ETags to detect and return cached content to new clients: http://blog.craz8.com/articles/2012/12/19/rack-cache-and-etags-for-even-faster-rails

Even if you're not using Rails, the Rack middleware tools are quite good for this stuff.

查看更多
兄弟一词,经得起流年.
5楼-- · 2019-01-31 13:46

memcached is a great option, and I see you mentioned this already as a possible option. Also Redis seems to be praised a lot as another option at this level.

On an application level, in terms of a more granular approach to cache on a file by file and/or module basis, local storage is always an option for common objects a user may request over and over again, even as simple as just dropping response objects into session so that can be reused vs making another http rest call and coding appropriately.

Now people go back and forth debating about varnish vs squid, and both seem to have their pros and cons, so I can't comment on which one is better but many people say Varnish with a tuned apache server is great for dynamic websites.

查看更多
乱世女痞
6楼-- · 2019-01-31 13:49

I’ve used redis successfully this way in my REST view:

from django.conf import settings
import hashlib
import json
from redis import StrictRedis
from django.utils.encoding import force_bytes

def get_redis():
    #get redis connection from RQ config in settings
    rc = settings.RQ_QUEUES['default']
    cache = StrictRedis(host=rc['HOST'], port=rc['PORT'], db=rc['DB'])
    return cache



class EventList(ListAPIView):
    queryset = Event.objects.all()
    serializer_class = EventSerializer
    renderer_classes = (JSONRenderer, )


    def get(self, request, format=None):
        if IsAdminUser not in self.permission_classes:  # dont cache requests from admins


            # make a key that represents the request results you want to cache
            #  your requirements may vary
            key = get_key_from_request()

            #  I find it useful to hash the key, when query parms are added
            #  I also preface event cache key with a string, so I can clear the cache
            #   when events are changed
            key = "todaysevents" + hashlib.md5(force_bytes(key)).hexdigest()        

            # I dont want any cache issues (such as not being able to connect to redis)
            #  to affect my end users, so I protect this section
            try:
                cache = get_redis()
                data = cache.get(key)
                if not data:
                    #  not cached, so perform standard REST functions for this view
                    queryset = self.filter_queryset(self.get_queryset())
                    serializer = self.get_serializer(queryset, many=True)
                    data = serializer.data

                    #  cache the data as a string
                    cache.set(key, json.dumps(data))

                    # manage the expiration of the cache 
                    expire = 60 * 60 * 2  
                    cache.expire(key, expire)
                else:
                    # this is the place where you save all the time
                    #  just return the cached data 
                    data = json.loads(data)

                return Response(data)
            except Exception as e:
                logger.exception("Error accessing event cache\n %s" % (e))

        # for Admins or exceptions, BAU
        return super(EventList, self).get(request, format)

in my Event model updates, I clear any event caches. This hardly ever is performed (only Admins create events, and not that often), so I always clear all event caches

class Event(models.Model):

...

    def clear_cache(self):
        try:
            cache = get_redis()
            eventkey = "todaysevents"
            for key in cache.scan_iter("%s*" % eventkey):
                cache.delete(key)
        except Exception as e:
            pass


    def save(self, *args, **kwargs):
        self.clear_cache()
        return super(Event, self).save(*args, **kwargs)
查看更多
登录 后发表回答