I have my web app in S3 and serve it through a CloudFront web distribution. I read the official documentation, but I am confused by a lot of the terminology.
My questions:
- I want to set the CloudFront cache to a maximum of 1 year (365 days). What do I have to do? (Do we have to set a header on each object in S3?)
- I came across the Cache-Control header and found that if CloudFront returns this header with a value, browsers capable of caching will cache the objects for that long. How do I set the Cache-Control header in CloudFront so that the objects are cached in the user's browser?
- Are there any tools to check the S3 and CloudFront deployment, namely the headers returned, so that cache headers are easy to debug?
Update, after @Udo's answer: this is a screenshot of my request and response headers.
CloudFront does not add the `Cache-Control` header. It passes it through to the browser if the origin server supplies it.

If you didn't set a `Cache-Control` header when you uploaded your objects to S3, then you will need to upload your objects again, or go into the S3 console and add the header to the objects, with a value of `max-age=31536000` if you want browsers to cache the object for up to a year.

If you configure CloudFront to "use origin cache headers," then CloudFront will use the `max-age` value from `Cache-Control` to determine how long the object can be cached at CloudFront, unless the `s-maxage` value is also there, in which case CloudFront will use that instead.

If you configure the min/max/default, CloudFront will use these values to determine how long objects can be cached:
- Minimum TTL: used when `Cache-Control: max-age` has a lower value.
- Maximum TTL: used when `Cache-Control: max-age` has a higher value.
- Default TTL: used when `Cache-Control` is not seen on the object. You should not need this, because you should have `Cache-Control` headers everywhere.

Important things to note about these settings:
Also important, CloudFront has two geographically organized layers of caches -- regional (inner) and edge (outer). The edge caches are more numerous and geographically distributed, but the regional caches have larger storage capacity. If you fetch an object through CloudFront, CloudFront will cache that object somewhere (either at one regional cache or one edge cache or at one of each), but the next request -- perhaps from a browser in a different geographic area -- may pass through an edge and a region through which the object has never been requested before. On the other hand, it might be requested through an edge that doesn't have it, but it will be fetched from the regional cache. Try to keep this in mind as you understand what it means to say that any given object at any given time cannot correctly be said to be either in the cache or not in the cache because there is no "the" cache. There are multiple caches around the world, many of which do not communicate with each other because that would make things slower, not faster. If your web site is popular in Australia but not in England, there may be copies of your objects cached in Asia Pacific cache locations but not in Western Europe cache locations. This behavior is all automatic, and is not something you configure, but you need to be aware that CloudFront doesn't have a single, monolithic cache. Objects are cached in places where they are being accessed.
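To make the interaction between `max-age`, `s-maxage`, and the min/max/default settings more concrete, here is a minimal Python sketch of the rules described earlier. It is a simplified model, not CloudFront's actual implementation: the function names `parse_directives` and `edge_ttl` are made up for illustration, the default values (0, 86400, and 31536000 seconds) are assumptions, and directives such as `no-cache` or the `Expires` header are ignored.

```python
ONE_YEAR = 365 * 24 * 60 * 60  # 31536000 seconds


def parse_directives(cache_control):
    """Extract max-age and s-maxage (in seconds) from a Cache-Control value."""
    found = {}
    for part in (cache_control or "").split(","):
        name, _, value = part.strip().partition("=")
        name = name.lower()
        value = value.strip()
        if name in ("max-age", "s-maxage") and value.isdigit():
            found[name] = int(value)
    return found


def edge_ttl(cache_control, min_ttl=0, default_ttl=86400, max_ttl=ONE_YEAR):
    """Simplified model of how long an edge may keep an object (hypothetical helper)."""
    directives = parse_directives(cache_control)
    if "s-maxage" in directives:
        # s-maxage takes precedence over max-age for shared caches.
        requested = directives["s-maxage"]
    elif "max-age" in directives:
        requested = directives["max-age"]
    else:
        # No usable Cache-Control on the object: fall back to the default.
        requested = default_ttl
    # Clamp to the configured minimum and maximum.
    return max(min_ttl, min(requested, max_ttl))


if __name__ == "__main__":
    print(edge_ttl("max-age=31536000"))           # one year, within the maximum
    print(edge_ttl("max-age=60, s-maxage=600"))   # s-maxage wins at the edge
    print(edge_ttl("max-age=10", min_ttl=300))    # raised to the minimum
    print(edge_ttl(None))                         # default applies
```

The browser is not affected by the min/max/default settings in this model: it only sees whatever `Cache-Control` value the origin supplied.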
Your eyeballs are the best tool. The response headers in the browser tell you what you need to know:
- `Age:` is how long (in seconds) CloudFront has had this object in its cache.
- `X-Cache: Hit from cloudfront` means CloudFront did not have to fetch the object from S3, because it was already cached.
- `X-Cache: Miss from cloudfront` means CloudFront did not have the object in its cache at the edge handling this request, and needed to fetch it from S3.

The command line utility `curl`, along with its `-v` option, is also useful for observing web headers.
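If you would rather script the check than read headers by eye, a small Python sketch along these lines could summarize the same information. The helper names `fetch_headers` and `describe_cache` are hypothetical, only the `Hit` and `Miss` values of `X-Cache` are recognized, and the URL in the usage note is a placeholder you would replace with your own distribution's domain.

```python
import urllib.request


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a plain dict."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return dict(resp.headers.items())


def describe_cache(headers):
    """Summarize the cache-related headers of a response (hypothetical helper)."""
    # Header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}
    x_cache = h.get("x-cache", "").lower()
    if x_cache.startswith("hit"):
        status = "hit"
    elif x_cache.startswith("miss"):
        status = "miss"
    else:
        status = "unknown"
    age = h.get("age")
    return {
        "status": status,
        "age_seconds": int(age) if age and age.isdigit() else None,
        "cache_control": h.get("cache-control"),
    }
```

For example, `describe_cache(fetch_headers("https://example.cloudfront.net/index.html"))` would report whether that edge served the object from its cache, how long it has held it, and which `Cache-Control` value the browser received.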