Are "forever" and "never" the only useful caching duration?

In the presentation "Cache is King" by Steve Souders (at around 14:30), he argues that in practice there are only two cache durations you should use for your resources: "forever" and "never" (my own terminology, not his).

  • "Forever" means that you effectively make the resource permanently unchanged by setting the magic age to a very high age, such as one year. If you want to change the resource at some point, the presentation assumes that you are just posting the changed resource to a different URL. (It is assumed that this rename is necessary in part or in whole due to the large number of misconfigured proxies on the Internet.)
  • "Never" means that you effectively disable all forms of caching and require browsers to load the resource every time it is requested.

On the one hand, any performance recommendation from a chief performance engineer at Google carries weight on its own. On the other hand, HTTP caching seems to have been designed with variable cache durations for a reason (not just "forever" and "never"), and changing a resource's URL just because the resource changed seems contrary to the spirit of HTTP.

Are "forever" and "never" the only long lasting caches you should be using in practice? Is this contrary to other best practices on the internet?

In addition to the typical "user with browser" use case, I would also like to know how these principles apply to REST / hypermedia APIs.

+3




4 answers


Many people will disagree with limiting yourself to "forever" or "never" as you describe them.

First, it ignores the option of caching with revalidation. In this case, if the client (or proxy) has cached the resource, it sends a conditional HTTP request. If the client/proxy has the latest version of the resource, the server sends a short 304 Not Modified response rather than the entire resource. If the client's (or proxy's) copy is out of date, the server sends the entire resource.

With this scheme, the client always ends up with an up-to-date version of the resource, and if the resource does not change, a great deal of bandwidth is saved.
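
A sketch of that scheme using only the Python standard library (my own illustration; the resource body and the choice of hash are placeholders):

```python
# Revalidation with ETags: unchanged resources cost a 304, not a full body.
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

RESOURCE = b"current resource body"  # placeholder content

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        etag = '"%s"' % hashlib.sha1(RESOURCE).hexdigest()
        if self.headers.get("If-None-Match") == etag:
            # Client copy is current: short 304, no body re-sent.
            self.send_response(304)
            self.send_header("ETag", etag)
            self.end_headers()
        else:
            # Stale copy or first request: send the entire resource.
            self.send_response(200)
            self.send_header("ETag", etag)
            self.send_header("Content-Length", str(len(RESOURCE)))
            self.end_headers()
            self.wfile.write(RESOURCE)

HTTPServer(("", 8000), Handler).serve_forever()
```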



To save even more bandwidth, the client can be told to revalidate only when the resource is older than a certain period of time.

And if misbehaving proxies are a concern, the server can indicate that only clients, not proxies, may cache the resource.
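
Both ideas map directly onto standard Cache-Control directives. A one-line sketch in the style of the Flask example above (the one-hour figure is arbitrary):

```python
# Only the client (not shared proxies) may cache; the copy may be reused
# freely for an hour, after which it must be revalidated with the server.
resp.headers["Cache-Control"] = "private, max-age=3600, must-revalidate"
```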

I found that this document describes your caching options rather succinctly. This page is longer, but also provides excellent information.

+4




"It depends on what you're trying to accomplish and your branding proposal."

If all you want to achieve is bandwidth savings, you can do an overall cost-benefit analysis; the maintenance overhead may not be significant. Browsers are already fairly smart about optimizing image fetches, for example, so know your HTTP protocol. "Forever", combined with versioned URLs and URL rewrite rules, can be a good fit, as the Google engineer suggests.
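
The versioned-URL part can be as simple as embedding a content hash in the filename. A hypothetical helper (real build tools do this for you):

```python
# Hypothetical helper: derive a content-addressed URL so "forever"
# caching is safe; new content automatically yields a new URL.
import hashlib

def versioned_url(path: str, content: bytes) -> str:
    digest = hashlib.sha1(content).hexdigest()[:8]
    name, dot, ext = path.rpartition(".")
    return f"{name}.{digest}.{ext}" if dot else f"{path}.{digest}"

print(versioned_url("app.js", b"console.log('hi')"))  # e.g. app.xxxxxxxx.js
```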

Resource volatility is another factor. For example, daily stock tables can safely be cached for some time, but not forever.
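
For data on a known update schedule like that, a middle-ground TTL can be computed rather than guessed. A hypothetical sketch assuming a daily refresh at 06:00 UTC:

```python
# Hypothetical: cache until the next daily refresh instead of "forever".
from datetime import datetime, timedelta, timezone

def seconds_until_next_refresh(refresh_hour_utc: int = 6) -> int:
    now = datetime.now(timezone.utc)
    nxt = now.replace(hour=refresh_hour_utc, minute=0, second=0, microsecond=0)
    if nxt <= now:
        nxt += timedelta(days=1)
    return int((nxt - now).total_seconds())

headers = {"Cache-Control": f"public, max-age={seconds_until_next_refresh()}"}
```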



Is your computation expensive? Are your users sensitive to timeliness? Is the data live or fixed? For example, you could be serving airline schedules, hurricane tracks, option Greeks, or a BI report for the COO. You might want to cache all of these, but the TTL will most likely vary by class of data, all the way down to "never". "Forever" may not work for live data, but "never" may be the wrong answer too.

The degree of cooperation between server and client can be another factor. For example, in a business operations environment where procedures can be distributed and are expected to be followed, a longer TTL may be worth a second look.

HTH. I doubt there is one magic answer.

+2




Ideally, you should cache until the content changes; if you cannot clear or refresh the cache when the content changes, for whatever reason, then you need a duration. But really, if you can, cache forever or don't cache at all. There is no point in re-fetching unless you know something has changed.

0




If you know the underlying data will be static for some period of time, caching makes sense. We have a web service that serves data from a database populated by an overnight ETL job from an external source. Our RESTful web service only goes back to the database when the data changes. In our case, we know exactly when the data changes, and we invalidate the cache immediately after the ETL process completes.
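
A sketch of that invalidation step (hypothetical; it assumes a Redis-backed response cache, and the key names are made up). Bumping a version token that is embedded in every cache key invalidates everything at once, without enumerating entries:

```python
# Hypothetical post-ETL hook: bump a version key that every cache key
# embeds, so all previously cached responses become unreachable at once.
import redis  # assumes a Redis-backed cache

r = redis.Redis()

def on_etl_complete() -> None:
    r.incr("dataset:version")

def cache_key(resource_id: str) -> str:
    version = int(r.get("dataset:version") or 0)
    return f"resource:{version}:{resource_id}"
```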

0








