Web Caching

From 118wiki

Users prefer online experiences with low latency. There are several ways to reduce latency: use higher bandwidth, compress the content being transferred, or keep a copy of the content in case it is requested again. The last idea is called caching. In computer and system architecture, caching is a standard topic (related to the algorithmic technique of memoization); web caching specializes this topic.
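As a small illustration of the memoization idea mentioned above, the following sketch uses Python's standard functools.lru_cache to keep the result of a slow call for reuse. The function name fetch and the URL paths are hypothetical stand-ins, not part of any real web stack.

```python
from functools import lru_cache

calls = []  # records every time the slow path actually runs


@lru_cache(maxsize=None)
def fetch(url: str) -> str:
    """Hypothetical stand-in for a slow network fetch."""
    calls.append(url)
    return f"<content of {url}>"


fetch("/index.html")
fetch("/index.html")  # answered from the memo table, no second slow call
fetch("/about.html")

info = fetch.cache_info()
print(info.hits, info.misses, len(calls))
```

The repeated request for the same URL is counted as a hit, and the slow path runs only once per distinct URL.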

Most browsers use a local cache, in memory or on disk, to improve performance. The same idea can be applied to a local area network or a larger area by deploying a shared cache. Effective use of a cache depends on its replacement strategy (see cache algorithms), which can use heuristic rules such as least recently used (LRU), first in, first out (FIFO), least frequently used (LFU), freshness or expiration dates, and largest-file first (LFF). All of these are strategies for deciding which items in the cache can be discarded in favor of new content.
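To make one of these replacement strategies concrete, here is a minimal sketch of an LRU cache built on Python's collections.OrderedDict. The class name LRUCache and its methods are illustrative assumptions; the sketch counts entries rather than bytes, which a real web cache would not do.

```python
from collections import OrderedDict


class LRUCache:
    """Illustrative LRU cache; capacity is a number of entries, not bytes."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def __contains__(self, key) -> bool:
        # Membership test that does not change the recency order.
        return key in self._items

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        """Store a value, evicting the least recently used entry if full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


cache = LRUCache(capacity=2)
cache.put("a", "A")
cache.put("b", "B")
cache.get("a")       # "a" is now more recent than "b"
cache.put("c", "C")  # cache is full, so "b" is evicted
print("b" in cache, "a" in cache, "c" in cache)
```

Switching to FIFO would only require dropping the move_to_end calls, so that insertion order alone decides which entry is evicted.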

HTTP carries cache-control directives in the headers of both requests and responses. Directives in the headers of client requests include the following:

Cache-control: no-cache
Cache-control: no-store
Cache-control: max-age=..
Cache-control: no-transform
Cache-control: only-if-cached
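As a sketch of how a client might attach the request directives above, the following uses urllib.request.Request from the Python standard library to build (but not send) requests. The helper name build_request, the example.com URL, and the value max-age=0 are assumptions chosen for illustration only.

```python
from urllib.request import Request


def build_request(url: str, *directives: str) -> Request:
    """Build a request whose Cache-Control header lists the given directives."""
    return Request(url, headers={"Cache-Control": ", ".join(directives)})


# Ask caches along the path to revalidate with the origin server.
req = build_request("http://example.com/page.html", "no-cache")

# Several directives can be combined in one comma-separated header value.
combined = build_request(
    "http://example.com/page.html", "max-age=0", "no-transform"
)

# urllib normalizes stored header names to the form "Cache-control".
print(req.get_header("Cache-control"))
print(combined.get_header("Cache-control"))
```

No network traffic occurs here; passing either object to urllib.request.urlopen would send it.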

Directives in the headers of server responses include the following:

Cache-control: public
Cache-control: private
Cache-control: no-cache
Cache-control: no-store
Cache-control: no-transform
Cache-control: must-revalidate
Cache-control: proxy-revalidate
Cache-control: max-age=..
(and more)
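The sketch below shows one way a shared cache might interpret the response directives listed above. The function names are hypothetical, and the parser is deliberately simplified: it splits on commas and so does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def shared_cache_may_store(directives: Dict[str, Optional[str]]) -> bool:
    """A shared cache must not store no-store or private responses."""
    return "no-store" not in directives and "private" not in directives


def must_revalidate_before_use(directives: Dict[str, Optional[str]]) -> bool:
    """no-cache allows storing, but not reuse without checking the origin."""
    return "no-cache" in directives


def freshness_lifetime(directives: Dict[str, Optional[str]]) -> Optional[int]:
    """Return max-age in seconds, or None if absent or malformed."""
    arg = directives.get("max-age")
    return int(arg) if arg is not None and arg.isdigit() else None


public = parse_cache_control("public, max-age=3600")
private = parse_cache_control("private, max-age=60")
print(shared_cache_may_store(public), freshness_lifetime(public))
print(shared_cache_may_store(private))
```

A browser's local cache would apply a looser rule than shared_cache_may_store, since private responses may be kept by the single user's own cache.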

Cache control within a browser is a simpler task than cache control elsewhere, such as within an HTTP proxy. HTTP proxies can be found within organizations (often associated with a security firewall), within an ISP, within a Content Delivery Network, and even colocated with the site hosting the HTTP server. All of these proxies can host caches to improve performance.

The effectiveness of a cache can be measured by how much it reduces the round-trip latency of request-response exchanges and by its hit ratio (the fraction of requests served from the cache). Where these measurements cannot be obtained directly, we may be able to simulate how well a cache would work using measured traffic data. Such simulations also help to estimate the effect of different replacement strategies, the improvement from increasing the cache size, and answers to other questions.
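A trace-driven simulation of this kind can be sketched in a few lines. The example below replays a list of requested URLs through a simulated cache and reports the hit ratio for LRU and FIFO at different sizes. The function name hit_ratio and the tiny trace are made up for illustration, and every object is assumed to occupy one cache slot, which ignores real object sizes and expiration.

```python
from collections import OrderedDict


def hit_ratio(trace, capacity: int, policy: str = "lru") -> float:
    """Replay `trace` through a simulated cache and return hits / requests."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    if policy not in ("lru", "fifo"):
        raise ValueError("policy must be 'lru' or 'fifo'")
    cache = OrderedDict()  # eviction candidate is always the first key
    hits = 0
    for url in trace:
        if url in cache:
            hits += 1
            if policy == "lru":
                cache.move_to_end(url)  # refresh recency; FIFO leaves order alone
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)
            cache[url] = True
    return hits / len(trace) if trace else 0.0


# Hypothetical request trace; a real study would read this from a log.
trace = ["a", "b", "a", "c", "a", "d", "a", "b"]

for size in (2, 4):
    for policy in ("lru", "fifo"):
        print(size, policy, hit_ratio(trace, size, policy))
```

On this trace with two slots, LRU keeps the frequently requested "a" resident while FIFO evicts it, so LRU scores 0.375 against 0.25 for FIFO. With four slots every distinct URL fits and both policies reach 0.5.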
