`
`Brian D. Davison
`Department of Computer Science
`Rutgers, The State University of New Jersey (USA)
http://www.cs.rutgers.edu/~davison/
`davison@cs.rutgers.edu
`
© IEEE. Reprinted from IEEE Internet Computing, Volume 5, Number 4, July/August
`2001, pages 38-45.
`
`This material is posted here with permission of the IEEE. Internal or personal use
`of this material is permitted. However, permission to reprint/republish this mate-
`rial for advertising or promotional purposes or for creating new collective works
`for resale or redistribution must be obtained from the IEEE by sending an email
`message to pubs-permissions@ieee.org.
`
`
`
`
A Web Caching Primer
`
`Scalable Internet Services
`
`Brian D. Davison
Rutgers, The State University of New Jersey
`
Now a significant part of the Web's infrastructure, Web resource caching can
reduce network latencies and bandwidth demands transparently.
`
When the Web was new, a single entity could (and did) list and index all of
the Web pages available, and searching was just an application of the Unix
egrep command over an index of 110,000 documents.1 Today, even though the
larger search engines index billions of documents, any one engine is likely
to see only a fraction of the content available.2 Moreover, with the
widespread commercialization of the Web, exceeding the "eight-second rule"
for downloading a Web page can mean a significant loss of revenue as many
users will move on to a new site if they are unsatisfied with the performance
of the current one.3 Finally, as increased Web use necessitates larger and
more expensive connections to the Internet, concern for efficient use of
those connections similarly increases.
`This article provides a primer on Web
`resource caching, one technology used to
`make the Web scalable. Web caching can
`reduce bandwidth usage, decrease user-
`perceived latencies, and reduce Web serv-
`er loads transparently. As a result,
caching has become a significant part of
the Web's infrastructure. Caching has
`
`even spawned a new industry: content
`delivery networks, which are also grow-
`ing at a fantastic rate.
Readers relatively familiar with
advanced Web caching topics such as the
`Internet Cache Protocol (ICP),4 invalida-
`tion, and interception proxies are not
`likely to learn much here. Instead, this
`article is designed for the general audi-
`ence of Web users. Rather than a how-to
`guide to caching technology deployment,
`it is a high-level argument for the value
`of Web caching to content consumers and
`producers. The article defines caching,
`explains how it applies to the Web, and
`describes when and why it is useful.
`Though I provide several topical refer-
`ences, readers interested in survey papers
`should look elsewhere (see the sidebar,
"Web Caching Resources" on page 43).
`
`Caching in Memory
`Systems
`Memory architectures use caching to
`improve computer performance.5 Because
`central processing units operate at very
high speeds while memory systems operate
at a slower rate, CPU designers provide
one or more levels of cache: a
`
`38
`
JULY • AUGUST 2001

http://computer.org/internet/

1089-7801/01/$10.00 © 2001 IEEE
`
`IEEE INTERNET COMPUTING
`
`
`
`
`small amount of memory that operates at, or close
to, the speed of the CPU. When the CPU finds the
information it needs in the cache, a hit, it doesn't
`have to slow down. When it fails to find the
`requested object in the cache, a miss, it must fetch
`the object directly and incur the associated per-
`formance cost.
Typically, when a cache miss occurs, the CPU places the fetched object in
the cache, assuming temporal locality: that a recently requested object is
more likely than others to be requested in the future. Memory systems also
typically retrieve multiple consecutive memory addresses and place them in
the cache in a single operation, assuming spatial locality: that nearby
objects are more likely to be requested during a certain time span.
`At some point the cache will become full and
`the system will use a replacement algorithm to
`make room for new objects, for example, first-
`in/first-out (FIFO), least recently used (LRU), or
`least frequently used (LFU). The goal is to optimize
`cache performance (for example, to maximize the
`likelihood of a cache hit for typical memory archi-
`tectures).
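
As an illustration of one replacement algorithm named above, the following is a minimal Python sketch of LRU replacement. The class name LRUCache, its capacity parameter, and the get and put methods are assumptions made for this example; memory caches implement replacement in hardware, so this only models the policy.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used object."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached object, or None on a miss."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        """Store an object, evicting the least recently used one if full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" becomes the most recently used entry
cache.put("c", 3)   # the cache is full, so "b" is evicted
```

Swapping the eviction line for a different choice of victim (the oldest insertion for FIFO, the lowest access count for LFU) would give the other policies listed.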
`
`Mechanics of a Web Request
`In its simplest form, the Web is a set of servers and
`clients (such as Web browsers, or any other soft-
`ware used to make a request of a Web server). To
`retrieve a particular Web resource, the client
`attempts to communicate over the Internet to the
origin Web server, as depicted in Figure 1. To connect
to the server, the client needs the host's
`numerical identifier. It queries the domain name
`system (DNS) to translate the hostname (for exam-
`ple, www.web-caching.com) to its Internet Proto-
`col (IP) address (209.182.1.122), with which it can
`establish a connection to the server and request
`the content. Once the Web server has received and
examined the client's request, it can generate and
`transmit the response. As Figure 2 shows, each
`step in this process takes time.
The hypertext transfer protocol (HTTP) specifies the interaction among Web
clients, servers, and intermediaries. Requests and responses are encoded as
headers that precede optional bodies containing content. Figure 3 shows one
set of request and response headers. The first request header shows the
method used (GET), the resource requested ("/"), and the version of HTTP
supported (1.1). Another commonly used method is POST, which allows clients
to send content with a request (for instance, to carry variables from an
HTML form). The first line of the response header shows the HTTP version
supported and a response code with standard values.

[Figure 1 diagram labels: Browser; Internet; Origin Web server.]
Figure 1. A simplistic view of the Web. At its most basic, the Web consists
of a set of servers and clients and the infrastructure that connects them.

[Figure 2 diagram: client/server timeline. Labels: open, establish
connection (1 round-trip time, RTT); HTTP request, generate response
(1 RTT + transmit request); HTTP response (transmit response); response
received.]
Figure 2. HTTP transfer timing costs with a new connection. The amount of
time to retrieve a resource when a new connection is required can be
approximated by two round-trip times plus the time to transmit the response
(plus DNS resolution delays, if necessary).
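
The approximation in the Figure 2 caption can be written out as a small Python function. This is only a sketch of that arithmetic: the function name, parameter names, and the sample numbers are assumptions for illustration, and the model ignores effects such as TCP sending data slowly at first.

```python
def retrieval_time(rtt, response_bytes, bandwidth_bytes_per_s,
                   dns_delay=0.0, new_connection=True):
    """Approximate seconds needed to retrieve one resource.

    A new connection costs one round trip to establish and one more for the
    request and the start of the response; a reused connection skips the first.
    """
    round_trips = 2 if new_connection else 1
    transmit = response_bytes / bandwidth_bytes_per_s
    return dns_delay + round_trips * rtt + transmit


# Hypothetical numbers: 100 ms RTT, 16,000-byte response, 8,000 bytes/s link.
new = retrieval_time(0.1, 16_000, 8_000)
reused = retrieval_time(0.1, 16_000, 8_000, new_connection=False)
```

With these made-up inputs the transmit time dominates; for small resources on a fast link the round trips dominate instead, which is why avoiding them matters.
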
The headers of an HTTP transaction also specify aspects relevant to an
object's cacheability. The relevant headers from the example in Figure 3
include Date, Last-Modified, ETag, Cache-Control, and Expires. For example,
in HTTP GET requests that include an If-Modified-Since header, Web servers
use the Last-Modified date on the current content to return the object only
if the object changed after the date of the cached copy. The origin server
needs an accurate clock to calculate and present modification and
expiration times in the other tags.

An ETag (entity tag) represents a signature for the object and allows for a
stronger test than If-Modified-Since. If the signature of the current object
at this URL matches the signature of the cached one, the objects are
considered equivalent. The Expires and Cache-Control: max-age headers
specify how long the object can be considered valid. For slowly or
never-changing resources, an explicit expiration date tells caches how long
they can keep the object (without requiring the cache to contact the origin
server to validate it).

    Request Header:
    GET / HTTP/1.1
    Host: www.web-caching.com
    Referer: http://vancouver-webpages.com/CacheNow/
    User-Agent: Mozilla/4.04 [en] (X11; I; SunOS 5.5.1 sun4u)
    Accept: */*
    Connection: close

    Response Header:
    HTTP/1.1 200 OK
    Date: Mon, 18 Dec 2000 21:25:23 GMT
    Server: Apache/1.3.12 (Unix) mod_perl/1.18 PHP/4.0B2
    Cache-Control: max-age=86400
    Expires: Tue, 19 Dec 2000 21:25:23 GMT
    Last-Modified: Mon, 18 Dec 2000 14:54:21 GMT
    ETag: "838b2-4147-3a3e251d"
    Accept-Ranges: bytes
    Content-Length: 16711
    Connection: close
    Content-Type: text/html

Figure 3. Sample HTTP request and response headers. Headers identify client
and server capabilities as well as describe the response content.
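
To make the roles of these headers concrete, here is a minimal Python sketch of how a cache might decide whether a stored response is still fresh and which validators it would send otherwise. The function names are assumptions for this example, the header values are those of the article's sample response, and If-None-Match is the standard HTTP/1.1 request header that carries an ETag (it is not named in the text above). A real cache follows many more rules than this.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return how long a response may be served without revalidation."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return expires - parsedate_to_datetime(headers["Date"])
    return timedelta(0)  # no explicit lifetime: treat as already stale


def is_fresh(headers, now):
    """True if the cached response is still within its freshness lifetime."""
    age = now - parsedate_to_datetime(headers["Date"])
    return age < freshness_lifetime(headers)


def validators(headers):
    """Build the conditional request headers used to revalidate a stale copy."""
    conditional = {}
    if "ETag" in headers:
        conditional["If-None-Match"] = headers["ETag"]
    if "Last-Modified" in headers:
        conditional["If-Modified-Since"] = headers["Last-Modified"]
    return conditional


cached = {
    "Date": "Mon, 18 Dec 2000 21:25:23 GMT",
    "Cache-Control": "max-age=86400",
    "Expires": "Tue, 19 Dec 2000 21:25:23 GMT",
    "Last-Modified": "Mon, 18 Dec 2000 14:54:21 GMT",
    "ETag": '"838b2-4147-3a3e251d"',
}
```

Here max-age is checked before Expires, which matches HTTP/1.1 precedence; in the sample response both give the same one-day lifetime.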
`
`Caching Web Resources
Web caching is similar to memory system caching:
a Web cache stores Web resources in anticipation
of future requests. However, significant differences
between memory system and Web
`caching result from the nonuniformity of Web
`object sizes, retrieval costs, and cacheability.
`To address object size, cache operators and
`designers track both the overall object hit rate (per-
`centage of requests served from cache) and the
`overall byte hit rate (percentage of bytes served
`from cache). Traditional replacement algorithms
often assume a fixed object size, so variable sizes
`can affect their performance. Retrieval cost varies
`with object size, distance traveled, network con-
`gestion, and server load. Finally, some Web
`resources cannot or should not be cached, for
`example, because the resource is personalized to a
`particular client or is constantly updated.
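
The difference between the two hit rates can be shown with a short Python sketch. The function name, the log format (pairs of size and hit flag), and the sample log are assumptions made for illustration.

```python
def hit_rates(log):
    """Compute (object hit rate, byte hit rate) from a request log.

    `log` is a list of (size_in_bytes, served_from_cache) pairs.
    """
    requests = len(log)
    total_bytes = sum(size for size, _ in log)
    hits = sum(1 for _, hit in log if hit)
    hit_bytes = sum(size for size, hit in log if hit)
    object_rate = hits / requests if requests else 0.0
    byte_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return object_rate, byte_rate


# Hypothetical log: three small hits and one large miss.
log = [(1_000, True), (2_000, True), (1_000, True), (96_000, False)]
object_rate, byte_rate = hit_rates(log)
```

In this made-up log three of four requests hit, yet only 4,000 of 100,000 bytes come from cache, so a cache can look good by one measure and poor by the other.
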
`Caching is performed in various locations
throughout the Web, including at the two endpoints
known to a typical user: the Web browser
and Web server. Figure 4 shows a possible
`
`chain of caches through which a request and
`response might flow. Proxy caches, intermediary
`caches between the client machine and the origin
`server, will generate new requests on behalf of
`users if they cannot satisfy the requests them-
`selves. If a response captured in a browser cache
`does not satisfy a user, the request might be
`passed to a department- or organization-wide
`proxy cache. If a valid response is not present
there, a proxy cache operated by the client's ISP
`might receive the request. If the ISP cache does
`not contain the requested response, it will likely
`attempt to contact the origin server. However,
`reverse proxy caches operated by the content
provider's ISP or CDN might instead respond to
`the request. If they do not have the requested
`information, the request might ultimately arrive
`at the origin server. Even at the origin server,
`content, or portions of content, can be stored in
`a server-side cache to reduce the server load (for
`instance, by reducing the need for redundant
`computations or database retrievals).
`The response flows through the reverse path
`back to the client. Each step in Figure 4 has mul-
`tiple arrows, signifying relationships with mul-
`tiple entities at each level. For example, the
reverse proxy (sometimes called an HTTP accelerator),
operated by the content provider's ISP or
`CDN, can serve as a proxy cache for the content
`from multiple origin servers, and can receive
`requests from multiple downstream clients
`(including forward caches operated by others, as
`shown in Figure 4).
`In general, a cache need not talk only to the
`clients below it and the server above it. In fact, to
`scale to large numbers of clients, multiple caches
`might be necessary. In a hierarchical caching
`structure, each cache serves many clients, which
`can be users or other caches.6 When a local cache
`cannot serve a request, it passes the request to a
`higher level in the hierarchy. If the request misses
`at a root cache (which has no parent), the cache
`requests the object from the origin server.
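
The hierarchical lookup just described can be sketched in Python as follows. The class name, the cache names, and the origin stand-in function are hypothetical; the sketch keeps every fetched object forever and ignores freshness, which a real cache would not.

```python
class HierarchicalCache:
    """A cache that forwards misses to its parent, or to the origin at the root."""

    def __init__(self, name, parent=None, fetch_from_origin=None):
        self.name = name
        self.parent = parent
        self.fetch_from_origin = fetch_from_origin  # used only by a root cache
        self.store = {}

    def get(self, url):
        """Return (content, name of the level that supplied it)."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            content, source = self.parent.get(url)
        else:
            content, source = self.fetch_from_origin(url), "origin"
        self.store[url] = content  # keep a copy for later requests
        return content, source


def origin(url):
    # Stand-in for a real HTTP fetch from the origin server.
    return "content of " + url


root = HierarchicalCache("root", fetch_from_origin=origin)
dept_a = HierarchicalCache("dept-a", parent=root)
dept_b = HierarchicalCache("dept-b", parent=root)

first = dept_a.get("/index.html")    # misses everywhere, fetched from origin
second = dept_b.get("/index.html")   # misses locally, hits at the root
third = dept_b.get("/index.html")    # now a local hit
```

The second request shows the benefit of the shared parent: one department's miss populates the root, so another department avoids the trip to the origin server.
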
`Figure 5 (see page 6) shows an alternative
`cooperative caching architecture in which caches
`communicate with peers using an intercache pro-
`tocol such as ICP. In this form, on a miss, a cache
`asks a predetermined set of peers whether they
`have the missing object. If they do, the cache
`routes the request to the first responding peer
`cache. Otherwise, the cache attempts to retrieve
`the object directly from the origin server. This
`approach can prevent storage of multiple copies
`and reduce origin server retrievals, but exacts a
`penalty in the form of increased intercache com-
`munication. In another variation, peers periodi-
`cally receive a summary of the contents of each
cache, which can significantly reduce communication
overhead.7
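
The basic query-the-peers behavior can be sketched in Python. Everything named here (PeerCache, has, queries_sent, the origin stand-in) is an assumption for illustration. The sketch asks peers one at a time in list order, whereas an intercache protocol such as ICP sends queries over the network and uses the first peer to respond; it is not an implementation of ICP.

```python
class PeerCache:
    """A cache that asks sibling caches before going to the origin server."""

    def __init__(self, name, fetch_from_origin):
        self.name = name
        self.fetch_from_origin = fetch_from_origin
        self.peers = []
        self.store = {}
        self.queries_sent = 0  # counts intercache messages

    def has(self, url):
        """Answer a peer's query: do we hold this object?"""
        return url in self.store

    def get(self, url):
        """Return (content, name of the source that supplied it)."""
        if url in self.store:
            return self.store[url], self.name
        for peer in self.peers:
            self.queries_sent += 1
            if peer.has(url):
                # Retrieved from the peer and not stored locally, which
                # avoids keeping multiple copies among cooperating caches.
                return peer.store[url], peer.name
        content = self.fetch_from_origin(url)
        self.store[url] = content
        return content, "origin"


def origin(url):
    # Stand-in for a real HTTP fetch from the origin server.
    return "content of " + url


a = PeerCache("a", origin)
b = PeerCache("b", origin)
a.peers, b.peers = [b], [a]
first = a.get("/logo.gif")   # no peer has it: fetched from origin, stored at a
second = b.get("/logo.gif")  # b asks a, which holds the object
```

The queries_sent counter makes the trade-off visible: every local miss costs one message per peer asked, which is the overhead that content summaries aim to reduce.
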
`Variations and combinations of these approach-
`es have been proposed. Current thinking limits
`cooperative caching to smaller client populations,
`but both cooperative caching and combinations
`are used in real-world sites, particularly where net-
`work access is expensive.
`A client may or may not know about interme-
`diate proxy caches. If a client is configured to
`use a proxy cache directly, it sends all requests
`not satisfied by its built-in cache to the proxy.
`Otherwise, the client has to look up the IP
`address of the origin host. If the content provider
`is using a CDN, the DNS servers may be cus-
`tomized to return the IP address of the server (or
proxy cache) closest to the client (where "closest"
likely reflects network distance and additional
information to avoid overloaded servers
`or networks). In this way, a reverse proxy server
`can operate as if it were the origin server, imme-
`diately answering any cached requests, and for-
`warding the rest.
`Even when the client has the IP address of the
`origin server it should contact, it might never
`reach it. Along the network path between the
`client and the server, there might be a network
`switch or router that directs all Web requests
`transparently to an interception proxy cache. This
`approach can be used at any of the proxy cache
`locations shown in Figure 4. In this scenario, the
`client believes it has contacted the origin server,
but instead the interception proxy serves the content
either from cache, or by first fetching it from
`the origin server.
`
Benefits of Web Caching
Web caching works because of popularity: the
`more popular a resource is, the more likely it is to
`be requested in the future. In one study spanning
`more than a month, out of all the objects request-
`ed by individual users, on average close to 60 per-
`cent of those objects were requested more than
`once by the same user.8 Likewise, much content is
`of value to more than one user. In fact, of the hits
`recorded in another caching study, up to 85 per-
`cent were the result of multiple users requesting
`the same object.9
`Three features of Web caching make it attrac-
`tive to all Web participants, including end users,
network managers, and content creators. Caching

• reduces network bandwidth usage, which can save money for both content
  consumers and creators;
• lessens user-perceived delays, which increases user-perceived value; and
• lightens loads on the origin servers, saving hardware and support costs
  for content providers and providing consumers a shorter response time for
  noncached resources.

[Figure 4 diagram, top to bottom: origin server plus server-side cache;
server's ISP or CDN reverse proxy cache; client's ISP forward proxy cache;
organization forward proxy cache; browser plus cache.]
Figure 4. Caches in the World Wide Web. Starting in a browser, a Web request
can travel through multiple caching systems on its way to the origin server.
At any point in the sequence a response can be served if the request matches
a valid response in the cache.
`
When a request is satisfied by a cache, the content no longer has to travel
across the Internet from the origin Web server to the cache, saving
bandwidth for the cache owner as well as the origin server. TCP, the network
protocol used by HTTP, has a fairly high overhead for connection
establishment and sends data slowly at first. This, combined with the fact
that most requests on the Web are for relatively small resources, means that
reducing the number of necessary connections and holding them open (making
them persistent) so that future requests can use them improves client
performance.

[Figure 5 diagram labels: Internet; forward proxy caches; browsers.]
Figure 5. Cooperative caching. Caches communicate with peers before making
requests over the Web.
`Specifically, a client of a forward proxy can
`save time because it can retain a persistent con-
`nection to the proxy instead of establishing new
`connections with each origin server a user visits
during a session. Persistent connections are particularly
beneficial to clients suffering from high-latency
network service (for example, clients connected
to the Internet via dial-up modems).
`Furthermore, busy proxies can use persistent
`connections to send requests for multiple clients
`to the same server, reducing connection establish-
`ment times to servers as well. Therefore, a proxy
`cache supporting persistent connections can cache
`connections on both client and server sides (avoid-
`ing the initial round-trip time for a new HTTP con-
`nection shown in Figure 2).
`
`Potential Problems
`There are a number of potential problems associ-
`ated with Web caching. Most significant from the
`perspective of both the content consumer and the
`content provider is the possibility of the end user
`seeing stale (that is, old or out-of-date) content,
`compared to fresh content available on the ori-
`gin server. HTTP does not ensure strong consis-
`tency and thus there is a real potential for data to
`
`be cached too long. The likelihood of this is a
`trade-off that the content provider can explicitly
`manage.
`Second, caching tends to improve the latency
`only for cached responses that are subsequently
`requested (that is, hits). Misses that are processed
`by a cache generally have decreased speed, as
`each system through which the transaction pass-
`es will increase the latency experienced by a small
`amount. Thus, a cache only benefits requests for
`content already stored in it. Caching is also lim-
`ited by the frequency with which popular Web
`resources change, and, importantly, the fact that
`many resources will be requested only once.
`Finally, some responses cannot or should not be
`cached.
`
`Content Cacheability
`Not every Web resource is cacheable. Of those that
`are, some can be cached for long periods by any
`cache, while others have restrictions such as
`caching for short periods or to certain kinds of
caches (for instance, nonproxy caches). This flexibility
maximizes the opportunity for caching individual
resources. The cacheability of a Web site
`affects both its user-perceived performance and
`the scalability of a particular hosting solution.
`Instead of taking seconds or minutes to load, a
`cached object can appear almost instantaneously.
`Regardless of how much the hosting costs, a
`cache-friendly design will allow a server to serve
`more pages before it needs to upgrade to a more
`expensive solution.
Fortunately, the content provider determines its resources' cacheability.
The Web server software sets and sends the HTTP headers that determine
cacheability, according to the server's caching policy for that data. To
maximize a Web site's cacheability, all static content (buttons, graphics,
audio and video files, and pages that rarely change) is typically given
expiration dates far in the future so that it can be cached for weeks or
months at a time. (Note that HTML meta tags cannot validly specify caching
properties and are ignored by most proxy caching products, since proxies do
not examine the contents of an object; that is, they do not see the HTML
source of a Web page.)
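
As a rough sketch of such a server-side caching policy, the Python function below picks caching headers for a resource. The policy table, the lifetimes, the function name, and the personalized flag are all hypothetical choices for this example; the private and no-cache directives are standard HTTP/1.1 Cache-Control values. An actual site would set these headers through its Web server's configuration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Hypothetical policy table: file extension -> freshness lifetime.
STATIC_LIFETIMES = {
    ".gif": timedelta(days=30),
    ".jpg": timedelta(days=30),
    ".css": timedelta(days=7),
    ".html": timedelta(days=1),
}


def caching_headers(path, now, personalized=False):
    """Return caching-related response headers for a resource path."""
    if personalized:
        # "private" allows browser caches but not shared proxy caches.
        return {"Cache-Control": "private, max-age=0"}
    for extension, lifetime in STATIC_LIFETIMES.items():
        if path.endswith(extension):
            return {
                "Cache-Control": "max-age=%d" % lifetime.total_seconds(),
                "Expires": format_datetime(now + lifetime, usegmt=True),
            }
    return {"Cache-Control": "no-cache"}  # unknown content: always revalidate


now = datetime(2000, 12, 18, 21, 25, 23, tzinfo=timezone.utc)
headers = caching_headers("/images/logo.gif", now)
```

Sending both Expires and Cache-Control: max-age mirrors the article's sample response, so caches that understand only one of the two still get a lifetime.
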
`By setting an expiration date far into the
`future, the content provider trades the potential
`of caching stale data for reduced bandwidth usage
`and improved user-perceived response time. A
`shorter expiration date reduces the chance that the
`user sees stale content, but increases the number
`of times that caches will need to validate the
`
`resource. Currently, most caches use a client
`polling approach that favors revalidation for
`objects that have recently changed.10 Because this
can generate significant overhead (especially for
`popular content), a better approach might be for
`the origin server to invalidate the cached
`objects,11 but only recently has this approach been
`considered for nonproprietary networks. The Web
`cache invalidation protocol (WCIP),12 currently in
`development, lets Web caches subscribe to inval-
`idation channels corresponding to content in
`which they are interested. This protocol is intend-
`ed to allow large numbers of frequently changing
`Web objects to be cached and distributed with
`freshness guarantees.
`Dynamically generated objects are typically
`considered uncacheable, although they are not
`necessarily so. Examples of dynamically generat-
`ed objects include fast-changing content like stock
`quotes, personalized pages, query results (such as
`from search engines), and e-commerce shopping
`carts. Rarely would it be desirable for any of these
`objects to be cached at an intermediate proxy,
`although some might be cached in the client
`browser cache (such as personalized resources that
don't change often).
`However, dynamically generated objects con-
`stitute an increasing fraction of the Web. One way
`to allow for the caching of dynamic content is to
`cache programs (such as applets) that generate or
`modify the content.13 Another is to enable the
`server to cache portions of documents to optimize
server-side operations (for example, server-side
products from SpiderCache, http://www.spidercache.com/;
Persistence, http://www.persistence.com/; and XCache,
http://www.xcache.com/). Since most dynamic pages include much
`static content, sending the differences between
`pages or between versions of a page,14,15 or break-
`ing documents into separately cacheable pieces
`and reassembling them at the client16 are possible
solutions. Akamai's EdgeSuite (http://www.akamai.com/)
and more generally the new open protocol
called Edge Side Includes (http://www.edge-delivery.org/)
are essentially examples of this last solution,
except that the Web pages are assembled at
`edge servers rather than at the client.
`
`Cache Latency
`Caching can provide only a limited benefit
`(object hit rates typically reach 40 percent to 50
`percent with sufficient traffic), as a cache can
`only provide objects that have been previously
`requested. If future requests can be anticipated,
`
objects can be obtained in advance. Once available in a local cache, those
objects can be retrieved with minimal delay, enhancing the user experience.

[Sidebar]
Web Caching Resources

Web Sites
IETF Working Group on Web Replication and Caching • http://www.wrec.org/
Information Resource Caching FAQ • http://www.ircache.net/FAQ/
Squid: Open Source Proxy Cache Software • http://www.squid-cache.org/
Standards work on Web Cache Invalidation Protocol (WCIP) •
http://www.content-signaling.org/
Web Caching and Content Delivery Resources (news, tutorials, tips, tools,
discussions, and links) • http://www.web-caching.com/
W3C HTTP Protocol Page • http://www.w3.org/Protocols/
W3C HTTP Specifications and Drafts • http://www.w3.org/Protocols/Specs.html
W3C's Propagation, Caching, and Replication on the Web •
http://www.w3.org/Propagation/

Survey Articles
G. Barish and K. Obraczka, "World Wide Web Caching: Trends and Techniques,"
IEEE Comm. Internet Technology Series, vol. 38, no. 5, May 2000,
pp. 178-184.
J.C. Mogul, "Squeezing More Bits out of HTTP Caches," IEEE Network, vol. 14,
no. 3, May/June 2000, pp. 6-14.
J. Wang, "A Survey of Web Caching Schemes for the Internet," ACM Computer
Comm. Rev., vol. 29, no. 5, Oct. 1999, pp. 36-46.

Books
B. Krishnamurthy and J. Rexford, Web Protocols and Practice: HTTP 1.1,
Networking Protocols, Caching, and Traffic Measurement, Addison Wesley
Longman, Reading, Mass., 2001.
D. Wessels, Web Caching, O'Reilly & Associates, Sebastopol, Calif., 2001.
[End of sidebar]
Although prefetching shows promise, it is difficult
to evaluate and has not been widely implemented
in commercial systems. Some browser
`add-ons and workgroup proxy caches will prefetch
`the links of the current page, or periodically
prefetch the pages in a user's bookmarks. One significant
difficulty is accurately predicting which
`resources will be needed next to minimize mis-
`takes that result in wasted bandwidth and
`increased server loads. Another concern is deter-
`mining what content can be prefetched safely, as
`some Web requests have potentially undesirable
`side effects, such as adding items to an online
`shopping cart.17
`Recently, Cohen and Kaplan proposed tech-
`niques that do everything but prefetch.18 In par-
`ticular, they demonstrated the usefulness of
`
• preresolving (performing DNS lookup in advance),
• preconnecting (opening a TCP connection in advance), and
• prewarming (sending a dummy HTTP request to the origin server).

These techniques can be implemented in both proxies and browsers, and could
significantly reduce latencies without prefetching content.

[Sidebar]
The Need for Web Caching

Caching helps bridge the performance gap between local activity and remote
content. In the short term, caching helps to improve Web performance by
reducing the cost and end-user latency for Web access. In the long term,
even as bandwidth costs continue to drop and higher end-user speeds become
available, caching will continue to reap benefits for the following reasons:

• Bandwidth will always have some cost. The cost of bandwidth will never
  reach zero, even though increased competition, a growing market, and
  economies of scale reduce end-user costs. The cost of bandwidth at the
  core has stayed relatively stable, requiring ISPs to implement methods
  such as caching to stay competitive and reduce core bandwidth usage so
  that edge bandwidth costs can be low.
• Nonuniform bandwidth and latencies will persist. Because of physical
  limitations such as environment and location as well as financial
  constraints, there will always be variations in bandwidth and latencies.
  Caching can help to smooth these effects.
• Network distances are increasing. Firewalls, other proxies for security
  and privacy, and virtual private networks for telecommuters increase the
  number of hops through which content must travel and slow Web response
  times.
• Bandwidth demands continue to increase. Growth in the user base, in
  popularity of high-bandwidth media, and in user expectations of faster
  performance guarantee that demand for bandwidth will not end.
• Hot spots in the Web will continue. Intelligent load balancing can
  alleviate problems when high user demand for a site is predictable, but a
  Web site's popularity can also come as a result of current events,
  desirable content, or word of mouth. Distributed Web caching can help
  alleviate these "hot spots" resulting from flash traffic loads.
• Communication costs exceed computational costs. Communication is likely to
  always be more expensive (to some extent) than computation. We use memory
  caches because CPUs are much faster than main memory. Likewise, we will
  continue to use caches as computer systems and network connectivity both
  get faster.
[End of sidebar]
`
`Conclusion
`Given the choices of caching products on the
`market today, how do you select one? The
`process involves determining the features (man-
`ageability, failure tolerance, scalability, and so
`on), and performance level (such as requests per
`
`second, bandwidth saved, and average latency,
`often measured by cache benchmarking services)
`you require. Likewise, as caches and content
`delivery services replace origin servers in serv-
`ing Web traffic, content developers should con-
`sider how to maximize the usefulness of these
`technologies.
`
`Acknowledgments
Thanks are due to Haym Hirsh, Vincenzo Liberatore, and anonymous reviewers
for comments that greatly improved this tutorial. This work has been
supported in part by the U.S. National Science Foundation under grant
ANI 9903052.
`
`References
`
1. O.A. McBryan, "GENVL and WWW: Tools for Taming the Web," Proc. First
   Int'l World Wide Web Conf., Elsevier, New York, 1994, pp. 79-90.
2. S. Lawrence and C.L. Giles, "Accessibility of Information on the Web,"
   Nature, vol. 400, July 1999, pp. 107-109;
   http://www.neci.nj.nec.com/homepages/lawrence/papers.html.
3. "The Economic Impacts of Unacceptable Web Site Download Speeds," white
   paper, Zona Research, 1999; available at
   http://www.zonaresearch.com/deliverables/white_papers/wp17/index.htm.
4. D. Wessels and K. Claffy, "Internet Cache Protocol (ICP)," version 2,
   Internet Eng. Task Force RFC 2186, Sept. 1997, available at
   http://ftp.isi.edu/in-notes/rfc2186.txt.
5. A.J. Smith, "Cache Memories," ACM Computing Surveys, vol. 14, no. 3,
   Sept. 1982, pp. 473-530.
6. A. Chankhunthod et al., "A Hierarchical Internet Object Cache," Proc.
   Usenix 1996 Ann. Technical Conf., Usenix Assoc., Berkeley,