Computing Practices
NCSA's World Wide Web Server: Design and Performance

Thomas T. Kwan and Robert E. McGrath
National Center for Supercomputing Applications

Daniel A. Reed
University of Illinois

The recent explosion of interest in the World Wide Web (WWW)1 can be traced to the distribution of the CERN (European Laboratory for Particle Physics in Geneva, Switzerland) and NCSA (National Center for Supercomputing Applications) servers and WWW client browsers. In particular, NCSA Mosaic, the graphical user interface for WWW browsing, based on distributed, multimedia hypertext, has spawned several commercial variants and has made the Internet readily accessible to a much larger population than in the past.

Network statistics from Merit, the NSFNet backbone management group, show that WWW traffic is the largest and by far the fastest growing segment of the Internet, and growing numbers of government and commercial groups are making hundreds of gigabytes of data available via WWW servers. At the same time, the WWW servers at NCSA have experienced explosive growth in traffic, from 1 million requests per week in February 1994, to 2 million per week in June 1994, 3 million per week in September 1994, nearly 4 million per week in December 1994, and even larger numbers in 1995.2

To support continued growth, WWW servers must manage a multigigabyte (in some instances a multiterabyte) database of multimedia information while concurrently serving multiple request streams. This places demands on the servers' underlying operating systems and file systems that lie far outside today's normal operating regime. Simply put, WWW servers must become more adaptive and intelligent. The first step on this path is understanding extant access patterns and responses. On the basis of this understanding, one can then develop more efficient and intelligent server and system file-caching and prefetching strategies.

In this article, we describe extant access patterns and responses at NCSA's WWW server and the implications of that data. But first, we describe the context in which the data was collected: the NCSA WWW server architecture.
Explosive Web traffic growth has placed burdens on Web servers that lie far outside today's normal operating regime. This article examines extant Web access patterns with the aim of developing more efficient file-caching and prefetching strategies.

Shortly after NCSA's WWW server was established, it became clear that the volume of WWW traffic would stress operating systems and network implementations in ways not originally envisioned by their designers. At peak times, the NCSA server receives 30-40 new WWW requests per second, and because the Hypertext Transfer Protocol (HTTP) is connectionless, each such request appears to the server as a separate network connection.

Not only were most implementations of the TCP/IP network protocol not designed to accept connections at this sustained rate, even conservative projections of request rate growth showed that no single-processor system could serve all requests. To support the growing request rate, NCSA has developed a scalable WWW architecture that consists of a group of loosely coupled WWW servers. Though the servers operate independently, collectively they provide the illusion of a single server.

Development of the NCSA architecture required resolution of three key issues:
1. Information addressing. Externally, the NCSA server has a single domain name. Incoming requests addressed to this domain name must be mapped to multiple servers, each with a separate, user-invisible domain name. This mapping allows NCSA to invisibly add servers to accommodate the growing number of incoming requests.
2. Information distribution. Each server must be capable of responding to requests for any portion of the NCSA WWW server database. Otherwise, the servers must be more tightly coupled, an arbiter must distribute requests to servers on the basis of request type, and it is likely that the arbiter will become the bottleneck.
3. Load balancing. The requests must be equally apportioned among the servers. Thus, newly added servers will always share the load and contribute to the scalability of the implementation.

The server architecture is based on three components: a collection of independent servers, a WWW document tree shared among the servers and stored by the Andrew distributed file system (AFS),3 and a round-robin domain name system (DNS) that multiplexes the domain name among the constituent servers.

With this architecture, the NCSA WWW service is always the same, although the number and identity of the particular servers may change from day to day. Beginning with one server in February 1994, the architecture grew to four servers in May, eight servers in November, and nine in early 1995. To meet increasing demand, NCSA will continue to add servers as needed.

Below, we briefly describe each of the three server components. For additional details on the server architecture, see Katz et al.4 and Kwan et al.5

The servers and the network

The NCSA WWW server architecture is flexible enough to accommodate most Unix systems as component servers. The only requirements are that the systems function as AFS clients and support TCP/IP. The servers need not be homogeneous; the particular systems in use vary from time to time and may be a heterogeneous collection of systems.

To date, the backbone of the NCSA WWW service has been a group of dedicated Hewlett-Packard HP 735 workstations. Though these systems are not generally considered "servers," their efficient TCP/IP implementation has made them an effective choice to process WWW requests. In the NCSA configuration, each HP 735 has 96 megabytes of memory and uses its local disk as a moderate-size (130 megabytes) AFS cache. In addition, the local disk stores HTTP server log files and is the backing store for the virtual memory system.

The WWW servers are connected to the AFS file servers via a 100-megabit/second Fiber Distributed Data Interface (FDDI) ring (see Figure 1). The FDDI ring connects to the rest of NCSA and to the Internet via a T3 line.

Figure 1. The NCSA scalable WWW server. (The diagram shows the AFS file servers and the WWW servers, which are AFS clients, joined by the FDDI ring.)

NCSA AFS configuration

All documents provided by the NCSA WWW service are served from the NCSA center-wide Andrew File System environment.6 This distributed file system is shared by many hundreds of client workstations and supercomputers, as well as the WWW servers.

AFS provides a single, consistent view of the file system to each WWW server, allowing each server to access the entire WWW document tree. Because AFS clients (that is, the WWW servers) cache recently used files on their local disks, the most frequently accessed documents are generally available locally, without remote disk access. In effect, AFS caching replicates the document tree on each WWW server.

Because AFS manages the shared document tree, the individual WWW servers need not and do not know either the number or identity of the other servers. It is difficult to overemphasize the importance of this point. This allows rapid, "plug-and-play" addition (and removal) of component servers and the use of heterogeneous systems. In practice, we have found that servers can be added or removed from the ensemble in under an hour.
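To make the idea of client-side caching of recently used files concrete, here is a minimal sketch in Python. It is an illustration only: the article does not describe the replacement policy AFS uses, so least-recently-used (LRU) eviction is an assumption, and the class name `LocalFileCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class LocalFileCache:
    """Byte-bounded cache that evicts the least recently used file first.

    Illustrative only: LRU is an assumed policy, not a description of AFS.
    """

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._files = OrderedDict()  # path -> size, least recently used first
        self.hits = 0
        self.misses = 0

    def access(self, path, size):
        """Record a request for `path`; return True if it was served locally."""
        if path in self._files:
            self._files.move_to_end(path)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1  # the file would be fetched from the shared file server
        if size <= self.capacity_bytes:
            # Evict old files until the new one fits.
            while self.used_bytes + size > self.capacity_bytes:
                _, evicted_size = self._files.popitem(last=False)
                self.used_bytes -= evicted_size
            self._files[path] = size
            self.used_bytes += size
        return False


# A cache sized like the 130-megabyte local disk cache mentioned above.
cache = LocalFileCache(130 * 1024 * 1024)
cache.access("/index.html", 4_096)  # first request: miss, fetched remotely
cache.access("/index.html", 4_096)  # repeat request: served from the local cache
```

With a policy like this, repeated requests for popular documents never leave the server, which is the effect the text describes as replicating the document tree on each WWW server.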
Browser: Software that allows users to view documents retrieved from the Internet.
Client: Software responsible for communicating with servers to retrieve necessary documents and files.
Firewall: A computer system and network interface that maintains security for an organization by filtering incoming networking requests.
HTTP: Hypertext Transfer Protocol (HTTP) is a stateless data-transfer protocol used by the WWW.
NCSA Mosaic: A freely available browser developed at NCSA. (NCSA Mosaic is a trademark of the Board of Trustees of the University of Illinois.)
Server: Software responsible for making local documents or files available to other software systems.
World Wide Web (WWW): A global information system providing hypertext-linked access to resources on the Internet. The WWW also incorporates existing network services, such as FTP and Gopher.
`November 1995
Our experience to date has been that AFS's local caching is critical to the success of the NCSA WWW server architecture. Stateless distributed file systems (for example, NFS) cannot exploit the locality inherent in the HTTP request stream by locally caching frequently requested items. Instead, they must repeatedly retrieve those items from a shared file server. Not only does this increase the load on the file server, it is inherently unscalable.

Despite the advantages of local caching, much research remains before we learn how distributed file systems in general, and AFS in particular, support large, less frequently accessed files (for example, 24-bit color images and digital video clips). With standard caching algorithms, access to these files will displace smaller, more frequently accessed files from the local cache. If these large, nontext files are not cached, their access latencies will be large. Data-type-specific caching algorithms are one potential solution.

WWW server performance visualization

To gain insights into the large volume of access and performance data in the WWW logs, we relied on a variety of standard statistical data analysis tools. However, to understand the dynamics of server behavior and the interactions of request patterns with the round-robin DNS system, we exploited the local availability of the CAVE, an immersive, unencumbered virtual environment, and our Avatar visualization software,1 to create dynamic displays of server behavior. Figures A and B show snapshots of this visualization from a "day in the life" of the NCSA WWW server.

In Figure A, the trajectories of four different servers in the performance metric space are denoted by the four colored ribbons. This snapshot, from near noon on September 7, 1994, shows that the round-robin DNS system effectively balances the server load: the trajectories of all the servers cluster in the same region of the performance metric space; the small variations are due to differing request patterns.

Figure B shows a global view of the origin of the requests. The height of the bar at each geographical location represents the number of bytes requested by that location; the different color segments represent the different data types. Figure B shows the activity at 6 p.m., local time, of a typical workday. Because of the time zone difference, most of the requests at this time are originating from the west coast of the United States. (The bar at the north pole represents sites that cannot be mapped to a specific geographical location.)

For details about the infrastructure of this visualization environment, see Reed et al.1 in this issue of Computer.

1. D.A. Reed et al., "Virtual Reality and Parallel Systems Performance Analysis," Computer, Vol. 28, No. 11, Nov. 1995, pp. 57-67.

Figure A. WWW server
Figure B. Origin of WWW requests at 6 p.m. local time.

Round-robin domain name system

The third and final component of the NCSA scalable WWW server is a modified network name resolver based on the Berkeley Internet Name Domain (BIND) code.7 The existing BIND 4.9.2 code has a round-robin option that can associate a single domain name with several IP addresses. In response to requests, these addresses are distributed using a simple rotation algorithm. Because this rotation conflicted with extant software at NCSA, the BIND software was modified to rotate only specific addresses, namely those of the WWW servers (see Katz et al.4 for details).

The modified domain name system (DNS) allows a domain name with more than one associated IP address to be specified as "round-robin." Each incoming request for the address of a round-robin domain name is satisfied by the next IP address on the list in a simple rotation. Thus, with N addresses, each one is returned for 1/Nth of the requests.
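The rotation described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the modified BIND code: the class `RoundRobinName` and its methods are hypothetical, and the addresses come from the 192.0.2.0/24 block reserved for documentation.

```python
from collections import Counter
from itertools import cycle


class RoundRobinName:
    """Hand out the addresses registered for one name in strict rotation."""

    def __init__(self, addresses):
        if not addresses:
            raise ValueError("at least one address is required")
        self._addresses = list(addresses)
        self._rotation = cycle(self._addresses)

    def resolve(self):
        """Return the next address on the list."""
        return next(self._rotation)

    def add_server(self, address):
        """Register another server; the rotation restarts over the longer list."""
        self._addresses.append(address)
        self._rotation = cycle(self._addresses)


# Hypothetical server addresses (documentation range), N = 4.
name = RoundRobinName(["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"])
tally = Counter(name.resolve() for _ in range(400))
# Every address is returned for exactly 1/4 of the 400 lookups.
```

Note that the rotation balances name lookups, not HTTP requests: a resolver that caches one answer and reuses it for many clients will send all of their requests to the same server.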
Because the DNS requests are spread across the N different IP addresses, NCSA can maintain a group of WWW servers aliased by the single domain name. Adding a new server to the group is as simple as adding its IP address to the DNS entry for that name.

If the NCSA WWW service has accomplished nothing else, it has produced copious amounts of performance and access-pattern data. This data is collected continuously on each server and is permanently archived each day to be available for researchers. Collectively, the files constitute more than 150 megabytes of data each weekday.8

On each of the component WWW servers, the data collected includes

• the standard access logs from the NCSA HTTP daemons (httpd),
• the standard error logs from the httpd daemons,
• a custom log of the client browser type (the "user agent") that initiated each request,
• a trace of virtual memory statistics, obtained by recording Unix vmstat data once each minute,
• a trace of packet counts, obtained by recording Unix data once each minute, and
• a count of active processes, sampled with ps once every 5 minutes.

To understand the access pattern and characteristics of NCSA's WWW service, we analyzed the data described above for selected weeks during five different months of 1994. Below, we present the qualitative results with respect to the general access trends, the domain characteristics, and the file type distribution (see Kwan et al.5 for details).

General trends

Qualitatively, WWW traffic growth on the Internet is well known. However, the specific characteristics of this growth and the sources of requests are much less well understood. Hence, the initial goal of our analysis was a simple characterization of WWW traffic in terms of request count, request data volume, and request sources (by hardware platform type).

TRAFFIC GROWTH. The number of requests received by the NCSA WWW servers during the period of our analysis grew from about 300,000 per day in May 1994 to about 500,000 per day in September. Thus, the compounded growth rate over the five-month period is roughly 14 percent per month. A scan of NCSA's January 1995 WWW server logs shows that the number of requests has grown to about 690,000 per day. As a result, the compounded growth rate is about 11 percent per month from May 1994 to January 1995. For the rest of 1995, however, the number of requests to the NCSA server has slowly decreased. (See File2 for the latest 1995 statistics.)

CLIENT PLATFORMS. Knowing the platform from which a request originated has great potential value. Information providers can customize documents for different platforms, and servers can exploit this knowledge by tailoring their response to the hardware and software capabilities of the requesting platform.

The user agent logs from the first 20 days of December 1994 show that 31 percent of all connections were from X Windows clients, [...] percent from Microsoft Windows clients, 20 percent from Macintoshes, and 21 percent from all other types of clients. This data shows that at least 58 percent of the requests originate from personal computers. As vendors continue to ship new and improved versions of WWW browsers for personal computers, we expect requests from personal computers to grow at a very rapid rate. However, because of the relatively low bandwidth (modem) connections from most personal computers to the Internet, it is becoming increasingly important for WWW servers to adapt to client needs (for example, by sending lower resolution images) and for clients to prefetch selected data to hide the long latency for data retrieval.

Domain characteristics

Much discussion has centered on the commercial potential of the World Wide Web and the increasing accessibility of commercial information. To assess the number and distribution of commercial and other requesting sites, we aggregated domain names into a small number of broad categories: educational, commercial, government, and other. Table 1 summarizes the fraction of requests from the major Internet domains.

Table 1. Server request origins by domain.

Internet domain      Percentage of requests
Education (edu)      26
Commercial (com)     18
Government (gov)     5

Figure 2. Weekly domain request statistics from May to September 1994. Each data point represents a Sunday-Saturday cycle analyzed during the month.

Although Table 1 shows that the edu domain generates more requests than any other single domain, Figure 2 shows that the number of requests from commercial domains is growing rapidly. (For each month, the figure shows seven data points, corresponding to Sunday through Saturday of the week we analyzed during that month.) This reflects the increasing commercial presence on the Internet.
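The aggregation of requesting hosts into broad domain categories can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the function names are hypothetical, the host names in the example are invented, and treating numeric addresses and unqualified names as "other" is a choice made here, not one the article documents.

```python
from collections import Counter

# Top-level domain -> broad category; everything else is counted as "other".
CATEGORIES = {"edu": "educational", "com": "commercial", "gov": "government"}


def domain_category(hostname):
    """Map a client host name to a broad category by its top-level domain."""
    labels = hostname.strip().lower().rstrip(".").split(".")
    top = labels[-1]
    if len(labels) < 2 or top.isdigit():
        return "other"  # unqualified names and unresolved numeric addresses
    return CATEGORIES.get(top, "other")


def category_shares(hostnames):
    """Return {category: percentage of requests} for one host name per request."""
    counts = Counter(domain_category(h) for h in hostnames)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Invented example: one entry per logged request.
sample = ["lab1.example.edu", "lab2.example.edu", "gw.example.com", "192.0.2.7"]
shares = category_shares(sample)
```

Run over a full day of access logs, a tally of this kind yields the per-domain percentages summarized in Table 1.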
Commercial Internet service providers and the growing use of the Internet by the staff of commercial organizations both contribute to this commercial presence.

Although the top 10 educational and government domains (which generate the largest number of requests to NCSA's server) change almost daily, the top 10 commercial domain names change little. Indeed, most of the top 10 commercial domain names on any given day were also among the top 10 domain names throughout the five months of data we analyzed.

The domain names in the com domain are mainly network firewalls for large organizations; they have long connection times and make an unusually large number of requests. Because a firewall acts as a central location for accessing data outside a given organization, it is the ideal location for implementing network caching and proxy servers, a topic to which we will return.

Media distributions

As we noted above, the request rate to the NCSA WWW server is growing at a compounded rate of between 11 and 14 percent per month. In addition to the rate, the characteristics of the growth have important implications for WWW server implementation. For example, satisfying large numbers of requests for small, text-based documents is much easier than responding to large numbers of requests for color images, video clips, or large data files.

Because the HTTPD server logs contain the name of the document being requested, and the file extension can be used to identify the document category, it is possible to determine the relative request frequency for text, images, audio, video, and data. The text category includes Hypertext Markup Language (HTML) documents, plain text, and PostScript files; the image category includes GIF, X bitmap (xbm), JPEG, and RGB files; the audio category includes au, aiff, and aifc files; and the video category includes MPEG and QuickTime files.

Figure 3 shows that text and images account for the majority of the requests. Although audio and video account for only 1 percent of the requests, they represent 28 percent of the bytes transferred. The requests for large audio and video files also lead to more bursty data transfer rates. Interestingly, the temporal distribution of the requests for audio and video is skewed toward later in the day than the distribution of those for text and images. We conjecture that users seek off-peak times to retrieve large items from the server.

Figure 3. File type statistics by rate (left) and volume (right). (Both panels plot text, images, audio, and video against time of day in hours.)

One should be chary about projecting access characteristics from this data. The NCSA WWW document tree is dominated by a large number of small objects. As document repositories mature, we expect them to contain a much larger number of large scientific and technical data sets, scientific visualizations and video clips, and audio segments. This shift will accentuate the behavior found in this study: Many of the requests will be for small data items, but an increasing fraction of the data volume will be associated with requests for large, nontext items.

To this point, our focus has been on the characteristics of the request stream. We turn now to an examination of the servers' "response" to the incoming request stream.

Effective, distributed file caching was one of the key design principles in NCSA's WWW server architecture. Local caching at the WWW servers reduces the load on the shared AFS file servers, minimizes file traffic on the FDDI ring, and allows the WWW servers to respond quickly to requests for frequently accessed documents. To measure the effectiveness of the current AFS caching protocols, we analyzed the WWW server logs to identify the characteristics of the most frequently requested documents.

As mentioned above, NCSA serves documents from the AFS distributed file system, which automatically caches the most recently used files in local AFS client caches. The left portion of Figure 4 shows the number of distinct files requested per day during the five months of our analysis; the right portion shows the total size of these same files.

Comparing the two figures shows that although the number of distinct files requested has increased, the total size of all the requested files has remained under 450 megabytes per day. Most of the newly added files have been small text and image files. To date, the AFS client cache hit ratios for the WWW servers have been near [...] percent.
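Two of the log analyses discussed in this part of the article, categorizing requests by file extension and profiling the distinct files requested, can be sketched in Python as below. This is a hedged illustration, not the authors' analysis scripts: the function names are hypothetical, the extension table is an assumption built from the file types named in the text, unrecognized extensions fall into "data" by assumption, and a file's size is approximated by the largest transfer seen for it.

```python
import os
from collections import defaultdict

# Assumed extension table, built from the file types named in the text.
CATEGORY_BY_EXTENSION = {
    ".html": "text", ".htm": "text", ".txt": "text", ".ps": "text",
    ".gif": "image", ".xbm": "image", ".jpeg": "image", ".jpg": "image",
    ".rgb": "image",
    ".au": "audio", ".aiff": "audio", ".aifc": "audio",
    ".mpeg": "video", ".mpg": "video", ".mov": "video", ".qt": "video",
}


def media_category(path):
    """Classify a requested path by extension; anything unrecognized is 'data'."""
    extension = os.path.splitext(path)[1].lower()
    return CATEGORY_BY_EXTENSION.get(extension, "data")


def request_and_byte_shares(records):
    """records: (path, bytes_sent) pairs -> {category: (% requests, % bytes)}."""
    requests = defaultdict(int)
    volume = defaultdict(int)
    for path, nbytes in records:
        category = media_category(path)
        requests[category] += 1
        volume[category] += nbytes
    total_requests = sum(requests.values())
    total_bytes = sum(volume.values())
    if total_requests == 0:
        return {}
    return {
        category: (
            100.0 * requests[category] / total_requests,
            100.0 * volume[category] / total_bytes if total_bytes else 0.0,
        )
        for category in requests
    }


def distinct_file_profile(records):
    """Return (number of distinct files, total size of those files)."""
    sizes = {}
    for path, nbytes in records:
        # Approximate a file's size by the largest transfer observed for it.
        sizes[path] = max(sizes.get(path, 0), nbytes)
    return len(sizes), sum(sizes.values())


# Invented records: (requested path, bytes sent).
log = [("/index.html", 2000), ("/logo.gif", 3000),
       ("/index.html", 2000), ("/demo.mpeg", 93000)]
shares = request_and_byte_shares(log)
profile = distinct_file_profile(log)
```

Even in this tiny invented sample, a single video request accounts for most of the bytes while being a minority of the requests, the same imbalance the article reports for audio and video.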
This suggests that AFS caching has worked quite well for the past access patterns.

Note that not only does the AFS file system cache frequently accessed files on the local disk of the WWW servers, but also the most frequently accessed of those files are cached in the primary memory of the WWW servers. With the observed access patterns to NCSA's WWW servers, less than 60 megabytes of primary memory cache space is needed to satisfy 95 percent of all incoming requests, which corresponds to roughly 800 distinct files. Though most requests are small, a small number of requests retrieve large items. For this reason, satisfying 95 percent of the requests represents only 80 percent of the total data volume.

Figure 4. Request profile: number of distinct files requested (left) and total size of all files requested (right). Each data point represents a Sunday-Saturday cycle analyzed during that month.

As the number of requests to NCSA's and other WWW servers continues to grow, the continued scalability of the server architecture, the efficiency of the HTTP protocol, and the effectiveness of caching strategies become increasingly critical research and implementation issues. Let's examine salient aspects of each issue.

Scalability and persistent state

Although round-robin DNS has allowed NCSA to add WWW servers without piercing the illusion that www.ncsa.uiuc.edu is a single host, the use of round-robin DNS is not an ideal solution to either the decoupling of logical WWW server names from the physical server identity or to request load balancing. With this approach, the distribution of WWW server addresses is divorced from the characteristics and load of the constituent servers.

While the round-robin mechanism equally distributes the IP addresses of the constituent servers, there is no mechanism to limit the number of times an address is used after it is distributed, or to guarantee that the client system will honor the advertised time to live (TTL). For instance, a local DNS service might distribute a single IP address to any number of clients in its domain.

Moreover, envisioned extensions to HTTP include long-lasting state (for example, the results of previous database searches) that must be retained by a WWW server. Supporting such extensions may be difficult for a multiserver architecture that relies on round-robin DNS. A second request may be sent to a different server than the one holding the result of the previous request. Unless the data is shared (for example, via AFS), obtaining the requisite information will require closer server cooperation, with associated overhead.

HTTP protocol extensions

The overriding trend from our data analysis is the continued growth in request rate. Currently, each request from the client uses a separate TCP connection, and the large number of short-lived TCP connections limits the performance of the server. This problem is exacerbated by the fact that a document may be composed of several pieces, each of which is fetched separately, with each fetch requiring a separate TCP/IP connection. Padmanabhan and Mogul9 have proposed opening a single TCP connection per HTML document to avoid unnecessary TCP overhead; preliminary experiments show that this reduces document retrieval latency. Spero10 has proposed a new protocol, HTTP-NG, which dramatically alters HTTP to reduce overhead, allow more parallelism, and efficiently support features such as authentication.

These and related protocol changes will reduce the latency to deliver data and transmit more data over each TCP/IP connection. They will make HTTP servers much more like FTP and other session-oriented services. This may well make much better use of the available network bandwidth and other server resources.

Distributed caching and prefetching

Beyond reducing the network protocol overhead, one can also aggressively cache and prefetch the data. At the moment, various browsers cache data on local client disks to improve performance. Pitkow and Recker11 have shown that caching based on recent rates of past access is an effective technique. However, to design and implement effective prefetching, one must first study and understand the extant access patterns.
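One simple way to study access patterns of this kind is to ask how many of the most popular files are needed to cover a given fraction of requests, and how much cache space and data volume those files represent. The Python sketch below is a hypothetical illustration of that calculation, not the method used in the study: the function name `coverage` is invented, the records are made up, and a file's size is approximated by the largest transfer seen for it.

```python
from collections import Counter


def coverage(records, fraction=0.95):
    """Profile the most-requested files that cover `fraction` of all requests.

    records: iterable of (path, bytes_sent) pairs, one per request.
    Returns (number of files, cache bytes needed, share of total data volume).
    """
    hits = Counter()
    volume = Counter()
    size = {}
    for path, nbytes in records:
        hits[path] += 1
        volume[path] += nbytes
        size[path] = max(size.get(path, 0), nbytes)  # approximate file size
    total_hits = sum(hits.values())
    total_volume = sum(volume.values())
    if total_hits == 0:
        return 0, 0, 0.0

    covered = files = cache_bytes = covered_volume = 0
    # Walk the files from most to least requested until the target is reached.
    for path, count in hits.most_common():
        if covered >= fraction * total_hits:
            break
        covered += count
        files += 1
        cache_bytes += size[path]
        covered_volume += volume[path]
    share = covered_volume / total_volume if total_volume else 0.0
    return files, cache_bytes, share


# Invented workload: many small, popular files and a few large, rare ones.
workload = ([("/a.html", 1000)] * 70 + [("/b.gif", 2000)] * 20 +
            [("/c.html", 500)] * 6 + [("/d.mpeg", 100000)] * 3 +
            [("/e.au", 50000)] * 1)
files, cache_bytes, volume_share = coverage(workload, 0.95)
```

In this invented workload three small files satisfy 96 percent of the requests yet carry under a quarter of the bytes, which mirrors the pattern reported above where covering 95 percent of requests captures only 80 percent of the data volume.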
Our data on extant access patterns suggests that partitioned caches are a promising alternative. However, prototype implementations and trace-driven simulations are needed to measure the performance benefits that might accrue from this approach.

We noted that the most prolific sites are all commercial gateways. Moreover, about 2 percent of the requests to the NCSA WWW servers are from hosts that make only one request. The most popular of these requests are to the "directory" pages, namely the NCSA Internet Starting Points, the Internet Resources Meta-Index, and the What's New pages. These pages are excellent candidates for replication and caching throughout the Internet, particularly at commercial gateways.

In the future, as audio and video clips play a larger role in conveying multimedia information, audio and video requests will significantly affect network traffic and caching strategies. As we have seen, even a small increase in the use of these data types will dramatically increase the amount of data to be read and transmitted, with a concomitant deleterious effect on the efficiency of server caching strategies.

[...] and analyzed the access patterns to the server in terms of the user request patterns and the responses of the server. The analysis shows that scalability, protocol efficiency, and effective caching strategies are the major issues for the next generation of WWW servers. In particular, we believe that to improve performance, both clients and servers must aggressively exploit caching and prefetching on the basis of knowledge of request patterns, data types, and hardware capabilities.

Our thanks to Eric Katz for providing us with the initial log analysis scripts, to Nancy Yeager, Michelle Butler, and Paul Zawada for providing crucial assistance in understanding the NCSA WWW server, and to Charlie Catlett, without whom this work would not have been possible. Finally, thanks to Will Scullin and Steve Lamm for developing the virtual reality software used to display dynamic server behavior.

Thomas Kwan is supported in part by the National Science Foundation and the Advanced Research Projects

References

3. M. Satyanarayanan, "Scalable, Secure, and Highly Available Distributed File Access," Computer, Vol. 23, No. 5, May 1990.
4. E.D. Katz, M. Butler, and R. McGrath, "A Scalable HTTP Server: The NCSA Prototype," Computer Networks and ISDN Systems, Vol. 27, 1994, pp. 155-164.
5. T.T. Kwan, R.E. McGrath, and D.A. Reed, "User Access Patterns to NCSA's World Wide Web Server," Tech. Report UIUCDCS-R-95-1934, Dept. Computer Science, Univ. of Illinois, Urbana-Champaign, Feb. 1995.
6. NCSA AFS Users Guide, National Center for Supercomputing Applications, Univ. of Illinois, Urbana-Champaign, 1994 (... AFS-Guide/AFSv2.100.html).
7. P. Albitz and C. Liu, DNS and BIND in a Nutshell, O'Reilly and Associates, Sebastopol, Calif., 1992.
8. R.E. McGrath, What We Do and Don't Know About the Load on the NCSA WWW Server, Sept. 1994 (http://www.ncsa.uiuc. ...).
9. V.N. Padmanabhan and J.C. Mogul, "Improving HTTP Latency," Proc. Second Int'l WWW Conf., 1994, pp. 995-1,005 (... mogul/HTTPLatency.html).
10. S. Spero, Progress on HTTP-NG, 1994 (... hypertext/WWW/Protocols/HTTP-NG/http-ng-status.html).
11. J.E. Pitkow and M.M. Recker, "A Simple Yet Robust Caching Algorithm Based on Dynamic Access Patterns," Proc. Second Int'l WWW Conf., 1994, pp. 1,039-1,046 (http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/DDay/pitkow/caching.html).

Thomas Kwan is a doctoral candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign and a graduate research assistant at the National Center for Supercomputing Applications. His research interests include parallel computing, gigabit applications, and World Wide Web technology. He received a BS degree in electrical engineering from the University of Washington and an MS degree in computer science from the University of Illinois at Urbana-Champaign. He is a member of IEEE, ACM, and Tau Beta Pi.

Robert McGrath is a research programmer at the National Center for Supercomputing Applications. His research centers on the architecture and performance of large-scale distributed systems. He is a coauthor of

