`
NCSA's World Wide Web Server: Design and Performance

Thomas T. Kwan and Robert E. McGrath
National Center for Supercomputing Applications

Daniel A. Reed
University of Illinois

Explosive Web traffic growth has placed burdens on Web servers that lie far outside today's normal operating regime. This article examines extant Web access patterns with the aim of developing more efficient file-caching and prefetching strategies.
`
The recent explosion of interest in the World Wide Web (WWW) can be traced to the distribution of the CERN (European Laboratory for Particle Physics in Geneva, Switzerland) and NCSA (National Center for Supercomputing Applications) servers and WWW client browsers. In particular, NCSA Mosaic, the graphical user interface for WWW browsing based on distributed multimedia hypertext, has spawned several commercial variants and has made the Internet readily accessible to a much larger population than in the past.
Network statistics from Merit, the NSFNet backbone management group, show that WWW traffic is the largest and by far the fastest growing segment of the Internet, and growing numbers of government and commercial groups are making hundreds of gigabytes of data available via WWW servers. At the same time, the WWW servers at NCSA have experienced explosive growth in traffic, from million requests per week in February 1994 to million per week in June 1994, 3 million per week in September 1994, nearly million per week in December 1994, and even larger numbers in 1995.2
To support continued growth, WWW servers must manage a multigigabyte (in some instances, multiterabyte) database of multimedia information while concurrently serving multiple request streams. This places demands on the servers' underlying operating systems and file systems that lie far outside today's normal operating regime. Simply put, WWW servers must become more adaptive and intelligent. The first step on this path is understanding extant access patterns and responses. On the basis of this understanding, one can then develop more efficient and intelligent server and system file-caching and prefetching strategies.
In this article, we describe extant access patterns and responses at NCSA's WWW server and the implications of that data. But first, we describe the context in which the data was collected: the NCSA WWW server architecture.
`
`NCSA WWW SERVER ARCHITECTURE
Shortly after NCSA's WWW server was established, it became clear that the volume of WWW traffic would stress operating systems and network implementations in ways not originally envisioned by their designers. At peak times, the NCSA server receives 30-40 new WWW requests per second, and because the Hypertext Transfer Protocol (HTTP) is connectionless, each such request appears to the server as a separate network connection.
`
Not only were most implementations of the TCP/IP network protocol not designed to accept connections at this sustained rate, even conservative projections of request rate growth showed that no single-processor system could serve all requests. To support the growing request rate, NCSA has developed a scalable WWW architecture that consists of a group of loosely coupled WWW servers. Though the servers operate independently, collectively they provide the illusion of a single server.

Development of the NCSA architecture required resolution of three key problems:
`
`Computer
`
0018-9162/95/$4.00 © 1995 IEEE
`
`Petitioner IBM – Ex. 1068, p. 1
`
`
`
• Information addressing. Externally, the NCSA server has a single domain name, www.ncsa.uiuc.edu. Incoming requests addressed to this domain name must be mapped to multiple servers, each with a separate, user-invisible domain name. This mapping allows NCSA to invisibly add servers to accommodate the growing number of incoming requests.
• Information distribution. Each server must be capable of responding to requests for any portion of the NCSA WWW server database. Otherwise, the servers must be more tightly coupled: an arbiter must distribute requests to servers on the basis of request type, and it is likely that the arbiter will become the bottleneck.
• Load balancing. The requests must be equally apportioned among the servers. Thus, newly added servers will always share the load and contribute to the scalability of the implementation.
`
The server architecture is based on three components: a collection of independent servers; a WWW document tree shared among the servers and stored by the Andrew distributed file system (AFS); and a round-robin domain name system (DNS) that multiplexes the domain name www.ncsa.uiuc.edu among the constituent servers.
With this architecture, the NCSA WWW service is always the same, although the number and identity of the particular servers may change from day to day. Beginning with one server in February 1994, the architecture grew to four servers in May, eight servers in November, and nine in early 1995. To meet increasing demand, NCSA will continue to add servers as needed.
Below, we briefly describe each of the three server components. For additional details on the server architecture, see Katz et al.4 and Kwan et al.5
`
`The servers and the network
The NCSA WWW server architecture is flexible enough to accommodate most Unix systems as component servers. The only requirements are that the systems function as AFS clients and support TCP/IP. The servers need not be homogeneous; the particular systems in use vary from time to time and may be a heterogeneous collection of systems.
To date, the backbone of the NCSA WWW service has been a group of dedicated Hewlett-Packard HP 735 workstations. Though these systems are not generally considered servers, their efficient TCP/IP implementation has made them an effective choice to process WWW requests. In the NCSA configuration, each HP 735 has 96 megabytes of memory and uses its local disk as a moderate-size (130 megabytes) AFS cache. In addition, the local disk stores HTTP server log files and is the backing store for the virtual memory system.
The WWW servers are connected to the AFS file servers via a 100-megabit/second Fiber Distributed Data Interface (FDDI) ring (see Figure 1). The FDDI ring connects to the rest of NCSA and to the Internet via a T3 line.
`
`NCSA AFS configuration
All documents provided by the NCSA WWW service are served from the NCSA center-wide Andrew File System environment.6 This distributed file system is shared by many hundreds of client workstations and supercomputers, as well as the WWW servers.
AFS provides a single, consistent view of the file system to each WWW server, allowing each server to access the entire WWW document tree. Because AFS clients (that is, the WWW servers) cache recently used files on their local disks, the most frequently accessed documents are generally available locally, without a remote disk access. In effect, AFS caching replicates the document tree on each WWW server.
Because AFS manages the shared document tree, the individual WWW servers need not (and do not) know either the number or identity of the other servers. It is difficult to overemphasize the importance of this point. This allows rapid, plug-and-play addition and removal of component servers and the use of heterogeneous systems. In practice, we have found that servers can be added or removed from the ensemble in under an hour.
`
Figure 1. The NCSA scalable WWW server.
`
Glossary
Browser: Software that allows users to view documents retrieved from the Internet.
Client: Software responsible for communicating with servers to retrieve necessary documents and files.
Firewall: A computer system and network interface that maintains security for an organization by filtering incoming networking requests.
HTTP: Hypertext Transfer Protocol (HTTP) is a stateless data-transfer protocol used by the WWW.
NCSA Mosaic: A freely available browser developed at NCSA. NCSA Mosaic is a trademark of the Board of Trustees of the University of Illinois.
Server: Software responsible for making local documents or files available to other software systems.
World Wide Web (WWW): A global information system providing hypertext-linked access to resources on the Internet. The WWW also incorporates existing network services such as FTP and Gopher.
`
November 1995
`
`
`
`
Our experience to date has been that AFS's local caching is critical to the success of the NCSA WWW server architecture. Stateless distributed file systems (for example, NFS) cannot exploit the locality inherent in the HTTP request stream by locally caching frequently requested items. Instead, they must repeatedly retrieve those items from a shared file server. Not only does this increase the load on the file server, it is inherently unscalable.
Despite the advantages of local caching, much research remains before we learn how distributed file systems in general, and AFS in particular, support large, less frequently accessed files (for example, 24-bit color images and digital video clips). With standard caching algorithms, access to these files will displace smaller, more frequently accessed files from the local cache. If these large nontext files are not cached, their access latencies will be large. Data-type-specific caching algorithms are one potential solution.
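To make the displacement problem concrete, the following Python sketch compares a single byte-bounded LRU cache with a cache split into one LRU partition per data type. It is only an illustration of one possible data-type-specific policy, not the AFS cache manager's algorithm; the class names, file names, sizes, and partition budgets are all hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-bounded LRU cache that tracks file sizes only (no file contents)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()  # path -> size, least recently used first

    def access(self, path, size):
        """Return True on a hit. On a miss, admit the file, evicting LRU entries."""
        if path in self.entries:
            self.entries.move_to_end(path)
            return True
        if size > self.capacity:
            return False  # can never fit, so it is served uncached
        while self.used + size > self.capacity:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
        self.entries[path] = size
        self.used += size
        return False


class TypePartitionedCache:
    """One independent LRU partition per data type (a hypothetical policy)."""

    def __init__(self, budgets):
        self.partitions = {kind: LRUCache(b) for kind, b in budgets.items()}

    def access(self, path, size, kind):
        return self.partitions[kind].access(path, size)


# Unified cache: one large video clip pushes out a small, popular page.
unified = LRUCache(100)
unified.access("index.html", 10)
unified.access("news.html", 10)
unified.access("clip.mpg", 90)
unified_hit = unified.access("index.html", 10)

# Partitioned cache: the clip competes only with other video files.
partitioned = TypePartitionedCache({"text": 50, "video": 100})
partitioned.access("index.html", 10, "text")
partitioned.access("news.html", 10, "text")
partitioned.access("clip.mpg", 90, "video")
partitioned_hit = partitioned.access("index.html", 10, "text")

print(unified_hit, partitioned_hit)  # prints: False True
```

The partitioned variant trades some total capacity flexibility for isolation: a burst of large files can no longer evict the small, frequently requested documents.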
`
`Round-robin domain name system
The third and final component of the NCSA scalable WWW server is a modified network name resolver based on the Berkeley Internet Name Domain (BIND) code.7 The existing BIND 4.9.2 code has a round-robin option that can associate a single domain name with several IP addresses. In response to requests, these addresses are distributed using a simple rotation algorithm. Because this rotation conflicted with extant software at NCSA, the BIND software was modified to rotate only specific addresses, namely those of the WWW servers (see Katz et al.4 for details).
The modified domain name system (DNS) allows a domain name with more than one associated IP address to be specified as round-robin. Each incoming request for the address of the round-robin domain name is satisfied by the next IP address on the list, in simple rotation. Thus, 1/Nth
`
`WWW server performance visualization
To gain insights into the large volume of access and performance data in the WWW logs, we relied on a variety of standard statistical data analysis tools. However, to understand the dynamics of server behavior and the interactions of request patterns with a round-robin DNS system, we exploited the local availability of the CAVE, an immersive, unencumbered virtual environment, and our Avatar visualization software1 to create dynamic displays of server behavior. The two figures in this sidebar show snapshots of this visualization from a day in the life of the NCSA WWW server.

In the first figure, the trajectories of four different servers in the performance metric space are denoted by the four colored ribbons. This snapshot, from near noon on September 7, 1994, shows that the round-robin DNS system effectively balances the server load: the trajectories of all the servers cluster in the same region of the performance metric space; the small variations are due to differing request patterns.
The second figure shows a global view of the origin of the requests. The height of the bar at each geographical location represents the number of bytes requested by that location; the different color segments represent the different data types. The figure shows the activity at p.m. local time of a typical workday. Because of the time zone difference, most of the requests at this time are originating from the west coast of the United States. The bar at the north pole represents sites that cannot be mapped to a specific geographical location. For details about the infrastructure of this visualization environment, see Reed et al.1 in this issue of Computer.
`
Reference
1. D.A. Reed et al., "Virtual Reality and Parallel Systems Performance Analysis," Computer, Vol. 28, No. 11, Nov. 1995, pp. 57-67.
`
Figure: WWW server visualization.

Figure: Origin of WWW requests at p.m. local time.
`
`
`
`
of the DNS requests get each of the N different IP addresses. This allows NCSA to maintain a group of WWW servers aliased by the single domain name www.ncsa.uiuc.edu. Adding a new server to the group is as simple as adding its IP address to the DNS entry for www.ncsa.uiuc.edu.
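The rotation described above can be sketched in a few lines of Python. This is a toy model of the idea, not the modified BIND code: the `RoundRobinResolver` class is hypothetical, and the addresses come from the 192.0.2.0/24 documentation range rather than NCSA's real servers.

```python
from collections import Counter, deque


class RoundRobinResolver:
    """Toy resolver that answers lookups for one name in simple rotation."""

    def __init__(self, name, addresses):
        self.name = name
        self.addresses = deque(addresses)

    def resolve(self, name):
        if name != self.name:
            raise KeyError(name)
        address = self.addresses[0]
        self.addresses.rotate(-1)  # move the address just served to the back
        return address

    def add_server(self, address):
        """Adding a server is just adding its address to the rotation."""
        self.addresses.append(address)


# Hypothetical addresses standing in for four component servers.
resolver = RoundRobinResolver(
    "www.ncsa.uiuc.edu",
    ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"],
)
lookups = Counter(resolver.resolve("www.ncsa.uiuc.edu") for _ in range(400))
print(lookups)  # each of the four addresses is returned for 1/4 of the lookups

resolver.add_server("192.0.2.5")
after_add = Counter(resolver.resolve("www.ncsa.uiuc.edu") for _ in range(500))
print(after_add)  # the new server immediately receives an equal share
```

Note that the model balances name lookups, not HTTP requests; how many requests follow each lookup depends on client and local DNS behavior.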
`
`DATA COLLECTION
If the NCSA WWW service has accomplished nothing else, it has produced copious amounts of performance- and access-pattern data. This data is collected continuously on each server and is permanently archived each day to be available for researchers. Collectively, the files constitute more than 150 megabytes of data each weekday.8
On each of the component WWW servers, the data collected includes

• the standard access logs from the NCSA HTTP daemons (httpd),
• the standard error logs from the httpd daemons,
• a custom log of the client browser type (the user agent) that initiated each request,
• a trace of virtual memory statistics, obtained by recording Unix vmstat data once each minute,
• a trace of packet counts, obtained by recording Unix netstat data once each minute, and
• a count of active processes, sampled with ps once every minutes.
`
REQUEST PATTERN ANALYSIS
To understand the access pattern and characteristics of NCSA's WWW service, we analyzed the data described above for selected weeks during five different months of 1994. Below, we present the qualitative results with respect to the general access trends, the domain characteristics, and the file type distribution (see Kwan et al.5 for details).
`
`General trends
Qualitatively, WWW traffic growth on the Internet is well known. However, the specific characteristics of this growth and the sources of requests are much less well understood. Hence, the initial goal of our analysis was a simple characterization of WWW traffic in terms of request count, request data volume, and request sources by hardware platform type.
`
TRAFFIC GROWTH. The number of requests received by the NCSA WWW servers during the period of our analysis grew from about 300,000 per day in May 1994 to about 500,000 per day in September. Thus, the compounded growth rate over the five-month period is roughly 14 percent per month. A scan of NCSA's January 1995 WWW server logs shows that the number of requests has increased to about 690,000 per day. As a result, the compounded growth rate is about 11 percent per month from May 1994 to January 1995. For the rest of 1995, however, the number of requests to the NCSA server has slowly decreased. See File2 for the latest 1995 statistics.
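The two compounded rates can be checked with a short calculation. The sketch below assumes that growth is compounded over the number of month-to-month steps between the endpoints (four from May to September 1994, eight from May 1994 to January 1995); that interval count is an assumption, chosen because it reproduces the quoted figures. The function name is illustrative.

```python
def monthly_growth_rate(start_per_day, end_per_day, monthly_steps):
    """Compounded monthly growth rate between two daily request counts."""
    return (end_per_day / start_per_day) ** (1.0 / monthly_steps) - 1.0


# May 1994 -> September 1994: 300,000 to 500,000 requests per day, 4 steps.
may_to_sept = monthly_growth_rate(300_000, 500_000, 4)

# May 1994 -> January 1995: 300,000 to 690,000 requests per day, 8 steps.
may_to_jan = monthly_growth_rate(300_000, 690_000, 8)

print(f"May-Sept 1994:     {may_to_sept:.1%} per month")
print(f"May 1994-Jan 1995: {may_to_jan:.1%} per month")
```

Under these assumptions the results round to the 14 percent and 11 percent quoted above.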
`
CLIENT PLATFORMS. Knowing the platform from which a request originated has great potential value. Information providers can customize documents for different platforms, and servers can exploit this knowledge by tailoring their response to the hardware and software capabilities of the requesting platform. The user agent logs from the first 20 days of December 1994 show that 31 percent of all connections were from X Windows clients, 38 percent from Microsoft Windows clients, 20 percent from Macintoshes, and 21 percent from all other types of clients. This data shows that at least 58 percent of the requests originate from personal computers. As vendors continue to ship new and improved versions of WWW browsers for personal computers, we expect requests from personal computers to grow at a very rapid rate. However, because of the relatively low-bandwidth modem connections from most personal computers to the Internet, it is becoming increasingly important for WWW servers to adapt to client needs (for example, by sending lower resolution images) and for clients to prefetch selected data to hide the long latency for data retrieval.

Domain characteristics
Much discussion has centered on the commercial potential of the World Wide Web and the increasing accessibility of commercial information. To assess the number and distribution of commercial and other requesting sites, we aggregated domain names into a small number of broad categories: educational, commercial, government, and other. Table 1 summarizes the fraction of requests from the major Internet domains.

Table 1. Server request origins by domain.

Internet domain        Percentage of requests
Education (edu)        26
Commercial (com)       18
Government (gov)
Others                 51

Although Table 1 shows that the edu domain generates more requests than any other single domain, Figure 2 shows that the number of requests from commercial domains is growing rapidly. For each month, the figure shows seven data points, corresponding to Sunday through Saturday of the week we analyzed during that month. This reflects the increasing presence of commercial Internet service providers and the growing use of the Internet by the staff of commercial organizations.

Figure 2. Weekly domain request statistics from May to September 1994. Each data point represents a Sunday-Saturday cycle analyzed during the month.

Although the top 10 educational and government domains, which generate the largest number of requests to NCSA's server, change almost daily, the top 10 commercial domain names change little. Indeed, most of the top 10 commercial domain names on any given day were also among the top 10 domain names throughout the five months of data we analyzed.

The domain names in the com domain are mainly network firewalls for large organizations; they have long connection times and make an unusually large number of requests. Because a firewall acts as a central location for accessing data outside a given organization, it is the ideal location for implementing network caching and proxy servers, a topic to which we will return.
`
`Media distributions
As we noted above, the request rate to the NCSA WWW server is growing at a compounded rate of between 11 and 14 percent per month. In addition to the rate, the characteristics of the growth have important implications for WWW server implementation. For example, satisfying large numbers of requests for small text-based documents is much easier than responding to large numbers of requests for color images, video clips, or large data files.
Because the HTTPD server logs contain the name of the document being requested, and the file extension can be used to identify the document category, it is possible to determine the relative request frequency for text, images, audio, video, and data. The text category includes Hypertext Markup Language (HTML) documents, plain text, and postscript files; the image category includes GIF, X bitmap (xbm), JPEG, and RGB files; the audio category includes au, aiff, and aifc files; and the video category includes MPEG and QuickTime files.
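A minimal Python sketch of this extension-based classification follows. The category membership is taken from the paragraph above, but the exact extension spellings (for example `jpg`, `mpg`, `mov`, `qt`), the function names, and the sample requests are assumptions made for illustration; it is not the analysis script used for the study.

```python
import posixpath
from collections import defaultdict

# Extension -> category. Spellings beyond those named in the text are guesses.
CATEGORY_BY_EXTENSION = {
    "html": "text", "txt": "text", "ps": "text",
    "gif": "image", "xbm": "image", "jpeg": "image", "jpg": "image", "rgb": "image",
    "au": "audio", "aiff": "audio", "aifc": "audio",
    "mpeg": "video", "mpg": "video", "mov": "video", "qt": "video",
}


def categorize(path):
    """Map a requested document path to a media category by file extension."""
    extension = posixpath.splitext(path)[1].lower().lstrip(".")
    # Anything unrecognized, including paths with no extension, counts as data.
    return CATEGORY_BY_EXTENSION.get(extension, "data")


def tally(requests):
    """Sum request counts and bytes per category from (path, bytes) pairs."""
    counts = defaultdict(int)
    volume = defaultdict(int)
    for path, nbytes in requests:
        category = categorize(path)
        counts[category] += 1
        volume[category] += nbytes
    return dict(counts), dict(volume)


# Hypothetical (path, bytes transferred) pairs, as might be parsed from a log.
sample = [
    ("/a.html", 2000),
    ("/logo.gif", 5000),
    ("/talk.au", 300000),
    ("/b.html", 1000),
]
counts, volume = tally(sample)
print(counts)
print(volume)
```

Keeping counts and byte volume separate matters because, as the next paragraph shows, a category can be rare by request count yet large by volume.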
Figure 3 shows that text and images account for the majority of the requests. Although audio and video account for only percent of the requests, they represent 28 percent of the bytes transferred. The requests for large audio and video files also lead to more bursty data transfer rates. Interestingly, the temporal distribution of the requests for audio and video is skewed toward later in the day than the distribution of those for text and images. We conjecture that users seek off-peak times to retrieve large items from the server.
One should be chary about projecting access characteristics from this data. The NCSA WWW document tree is dominated by a large number of small objects. As WWW document repositories mature, we expect them to contain a much larger number of large scientific and technical data sets, scientific visualizations and video clips, and audio segments. This shift will accentuate the behavior found in this study: Many of the requests will be for small data items, but an increasing fraction of the data volume will be associated with requests for large nontext items.
`
`SERVER CACHING
To this point, our focus has been on the characteristics of the request stream. We turn now to an examination of the server's response to the incoming request stream.

Effective distributed file caching was one of the key design principles in NCSA's WWW server architecture. Local caching at the WWW servers reduces the load on the shared AFS file servers, minimizes file traffic on the FDDI ring, and allows the WWW servers to respond quickly to requests for frequently accessed documents. To measure the effectiveness of the current AFS caching protocols, we analyzed the WWW server logs to identify the characteristics of the most frequently requested documents.
As mentioned above, NCSA serves documents from the AFS distributed file system, which automatically caches the most recently used files in local AFS client caches. The left portion of Figure 4 shows the number of distinct files requested per day during the five months of our analysis; the right portion shows the total size of these same files.
Comparing the two figures shows that although the number of distinct files requested has increased, the total size of all the requested files has remained under 450 megabytes per day. Most of the newly added files have been small text and image files. To date, the AFS client cache hit ratios for the WWW servers have been near 90 percent, suggesting that AFS caching has worked quite well for the past access patterns.

Figure 3. File type statistics by rate (left) and volume (right). Both panels plot the text, image, audio, and video categories against time of day (hours).
Note that not only does the AFS file system cache frequently accessed files on the local disk of the WWW servers, but also the most frequently accessed of those files are cached in the primary memory of the WWW servers. With the observed access patterns to NCSA's WWW servers, less than 60 megabytes of primary memory cache space is needed to satisfy 95 percent of all incoming requests, which corresponds to roughly 800 distinct files. Though most requests are small, a small number of requests retrieve large items. For this reason, satisfying 95 percent of the requests represents only 80 percent of the total data volume.
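The kind of calculation behind these figures can be sketched as follows: rank files by request count, add them to a hypothetical cache until the target fraction of requests is covered, and report the cache size and the share of data volume served. This is an assumed reconstruction of the method for illustration only; the function name and the synthetic per-file statistics are not from the study, and the output does not reproduce the 60-megabyte or 800-file results.

```python
def cache_for_coverage(files, target):
    """Size the cache needed to serve a target fraction of requests.

    files: list of (request_count, size_bytes), one entry per distinct file.
    Returns (files_cached, cache_bytes, fraction_of_data_volume_served).
    """
    total_requests = sum(count for count, _ in files)
    total_volume = sum(count * size for count, size in files)
    # Cache the most frequently requested files first.
    ranked = sorted(files, key=lambda entry: entry[0], reverse=True)

    covered = cache_bytes = served_volume = 0
    files_cached = 0
    for count, size in ranked:
        if covered >= target * total_requests:
            break
        covered += count
        cache_bytes += size
        served_volume += count * size
        files_cached += 1
    volume_fraction = served_volume / total_volume if total_volume else 0.0
    return files_cached, cache_bytes, volume_fraction


# Synthetic workload: many requests for small files, a few for one large file.
synthetic = [(900, 1_000), (60, 2_000), (30, 5_000), (10, 1_000_000)]
n_files, n_bytes, volume_share = cache_for_coverage(synthetic, 0.95)
print(n_files, n_bytes, f"{volume_share:.1%}")
```

In the synthetic workload, as in the measured one, covering most requests takes little cache space but leaves a disproportionate share of the data volume uncached, because the rarely requested items are the large ones.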
`
`IMPLICATIONS
As the number of requests to NCSA's and other WWW servers continues to grow, the continued scalability of the server architecture, the efficiency of the HTTP protocol, and the effectiveness of caching strategies become increasingly critical research and implementation issues. Let's examine salient aspects of each issue.
`
`Scalability and persistent state
Although round-robin DNS has allowed NCSA to add WWW servers without piercing the illusion that www.ncsa.uiuc.edu is a single host, the use of round-robin DNS is not an ideal solution to either the decoupling of logical WWW server names from the physical server identity or to request load balancing. With this approach, the distribution of WWW server addresses is divorced from the characteristics and load of the constituent servers.
While the round-robin mechanism equally distributes the IP addresses of the constituent servers, there is no mechanism to limit the number of times an address is used after it is distributed or to guarantee that the client system will honor the advertised time to live (TTL). For instance, a local DNS service might distribute a single IP address to any number of clients in its domain.

Moreover, envisioned extensions to HTTP include long-lasting state (for example, the results of previous database searches) that must be retained by a WWW server. Supporting such extensions may be difficult for a multiserver architecture that relies on round-robin DNS: a second request may be sent to a different server than the one holding the result of the previous request. Unless the data is shared (for example, via AFS), obtaining the requisite information will require closer server cooperation, with associated overhead.
`
`HTTP protocol extensions
The overriding trend from our data analysis is the continued growth in request rate. Currently, each request from the client uses a separate TCP connection, and the large number of short-lived TCP connections limits the performance of the server. This problem is exacerbated by the fact that a document may be composed of several pieces, each of which is fetched separately, with each fetch requiring a separate TCP/IP connection. Padmanabhan and Mogul9 have proposed opening a single TCP connection per HTML document to avoid unnecessary TCP overhead; preliminary experiments show that this reduces document retrieval latency. Spero has proposed a new protocol, HTTP-NG, which dramatically alters HTTP to reduce overhead, allow more parallelism, and efficiently support features such as authentication.

These and related protocol changes will reduce the latency to deliver data and transmit more data over each TCP/IP connection. It will make HTTP servers much more like FTP and other session-oriented services. This may well make much better use of the available network bandwidth and other server resources.
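A back-of-the-envelope model suggests why reusing one connection helps. The Python sketch below counts network round trips only, under simplifying assumptions that are not from the article: opening a TCP connection costs one round trip before the request can be sent, each request and response costs one more, fetches are strictly sequential, and transmission time, TCP slow start, and server processing are ignored. The function name and the example document are hypothetical.

```python
def fetch_round_trips(inline_objects, persistent):
    """Round trips to fetch one HTML page plus its inline objects.

    Assumed model: 1 round trip per TCP connection setup, 1 round trip per
    request/response, sequential fetches, no pipelining.
    """
    fetches = 1 + inline_objects  # the HTML document itself plus its pieces
    if persistent:
        return 1 + fetches        # one connection setup, then every fetch
    return 2 * fetches            # a new connection for every fetch


# Hypothetical document with four inline images.
separate = fetch_round_trips(4, persistent=False)
shared = fetch_round_trips(4, persistent=True)
print(separate, shared)  # prints: 10 6
```

Even in this crude model the saving grows with the number of pieces in a document, which is consistent with the direction of the proposals described above; it says nothing about their measured magnitude.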
`
`Distributed caching and prefetching
Beyond reducing the network protocol overhead, one can also aggressively cache and prefetch the data. At the moment, various browsers cache data on local client disks to improve performance. Pitkow and Recker have shown that caching based on recent rates of past access is an effective technique. However, to design and implement effective prefetching, one must first study and understand the extant access patterns. Our data suggests that partitioned caches are a promising alternative. However, prototype implementations and trace-driven simulations are needed to measure the performance benefits that might accrue from this approach.

Figure 4. Request profile: number of distinct files requested (left) and total size of all files requested (right). Each data point represents a Sunday-Saturday cycle analyzed during that month.
We noted that the most prolific sites are all commercial gateways. Moreover, about percent of the requests to the NCSA WWW servers are from hosts that make only one request. The most popular of these requests are to the directory pages, namely the NCSA Internet Starting Points, the Internet Resources Meta-Index, and the What's New pages. These pages are excellent candidates for replication and caching throughout the Internet, particularly at commercial gateways.
In the future, as audio and video clips play a larger role in conveying multimedia information, audio and video requests will significantly affect network traffic and caching strategies. As we have seen, even a small increase in the use of these data types will dramatically increase the amount of data to be read and transmitted, with a concomitant deleterious effect on the efficiency of server caching strategies.
`
WE HAVE DESCRIBED THE DESIGN OF NCSA's WWW SERVER and analyzed the access patterns to the server in terms of the user request patterns and the responses of the server. The analysis shows that scalability, protocol efficiency, and effective caching strategies are the major issues for the next generation of WWW servers. In particular, we believe that to improve performance, both clients and servers must aggressively exploit caching and prefetching on the basis of knowledge of request patterns, data types, and hardware capabilities.
`
Acknowledgments
Our thanks to Eric Katz for providing us with the initial log analysis scripts; to Nancy Yeager, Michelle Butler, and Paul Zawada for providing crucial assistance in understanding the NCSA WWW server; and to Charlie Catlett, without whom this work would not have been possible. Finally, thanks to Will Scullin and Steve Lamm for developing the virtual reality software used to display dynamic server behavior.
Thomas Kwan is supported in part by the National Science Foundation and the Advanced Research Projects Agency under Cooperative Agreement NCR-8919038 with the Corporation for National Research Initiatives. Robert McGrath is supported in part by the National Science Foundation, the Advanced Research Projects Agency, corporate partners, and the state and University of Illinois. Daniel Reed is supported in part by the National Science Foundation under grants NSF IRI 92-12976 and NSF CDA 94-01124, by the National Aeronautics and Space Administration under Contract NAG-1-613, and by the Advanced Research Projects Agency under Contracts DABT63-91-C-0004 and DABT63-93-C-0040.
`
References
1. T. Berners-Lee et al., "The World-Wide Web," Comm. ACM, Vol. 37, No. 8, Aug. 1994, pp. 76-82.
2. File, "WebServer Activity: Total Connections," 1994; http://www.ncsa.uiuc.edu/SDG/Presentations/Stats/WebServer.html.
3. M. Satyanarayanan, "Scalable, Secure, and Highly Available Distributed File Access," Computer, Vol. 23, No. 5, May 1990, pp. 9-21.
4. E.D. Katz, M. Butler, and R. McGrath, "A Scalable HTTP Server: The NCSA Prototype," Computer Networks and ISDN Systems, Vol. 27, 1994, pp. 155-164.
5. T.T. Kwan, R.E. McGrath, and D.A. Reed, "User Access Patterns to NCSA's World Wide Web Server," Tech. Report UIUCDCS-R-95-1934, Dept. of Computer Science, Univ. of Illinois, Urbana-Champaign, Feb. 1995.
6. NCSA AFS User's Guide, National Center for Supercomputing Applications, Univ. of Illinois, Urbana-Champaign, 1994; http://www.ncsa.uiuc.edu/Pubs/UserGuides/AFSGuide/AFSv2 100.html.
7. P. Albitz and C. Liu, DNS and BIND in a Nutshell, O'Reilly and Associates, Sebastopol, Calif., 1992.
8. R.E. McGrath, "What We Do and Don't Know About the Load on the NCSA WWW Server," Sept. 1994; http://www.ncsa.uiuc.edu/InformationServers/Colloquia/28.Sep.94/Begin.html.
9. V.N. Padmanabhan and J.C. Mogul, "Improving HTTP Latency," Proc. Second Int'l WWW Conf., 1994, pp. 995-1005; http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/DDay/mogul/HTTPLatency.html.
10. S. Spero, "Progress on HTTP-NG," 1994; http://www7.cern.ch/hypertext/WWW/Protocols/HTTP-NG/http-ng-status.html.
11. J.E. Pitkow and M.M. Recker, "A Simple Yet Robust Caching Algorithm Based on Dynamic Access Patterns," Proc. Second Int'l WWW Conf., 1994, pp. 1039-1046; http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/DDay/pitkow/caching.html.
`
Thomas Kwan is a doctoral candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign and a graduate research assistant at the National Center for Supercomputing Applications. His research interests include parallel computing, gigabit applications, and World Wide Web technology. He received a BS degree in electrical engineering from the University of Washington and an MS degree in computer science from the University of Illinois at Urbana-Champaign. He is a member of IEEE, ACM, and Tau Beta Pi.
`
Robert McGrath is a research programmer at the National Center for Supercomputing Applications. His research centers on the architecture and performance of large-scale distributed systems. He is a coauthor of the book Web Server Technology, to be published by Morgan Kaufmann in 1996.
`
Daniel Reed is a professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign, where he holds a joint appointment with the National Center for Supercomputing Applications. Reed received his BS degree in computer science from the University of Missouri at Rolla in 1978 and his MS and PhD degrees, also in computer science, from Purdue University in 1980 and 1983, respectively. He was a recipient of the 1987 National Science Foundation Presidential Young Investigator Award.
`
Readers can contact the authors at the University of Illinois at Urbana-Champaign; e-mail {tkwan, mcgrath}@ncsa.uiuc.edu and reed@cs.uiuc.edu.
`