SWEB: Towards a Scalable World Wide Web Server on Multicomputers

Daniel Andresen, Tao Yang, Vegard Holmedahl, Oscar Ibarra
Department of Computer Science
University of California
Santa Barbara, CA 93106
{dandrese, tyang, veho, ibarra}@cs.ucsb.edu

Abstract

We investigate the issues involved in developing a scalable World Wide Web (WWW) server on a cluster of workstations and parallel machines. The objective is to strengthen the processing capabilities of such a server by utilizing the power of multicomputers to match huge demands in simultaneous access requests from the Internet. We have implemented a system called SWEB on a distributed-memory machine, the Meiko CS-2, and on networked workstations. The scheduling component of the system actively monitors the usages of CPU, I/O channels and the interconnection network to effectively distribute HTTP requests across processing units to exploit task and I/O parallelism. We present experimental results on the performance of this system.
1 Motivation

The Scalable Web server (SWEB) project grew out of the needs of the Alexandria Digital Library (ADL) project at UCSB. Digital library systems, which provide the on-line retrieval and processing of digitized documents through the Internet, have increasingly turned into a topic of national importance. The Alexandria Project is focused on the design, implementation and deployment of a distributed digital library for spatially-indexed information. The collections of the library currently involve geographically-referenced materials such as maps, satellite images, digitized aerial photographs and associated metadata. A rapid prototype system for the Alexandria digital library has already been developed which allows users to browse spatially-indexed maps and images. To expose the system to a broad user community, the next version of the system, available in early 1996, will be connected to the World Wide Web (WWW) using the Hypertext Transport Protocol (HTTP).

Our work is motivated by the fact that the Alexandria digital library WWW server has the potential to become the bottleneck in delivering digitized documents over high-speed Internet. Popular WWW sites such as Lycos and Yahoo receive over one million accesses a day. For WWW-based network information systems such as digital libraries, the servers involve much more intensive I/O and heterogeneous CPU activities. To meet such demands, the SWEB project makes use of existing networked computers and inexpensive disk resources to strengthen the processing and storage capabilities of WWW servers.

1063-7133/96 $5.00, 1996 IEEE
Proceedings of IPPS '96

We expect that using the idle cycles of those processing units and retrieving files in parallel from inexpensive disks can significantly improve the scalability of the server in response to a large number of simultaneous HTTP requests.
Numerous other initiatives to create high-performance HTTP servers have been reported. The Inktomi server at UC Berkeley is based on the NOW technology. NCSA has built a multi-workstation HTTP server based on round-robin domain name resolution (DNS) to assign requests to workstations. The round-robin technique is effective when HTTP requests access HTML information of relatively uniform size and the load and computing powers of workstations are relatively comparable. Our assumption is that the computing powers of workstations and parallel machine resources can be heterogeneous. They can be used for other computing needs, and can leave and join the system resource pool at any time. Thus scheduling techniques which are adaptive to the dynamic change of system load and configuration are desirable. DNS in a round-robin fashion cannot predict those changes. Another weakness of the technique is the degree of name caching which occurs: DNS caching enables a local DNS system to cache the name-to-IP address mapping, so that most recently accessed hosts can quickly be mapped. The downside is that, for a period of time, all requests from a DNS server's domain will go to one particular IP address.
We have developed the preliminary version of a scalable WWW server (SWEB) running on a Meiko CS-2 distributed memory machine and a network of workstations (NOW) such as SUN and DEC machines. Each processing unit is capable of handling a user request following the HTTP protocol. The distinguishing feature of SWEB is effective resource utilization by close collaboration of multiple processing units. Server scalability is achieved by actively monitoring the run-time CPU, disk I/O and network loads of system resource units and dynamically scheduling user HTTP requests to a proper node for efficient processing.

Our dynamic scheduling scheme is closely related to previous work on load balancing on distributed systems, for which a collection of papers is available in [ref]. In these studies, task arrivals may temporarily be uneven among processors, and the goal of load balancing is to adjust the imbalance by appropriately transferring tasks from overloaded processors to underloaded processors. The task resource requirements are unknown, and the criteria for task migration are based on a single system parameter, i.e. the CPU load. We call them single-faceted scheduling strategies.

Petitioner IBM – Ex. 1010, p. 1

For WWW applications, there are multiple parameters that affect the system performance, including CPU loads, interconnection network performance and disk channel usages. The optimal HTTP request assignment to processors does not solely depend on CPU loads. Thus we need to develop a multi-faceted scheduling scheme that can effectively utilize the system resources by considering the aggregate impact of multiple resource parameters on system performance. In [ref], resource requirements are predicted and suggested to guide the load sharing; multiple factors are mentioned, but only the CPU factor is utilized in predicting response times.

The paper is organized as follows. Section 2 gives the background and problem definition. Section 3 discusses the possible approaches for building a scalable Web server, and the SWEB organization and load balancing strategies. Section 4 presents the experimental results showing the performance of SWEB and compares our approach with others. Section 5 discusses conclusions and future work.

Figure 1. A simple HTTP transaction: the client looks up the address of the server, sends over a request, and receives a response.

The NCSA has performed a number of tests using high-end workstations in their working environment and discovered that approximately 5-10 rps could be dealt with using the NCSA httpd server, which cannot match the current and future loads of a digital library server. Thus multiple servers are needed for achieving scalable performance.

Our overall objective is to reduce and sustain the response time under large numbers of simultaneous requests. The goals considered in designing this system are twofold. First, we hope to demonstrate how to utilize existing inexpensive commodity networks, heterogeneous workstations and disks to build a scalable WWW server. Second, we attempt to develop dynamic scheduling algorithms for exploiting task and I/O parallelism, adaptive to the run-time change of system resource loads and availabilities. The scheduler needs to incorporate multiple system performance parameters to assign user HTTP requests to a proper unit for efficient processing while adding minimal overhead.

2 Background

2.1 The World Wide Web

The World Wide Web is based on three critical components: the Uniform Resource Locator (URL), the HyperText Markup Language (HTML), and the HyperText Transfer Protocol (HTTP). The URL defines which resource the user wishes to access; the HTML language allows the information to be presented in a platform-independent but still well-formatted manner; and the HTTP protocol is the application-level mechanism for achieving the transfer of information.

A simple HTTP request would typically activate the sequence of events from initiation to completion shown in Figure 1. First, the client determines the host name from the URL and uses the local Domain Name System (DNS) server to determine its IP address. The local DNS may not know the IP address of the destination and may need to contact the DNS system on the destination side to complete the resolution. After receiving the IP address, the client then sets up a TCP/IP connection to a well-known port on the server where the HTTP process is listening. The request is then passed in through the connection. After parsing the request, the server sends back a response code (in the HTTP protocol, e.g., 200 stands for "OK, file found" and 404 is "File not found"), followed by the results of the query. The connection is then closed by either the client or the server.

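The transaction described above can be sketched at the socket level. This is a minimal illustration, not part of SWEB itself; the host name and path used in any call would be placeholders chosen by the caller.

```python
import socket

def build_request(host, path):
    # An HTTP/1.0 GET request: request line, Host header, blank line.
    return ("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode("ascii")

def parse_status(reply):
    # The status line looks like "HTTP/1.0 200 OK"; the response code
    # (200 = OK, file found; 404 = file not found) is the second field.
    return int(reply.split(b"\r\n", 1)[0].split()[1])

def fetch(host, path="/", port=80):
    # Step 1: resolve the host name to an IP address via DNS.
    ip = socket.gethostbyname(host)
    # Step 2: set up a TCP/IP connection to the server's well-known port.
    with socket.create_connection((ip, port)) as conn:
        # Step 3: pass the request in through the connection.
        conn.sendall(build_request(host, path))
        # Step 4: read the response; the connection is then closed.
        reply = b"".join(iter(lambda: conn.recv(4096), b""))
    return parse_status(reply)
```

The helper functions mirror the protocol steps; `fetch` performs the whole round trip for a live server.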
3 SWEB: a scalable WWW server

We call a system scalable if the system response time for individual requests is kept as small as theoretically possible when the number of simultaneous HTTP requests increases, while maintaining a low request drop rate and achieving a high peak request rate. We define the response time for a request as the length of the time period from when a request is initiated until all the requested information arrives at the client. For a single-workstation server, there is an upper bound on the number of requests per second (rps) that the server can handle.

There are several performance factors that affect the response time in processing HTTP requests. These include: processor load, caused by the overhead necessary to send properly packetized bytes out on the network; disk I/O, which limits how quickly data can be delivered to the processor; the local network, over which remote data requests must be fetched; and the Internet connection to the client, which often is a severe bottleneck.

Our goal is to get the requested information out of the server as fast as possible. Thus our scheduler primarily monitors the first three performance factors above; the Internet bandwidth information is used partially in request re-direction. Our scheduling algorithm is multi-faceted in the sense that the proper decision on routing HTTP requests needs to aggregate the impact of the above multiple system parameters on the overall system response time. We will present the architecture for SWEB and its scheduling algorithm next.

3.1 SWEB Architecture

There are two approaches to designing the scheduler. One is to have a centralized scheduler running on one processor, such that all HTTP requests go through this processor. The scheduler monitors the usages of all system resources, makes assignment decisions based on this information, and routes requests to appropriate processors. We did not take this approach, mainly because
the single central distributor becomes a single point of failure, making the entire system more vulnerable.

Figure 2. The computing and storage architecture of SWEB. The DNS directs tasks to servers running a task scheduler.

Figure 3. The functional modules of a SWEB scheduler in a processor.

The current version of SWEB uses a distributed scheduler. The user requests are first evenly routed to SWEB processors via the DNS rotation, as shown in Figure 2. The rotation on available workstation network IDs is in a round-robin fashion. This functionality is available in current DNS systems. The major advantages of this technique are simplicity, ease of implementation and reliability.

The DNS assigns the requests without consulting dynamically-changing system load information. Then SWEB conducts a further assignment of requests. Each processor in SWEB contains a scheduler, and those processors collaborate with each other to exchange system load information. After a request is routed to a processor via DNS, the scheduler in that processor makes a decision regarding whether to process this request or assign it to another processor. Two approaches, URL redirection or request forwarding, could be used to achieve reassignment, and we use the former. Request forwarding is very difficult to implement within the current HTTP protocol, while URL redirection gives us excellent compatibility with current browsers and near-invisibility to users. Any HTTP request is not allowed to be redirected more than once, to avoid the ping-pong effect.
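A scheduler's reassignment decision via URL redirection can be sketched as follows. This is an illustrative sketch, not the SWEB code: the `redirected=1` query flag used to enforce the one-hop limit is a hypothetical marker invented for this example.

```python
def redirect_response(host, path):
    # A 302 reply tells the browser to automatically reissue the
    # request at the chosen server; to the user this is nearly invisible.
    return (
        "HTTP/1.0 302 Moved Temporarily\r\n"
        "Location: http://%s%s\r\n\r\n" % (host, path)
    ).encode("ascii")

def assign(path, local_host, chosen_host):
    # Allow at most one redirection to avoid the ping-pong effect.
    # A request already carrying the (hypothetical) marker, or one for
    # which the local node was chosen, is handled locally.
    if chosen_host == local_host or path.endswith("?redirected=1"):
        return None  # fulfill the request on this processor
    return redirect_response(chosen_host, path + "?redirected=1")
```

Returning `None` means the request is processed locally; otherwise the 302 message is sent back and the client retries at the chosen server.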
The functional structure of the scheduler at each processor is depicted in Figure 3. It contains an httpd daemon, based on the NCSA httpd code, for handling HTTP requests, with a broker module which determines the best possible processor to handle a given request. The broker consults with two other modules: the oracle and the loadd. The oracle is a miniature expert system which uses a user-supplied table to characterize the CPU and disk demands for a particular task. The loadd daemon is responsible for updating the system CPU, network and disk load information periodically (every 2-3 seconds), and for marking as unavailable those processors which have not responded within a preset period of time. When a processor leaves or joins the resource pool, the loadd daemon will be aware of the change.
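The loadd bookkeeping just described can be sketched as below. The class and field names are illustrative, not taken from SWEB, and the staleness threshold is an assumed stand-in for the "preset period of time".

```python
import time

STALE_AFTER = 3.0  # seconds; assumed stand-in for the preset timeout

class LoadDaemon:
    """Tracks periodic (cpu, disk, net) load reports from peer processors."""

    def __init__(self):
        self.reports = {}  # host -> (last report time, (cpu, disk, net))

    def record(self, host, cpu, disk, net, now=None):
        # Called when a peer's periodic load broadcast arrives.
        t = now if now is not None else time.time()
        self.reports[host] = (t, (cpu, disk, net))

    def available(self, now=None):
        # A processor that has not responded within the preset period is
        # treated as having left the resource pool; a new report from it
        # (via record) automatically rejoins it.
        t = now if now is not None else time.time()
        return {h: load for h, (seen, load) in self.reports.items()
                if t - seen <= STALE_AFTER}
```

The `now` parameter makes the staleness logic deterministic for testing; in a running daemon the wall clock is used.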
3.2 The multi-faceted scheduling algorithm

In this subsection we present the algorithm that decides where an HTTP request should be routed. As we discussed before, several system load parameters affect such a decision. In a single-faceted scheduling system, a processor can be classified as lightly loaded or heavily loaded based on one parameter (e.g. CPU load). One purpose of such a classification is to update load information only when the classification changes, to reduce unnecessary broadcasting overhead. In our problem context, it is hard to classify a processor as heavily or lightly loaded, since there are several load parameters: e.g., a processor could have a light CPU load, but its local disk may receive many access requests from the network file system. Thus each processor in SWEB has a load daemon to detect its own CPU, disk and network load periodically and broadcast such information to other processors. In our experiments we will show that such overhead is insignificant.

We address how multiple system load parameters can be used together in deciding the assignment of HTTP requests. Since our goal is to minimize the response time for each request, we design the heuristic based on the estimated cost for processing each request, using the following formula:

t = t_redirection + t_data + t_cpu + t_net

Here t_redirection is the cost to redirect the request to another processor, if required; t_data is the time to transfer the required data from the disk drive, or from the remote disk if the file is not local; t_cpu is the time to fork a process and perform disk reading to handle the HTTP request, plus any known associated computational cost if the request is a CGI operation; and t_net is the cost for transferring the processing results over the Internet. We discuss and model these individual cost terms as follows.

t_data = (requested file size) / (available storage bandwidth)

If the file is local, the time required to fetch the data is simply the file size divided by the available bandwidth of the local storage system. We also measure the disk channel load f1: if there are many requests, the disk transmission performance degrades accordingly. If the data is remote, then the file must be retrieved through the interconnection network, and the local network bandwidth b_net and load f2 must be incorporated.

Experimentally, we found on the Meiko approximately a 10% penalty for remote NFS access, and on the SUN workstations connected by Ethernet the cost increases by 50%-70%.

t_cpu = (number of operations required) / (CPU speed)

The t_cpu term estimates the amount of processing time required to complete the task. This is based on the speed of the destination node, the estimated load on the server (CPU_load), and the estimated number of operations required for a particular task. The computation requirement for a particular request is estimated by the oracle component of the system (see Figure 3). The parameters for different architectures are saved in a configuration file. It should be noted that some estimated CPU cycles may overlap with network and disk time, and the overall cost may be overestimated slightly, but this conservative estimation works well in our experience.

The load estimation of remote processors is based on the periodic updating of information given by those remote processors. It is possible that a processor P is incorrectly believed to be lightly loaded by other processors, and many requests will be redirected to it. To avoid this unsynchronized overloading, we conservatively increase the reported CPU load of P by 30%. This strategy is found to be effective in our experiments.
We use

t_net = (number of bytes required) / (network bandwidth)

This term is used to estimate the time necessary to return the results back to the client over the network. When the scheduler compares processors, we assume all processors will have basically the same cost for this term, so it is not estimated. Our research goal is to produce the query result for an HTTP request as fast as possible at the server site.
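The cost terms above can be combined into a per-candidate estimate that a broker compares across nodes. This is a minimal sketch, not the SWEB implementation: all field names, bandwidths and load factors are illustrative placeholders, and t_net is omitted because the scheduler treats it as identical across candidates.

```python
def estimate_cost(file_bytes, ops, node, requester):
    """Estimated time for `node` to serve a request that arrived at `requester`."""
    # t_redirection: twice the client link latency plus connection setup;
    # zero when the request is already at the chosen node.
    if node is requester:
        t_redir = 0.0
    else:
        t_redir = 2 * node["latency"] + node["t_connect"]
    # t_data: file size over disk bandwidth, degraded for a remote fetch.
    bw = node["disk_bw"] if node["has_file"] else node["remote_bw"]
    t_data = file_bytes / bw
    # t_cpu: operations over CPU speed, scaled up by the current CPU load.
    t_cpu = ops / node["cpu_speed"] * (1 + node["cpu_load"])
    return t_redir + t_data + t_cpu

def choose(candidates, file_bytes, ops, requester):
    # The broker picks the node with minimum estimated time to completion.
    return min(candidates, key=lambda n: estimate_cost(file_bytes, ops, n, requester))
```

A usage example: given a lightly loaded node holding the file locally and a heavily loaded node that would fetch it remotely, `choose` selects the former.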
t_redirection = 2 x t_client_server_latency + t_connect

This term measures the time necessary to move an HTTP request from one server to another. It is set to twice the estimated latency of the connection between the server and the client, plus the time t_connect for the server to set up a connection. The conceptual model is that of a very short reply going back to the client browser, which then automatically issues another request to the new server address. The transfer time is zero if the task is already local to the target server. The estimate of the link latency is available from the TCP/IP implementation, but in the initial implementation it is hand-coded into the server.

Given the arrival of an HTTP request at processor i, the scheduler at processor i goes through the following steps:

1. Preprocess request. The server parses the HTTP commands and completes the pathname given, determining the appropriate permissions along the way. It also determines whether the requested document exists, has moved, or is a CGI program to execute.

2. Analyze request. The server then determines whether itself or another server should fulfill the request, by checking the results from the preprocessing phase. If the requested document does not exist, or the request is determined not to be a retrieval of information, then the request is always completed at processor i. Otherwise the request is passed to the broker for analysis. The broker determines the server on whose local disk the file resides, if any; calculates an estimated time for each available server node to fill the request; and, having determined the estimated time for each server, indicates its choice (determined by the minimum time to completion) to the main process.

3. Redirection. If the chosen server is not the local one, the request is redirected appropriately.

4. Fulfillment. At this point the request is processed in the normal HTTP server manner, with any CGIs executed as needed and any client responses shuttled out the appropriate socket.

It should be noted that modeling the cost associated with processing an HTTP request accurately is not easy, and we still need to investigate further the design of such a function. Our experiments show that the current cost function does reflect the impact of multiple parameters on the overall system response performance, and that SWEB delivers acceptable performance based on such a heuristic function, adapting to dynamically-changing system resource loads. In our experiments we will also show that the overhead of SWEB scheduling is insignificant compared to the system load used for request fulfillment and other activities.

3.3 Performance Analysis

We provide an analysis of the maximum sustained rps achievable within our scheme. We use the following parameters: n is the number of nodes; s is the theoretical maximum sustained rps; f is the average file size requested; b1 is the bandwidth of local disk accessing; b2 is the bandwidth of remote disk accessing; p is the average redirection probability; o1 is the overhead of preprocessing a request; o2 is the overhead of redirection; and c is the average processing time for each request. Then the maximum sustained rps for file fetches can be derived in terms of these parameters; the full analysis is discussed in [ref]. For example, if b1 = 5MB/s and b2 = 4.5MB/s, the maximum sustained rps is 17.3 for 6 nodes. We will show this is close to our experimental results in Section 4.

SWEB currently focuses on GET and related commands used in the HTTP protocol. Other commands (e.g. POST) are not handled, but SWEB could be extended to do so in the future.
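The shape of this analysis can be illustrated with a simple throughput model. The closed form below is an assumption made for illustration (the exact expression is in the full analysis, which is not reproduced here): each request is assumed to cost preprocessing (o1), expected redirection overhead (p * o2), processing (c) and data transfer (f / b) seconds, with n nodes serving in parallel.

```python
def max_sustained_rps(n, f, b, p, o1, o2, c):
    """Assumed model: sustained rps = nodes / per-request service time.

    n  : number of nodes
    f  : average file size (bytes)
    b  : effective disk bandwidth (bytes/s); between b1 (local) and b2 (remote)
    p  : average redirection probability
    o1 : preprocessing overhead per request (s)
    o2 : redirection overhead (s)
    c  : average processing time per request (s)
    """
    per_request = o1 + p * o2 + c + f / b
    return n / per_request
```

With n = 6 nodes, f = 1.5MB, b = 5MB/s and small overheads, this model lands in the high teens of rps, the same regime as the figures reported above; the parameter values here are illustrative, not measurements.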
4 Experimental Results

Our primary experimental testbed consists of a Meiko CS-2 distributed memory machine at UCSB. Each node has a scalar processing unit (a 40Mhz SuperSparc chip) with 32MB of RAM, running Solaris 2.3. We mainly use six CS-2 nodes, each of which is connected to a dedicated 1 GB hard drive on which the test files reside. Disk service is available to all other nodes via NFS mounts. These nodes are connected via a modified fat-tree network with a peak bandwidth of 40MB/s. Our secondary testbed is a network of SparcStation LXs running Solaris 2.4, connected via standard 10 Mb/s Ethernet. Each LX has a local 525 MB hard drive and 16MB of RAM. The effective bandwidth of this Ethernet is low since it is shared by other UCSB machines.

The complete SWEB system is based on the NCSA httpd 1.3 source, with modifications to support the distributed functionality. We plan to move to version 1.5 in the immediate future. All software uses the sockets library built on the Solaris TCP/IP streams implementation. The use of the built-in TCP/IP library was a deliberate decision based on several factors, including the following:

Compatibility: We wanted to use current WWW/HTTP-compatible clients for testing purposes, as well as our own programs. Current browsers typically use TCP/IP and sockets or the equivalent.

Portability: With an implementation built on standard libraries available on virtually any UNIX system, our software can easily be moved to a heterogeneous NOW with no modifications.

There are, of course, disadvantages. The primary disadvantage is certainly performance: we were only able to achieve approximately 5-15% of the peak communication performance on the Meiko, where DMA transfers handled by the built-in Elan communications co-processor can achieve speeds near the peak bandwidth. We also felt that additional code complexity was caused by not using the built-in messaging libraries such as NX/2 or Active Messages.

We ran a series of tests where a burst of requests would arrive nearly simultaneously, simulating the action of a graphical browser such as Netscape, where a number of simultaneous connections are made, one for each graphics image on the page. The clients were primarily situated within UCSB. We also tested SWEB via requests from the East coast of the US (Rutgers University) to examine the benefits of scheduling in a high network latency situation. We concentrate on the UCSB data on the scalability of the system, believing this more accurately reflects the high-bandwidth networks we anticipate.

It should be noted that the results we report are average performances obtained by running the same tests multiple times. The test performance is affected by dynamically-changed system loads, since the machines are shared by many active users at UCSB. We report our experiments to examine the performance of our system in the following aspects: scalability of overall performance, scheduling strategy comparison, and overhead costs.

4.1 Scalability of overall performance

Maximum rps on Meiko CS-2 and NOW. The first experiment was run to determine how many requests per second SWEB could process. This depends on the average file sizes requested and the number of nodes. It is reported that a high-end workstation running NCSA httpd could fulfill approximately 5-10 rps. We examine how a one-node NCSA httpd 1.3 server performs and compare it with the 6-node SWEB on the Meiko CS-2 and the 4-node SWEB on NOW. The maximum rps is determined by fixing the average file size and increasing the rps until requests start to fail, which indicates that the system limit is reached. The duration of the test, which simulates a burst of simultaneous requests, also affects the experimental results: the requests coming in a short period can be queued and processed gradually, but the requests continuously generated over a long period cannot be queued without actively processing them, since there are new requests coming after each second. We use two types of tests. One has a short period with a duration of 30 seconds, in which a constant number of requests are launched at each second. The long period has a duration of 120 seconds, in order to obtain the sustained maximum rps.

It can be seen from Table 1 that the maximum rps obtained by the multi-node server varies for different file sizes. The maximum rps in a short period is much higher than that in a long period, because requests accumulated in a short period can be queued. For the NOW results with 1.5MB file size, 11 rps is reached for a duration of 30s, but only 10 is achieved for the sustained maximum rps. This is because the maximum disk and Ethernet bandwidth limit is reached in this case. These results have been confirmed via the analysis in Section 3.3, which gave an analytical maximum sustained 17.8 rps for 1.5M files on the Meiko, consistent with the 16 rps achieved in practice. It should be noted that in practice requests come in periodic bursts. Thus we use a 30-second test period in the rest of the experiments, representing a non-trivial but limited burst of requests.

Response time and drop rate on Meiko CS-2 and NOW. In Table 2 we report the response time (the time from after the client sends a request until the completion of this request) when we vary the number of server nodes. The system starts to drop requests if the server reaches its rps limit. For small file requests (1K), the multi-node server performs much better than the one-node server, but the response remains constant when using more processors. This is because, for small files, none of the theoretical or practical limits on bandwidth or processing power have been reached. The response time does increase to a certain degree after the number of processors exceeds a certain point, which reflects the overhead required by the scheduling algorithm and the distributed file system.

For relatively large files (1.5MB), the processing time is substantially longer. When the number of processors increases, SWEB provides substantially better performance. These results are consistent with those of NCSA, and also strongly confirm the notion that under heavy loads a distributed server solution can achieve significant speedups. Under especially heavy loads, which would tend to occur during peak hours at popular sites, a single server tends to drop almost half or more of the connections made, whereas a distributed server might have a larger overall average response time but fill every request.

Table 1. Maximum rps for a test duration of 30s (short period) and 120s (sustained) on Meiko CS-2 and NOW, for the single server and SWEB with 1K and 1.5M file sizes.

The superlinear speedup we obtain reflects the fact that the total size of memory in SWEB is much larger than on a one-node server, and that the multi-node server accommodates more requests within main memory, while the one-node server spends more time swapping between memory and the disk. Additionally, the network overhead is distributed among multiple servers rather than concentrated at a single node.

4.2 Comparison between different scheduling strategies

Our scheduling strategy takes into consideration the locality of requested files and also the current resource loads. We compare our approach with others: one is the NCSA approach that uniformly distributes requests to nodes round-robin; another is to purely exploit the file locality by assigning requests to the nodes that own the requested files. We present experimental results in retrieving a set of files with uniform and non-uniform sizes.

Requests with non-uniform file size. With non-uniform file sizes, the load distribution between processors produced by the initial DNS assignment is heterogeneous, and we expect that the round-robin approach cannot adapt to such load variations. We tested the ability of the system to handle requests with sizes varying from short (approximately 100 bytes) to relatively long (approximately 1.5MB). Table 3 shows the actual response time in seconds for this case. For lightly loaded systems, SWEB performs comparably with the others. For heavily loaded systems (rps >= 20), SWEB has an advantage of 15-60% over round robin and file locality.

RPS    Round Robin    File Locality    SWEB
16         2.4             2.5          2.6
20         5.2             4.4          3.5
24         6.9             7.9          5.9
32        15.3             9.3          6.2

Table 3. Performance under non-uniform requests (1.5MB maximum file size, 30 sec duration, 0% drop rate, Meiko CS-2).

It is important to exploit file locality and also consider CPU load. We performed a skewed test to illustrate the fundamental weakness of the file locality heuristic, where each client accessed the same file located on a single server, effectively reducing the parallel system to a single server. In this situation, round-robin handily outperforms file locality, with average response times of 3.7s and 81.4s respectively. This test was performed with six servers, with requests issued for 45s and a file size of 1.5MB.

We also ran clients on the east coast (Rutgers University, New Jersey), and the test results show a performance gain of over 10% using file locality instead of round robin from an Ethernet-linked server, in spite of the poor bandwidth and long latency over the connection from the east coast to the west coast.

Requests with uniform file size. For uniform file sizes, the DNS scheme assigns HTTP requests evenly to nodes, which leads to a homogeneous distribution of CPU load, but the network communication load is not minimized. We expect that the advantages of the re-scheduling will manifest in a high communication cost environment. On the Meiko CS-2 we conducted several tests, and the results show that the three strategies have similar performance. This is because NFS is implemented on the Meiko CS-2 through the fast Elan fat-tree network, and the chance of network contention is much smaller than on Ethernet. In a NOW environment with relatively slow bus-type Ethernet, the advantage of exploiting file locality is more clear. This is verified in an experiment shown in Table 4.

RPS    Round Robin    File Locality    SWEB
n/a        48.1           42.25        41.6
n/a       128.3          124.2        124.8
n/a       243.7          219.0        210.4

Table 4. Performance under uniform requests (1.5MB file size, 0% drop rate) on NOW.

4.3 Overhead of SWEB

Overhead distribution from the client point of view. The SWEB code was instrumented to determine the overhead cost for a request. Table 5 shows the case of a 1.5MB file fetched over a fairly heavily loaded system. The results indicate that the overall overhead introduced by the SWEB analysis and scheduling algorithm is insignificant. For small files (approximately 1K), the data transfer time is smaller, but the overhead of SWEB compared to the preprocessing time for parsing HTTP commands is still very small. The direct cost of analysis is typically about 1-4 ms for cost estimation, plus a few ms to generate a redirection if necessary. Indirect costs, which affect the time required to complete a redirection, depend heavily on the latency between client and server. For a client fetching a 1.5M file on the Meiko, of the 5.4 sec total time, well over 90% is spent doing data transfer.

Overhead distribution from the server point of view.

Meiko CS-2 (1 to 6 nodes):
  1K file time (s):     9.8    0.40   0.41   0.44   0.49   0.62
  1K drop rate:         0%     0%     0%     0%     0%     0%
  1.5M file time (s):   98.2   72.5   30.7   23.7   16.5   9.9
  1.5M drop rate:       37.3%  5.0%   3.5%   3.5%   3.5%   0.0%

NOW (1 to 4 nodes):
  1K file time (s):     10.9   0.83   0.40   0.26
  1K drop rate:         0%     0%     0%     0%
  1.5M file time (s):   600    201.4  170.6  167.6
  1.5M drop rate:       n/a    20.5%  0%     0%

Table 2. Performance in terms of response times and drop rates. On Meiko CS-2: 16 rps, 30 sec duration. On NOW: 16 rps for the 1K file; the duration is 30s. Times are in seconds, averaged over all requests. The single-server 1.5M test on NOW timed out after no responses were received.

Activity                  Time
Preprocessing request     70 msec
Analysis (SWEB)           1-4 msec
Redirection (SWEB)        a few msec
Data transfer             4.9 sec
Network costs             0.5 sec
Total client time         5.4 sec

Table 5. Cost distribution in average response time (1.5M file size, Meiko CS-2). Items marked SWEB are introduced by the SWEB system; all other times are due to standard httpd functionality.

From the server's point of view, we need to examine how much load SWEB contributes for scheduling decisions and resource information collection. Our data shows that, in processing requests,