Reference 39

PATENT OWNER DIRECTSTREAM, LLC
EX. 2151, p. 1
A Self-Adapting Web Server Architecture: Towards Higher Performance and Better Utilization

Khalid Al-Issa
Farag Azzedin
Information and Computer Sciences Department
King Fahd University of Petroleum & Minerals
Dhahran, Saudi Arabia
fazzedin@kfupm.edu.sa
khalid.issa@aramco.com
ABSTRACT

The way in which a Web server handles I/O operations has a significant impact on its performance. Servers that allow blocking for I/O operations are easier to implement, but exhibit less efficient utilization and limited scalability. On the other hand, servers that allow non-blocking I/O usually perform and scale better, but are harder to implement and have limited functionality. This paper presents the design of a new, self-adapting Web server architecture that decides how future I/O operations will be handled based on load conditions. The results obtained from our implementation of this architecture indicate that it is capable of providing competitive performance and better utilization than comparable non-adaptive Web servers at different load levels.

KEYWORDS: Operating systems; Internet and Web computing; Synchronous and asynchronous I/O; Concurrency
1. INTRODUCTION

Performance is a vital factor behind the success of Web-based services. As a result, working towards improving the performance of Web servers becomes a critical issue. As a matter of fact, the tremendous growth of Web-based services and applications over the past several years, the growth of network bandwidth, and the presence of a very demanding, large and growing community of Web users are all expected to put more pressure on Web servers [16] [5] [2] [19] [18].

In general, performance can be looked at as either macro or micro performance [20]. A Web server's macro performance refers to the side of performance observed by clients, including throughput and response time. Micro performance, on the other hand, represents the server's internal performance, including metrics such as clock Cycles Per Instruction (CPI) and cache hit rate. While both classes of enhancement contribute to the overall performance of a Web server, they differ in complexity, significance and effectiveness. For instance, a simple approach towards improving the overall performance is to use replication, which improves macro performance by providing multiples of the original throughput. This approach, however, only provides a workaround that would still suffer from the same set of issues that exist in the server architecture [19]. A more effective alternative would be to look into enhancing a Web server's micro performance, which would not only improve the overall performance, but also allow for eliminating major limitations and defects. As an example, some earlier work has shown that a server would be able to satisfy its clients with better throughput and response time if its cache locality is improved [11], or if more client requests are taken every time the server accepts new requests [5] [3].
The primary task of Web servers is to deliver Web contents in a concurrent fashion. In order to deliver contents, frequent disk I/O operations have to take place, and that hurts concurrency significantly. Concurrency is generally achieved either through asynchronous system calls to avoid blocking the server, or through multiple server instances using threads or sub-processes, in which case the use of synchronous system calls becomes acceptable. The former approach is typically used in the Single-Process Event-Driven (SPED) Web server architecture. As shown in Figure 1, a client request is first accepted by the SPED server process. Then, the server performs all necessary computations as requested by the client. In case I/O operations are needed, the server enqueues these I/O requests against an asynchronous system call like select(), which relieves the server from having to block waiting for the I/O operation to complete. While the I/O is being performed, the server may start processing computations associated with another client request.

Figure 1. The SPED Web Server Architecture

978-1-4244-4907-1/09/$25.00 ©2009 IEEE

While the SPED model works pretty well when serving cached contents, it is not efficient when contents are to be fetched from disk, in which case the server is expected to interleave the serving of requests with slow I/O operations [16] [13]. In addition, the asynchronous system calls available for implementing concurrency in this model cause the server process to actually block in certain cases [16] [8]. The alternative to achieving concurrency through asynchronous calls is to run multiple server instances through either multiple processes or multiple threads. As shown in Figure 2, in a multi-threaded server, a thread is responsible for processing all computations as well as any needed I/O associated with the client request. While the multi-process and multi-threaded architectures are easier to implement [13] [4], they exhibit relatively low utilization of a server's resources since they allow processes and threads to block. In addition, the fact that every connection gets assigned a unique server thread or process has a negative impact on scalability. Moreover, the introduction of persistent connections in HTTP 1.1, which permits a connection to stay active while different objects are transferred, allowed for even less efficient utilization of the server [4].

Figure 2. The Multi-Threaded Web Server Architecture

The fact that both alternatives have significant limitations in concurrently handling I/O operations and client requests has motivated researchers to: (1) come up with hybrid architectures that combine certain features of the two approaches [16] [19] [4], (2) implement libraries to replace available, less efficient system calls [21] [8] [12], and (3) suggest and implement enhancements to these two architectures [11] [2] [5] [1] [9] [13] [3]. What is common about all the proposals we survey is that their technique for handling I/O operations is pre-determined, and cannot adapt to the continuously changing state of the server. However, there are situations in which allowing a server thread to block is more cost-effective. There are also cases in which a server thread is so expensive that it should be allowed to only perform computations, and no I/O. As a result, we believe that a server should be allowed to determine which I/O scheme is best to follow based on load conditions.

This research work proposes the algorithm and structure of a self-adapting, multi-threaded server architecture that has the ability to switch between two different I/O schemes depending on load conditions. To the best of our knowledge, this is the first proposed adaptive Web server model that can follow more than one I/O scheme.

1.1. Motivation

Enhancing the way in which a Web server handles I/O has been an active topic in the literature over the last decade. The key motivation was to propose ideas for higher concurrency despite the long latency of disk I/O. Hybrid architectures, which rely on combinations of the two original approaches, are among the most interesting proposed ideas. For instance, one of the first hybrid models suggests employing a pool of helper processes whose role is to perform I/O. This relieves SPED servers from the need for inefficient asynchronous I/O, and greatly increases these servers' capacity.

While different hybrid models employed different techniques for achieving the primary goal of improving performance through increased utilization and scalability, they are common in that they enforce a fixed I/O scenario. Proposed hybrid models that allowed blocking I/O to take place would allow it even under overloaded conditions. Similarly, models that utilize helper processes would pass I/O requests to helper processes even under lower load conditions, in which case blocking might be both faster and less costly. This motivates us to propose a self-adapting Web server architecture and evaluate its ability to outperform non-adaptive architectures in both throughput and response time under different load conditions.

1.2. Objectives

The main objective of this research work is to outline two major limitations of the I/O model most widely used in today's Web servers, the synchronous blocking I/O model. Scalability is severely affected by allowing server threads to block for I/O. This in turn has negative effects on a Web server's overall performance, as it limits throughput and increases response time. In addition, allowing server threads to perform blocking I/O operations represents an inefficient utilization of this valuable resource. Utilization becomes a critical issue as the rate of incoming client requests increases while server threads are idle over I/O.

We still believe that blocking for I/O is the right choice under certain circumstances. Therefore, we propose in this paper a Web server architecture that can use both the blocking and the non-blocking I/O models under different load conditions to enhance performance and provide better utilization of server threads.
1.3. Contributions

This research work contributes to the literature by first providing a survey and classification of current Web server architectures. Over the past several years, a number of architectures have been proposed to overcome limitations in the original models, as well as to improve performance and cope with the increasing popularity of Web-based services. In this research, we present a survey of these models along with a classification that is based on how they handle I/O requests.

Second, this research work brings to attention the need for adaptability in Web servers. This feature enables the server to choose a more practical work scenario depending on past, current, or foreseeable circumstances.

Third, this research work promotes the use of asynchronous I/O for multi-threaded servers. The greatly improved scalability obtained with this technique, compared to its far more common synchronous alternative, justifies its use.

Last, this research work introduces a performance evaluation of an implementation of the proposed Web server architecture.
The paper is organized as follows: Section 2 presents a survey and a classification of existing Web server architectures. In Section 3, we explain the I/O models and outline the strengths and weaknesses of each of them. Section 4 describes the advantages of implementing self-adaptability in Web servers. Then, we describe the internals of the self-adapting Web server architecture in Section 5. In Section 6, we present the results obtained from our experiments, in which we compare the performance of an implementation of the self-adapting Web server model to non-adaptive Web servers. We then explain how utilization is enhanced in the self-adapting Web server architecture in Section 7. Finally, Section 8 concludes the paper, along with plans for future work.
Table 1. Some Existing Web Server Architectures

Year  Contribution       Class  Remarks
1999  AMPED              1      A SPED server that passes I/O requests to helper processes or threads.
2001  Cohort scheduling  3      The order of executing threads is changed in order to execute similar computations consecutively, which reduces cache misses.
2001  SEDA               1      A pipelined server that consists of multiple stages, each associated with a pool of threads.
2001  Multi-Accept       3      Instead of accepting a single incoming connection, a bulk of incoming connections is taken every time accept() is called.
2003  Capriccio          2      A multi-threaded package that uses asynchronous I/O and provides high scalability.
2004  Lazy AIO           2      A new asynchronous I/O library that is meant to resolve issues with the available asynchronous libraries.
2005  Hybrid             1      A multi-threaded server that employs an event-dispatcher to resolve issues with allowing persistent HTTP connections.
2007  SYMPED             1      This server employs multiple SPED instances.
2008  MEANS              2      A software architecture that uses micro-threads for scheduling event-based tasks to Pthreads.
2. LITERATURE REVIEW

The multi-threaded and the event-based architectures are the original approaches for implementing a server. As each of the two has its own limitations and areas for improvement, many proposals over the last several years have suggested and implemented enhancements to overcome limitations and improve performance. We classify these proposals into three classes: (1) proposals for hybrid architectures, (2) proposals that suggest replacement libraries, and (3) proposals that disregard the I/O issue and focus on other aspects to improve performance. Table 1 summarizes our classification of the available Web server architectures.
2.1. Proposals for Hybrid Approaches

The first class of proposals focused on deriving hybrid architectures that would combine features from the two original models. One of the early attempts was the asymmetric multi-process event-driven (AMPED) architecture [16], which provides a more effective solution for performing I/O operations in SPED servers, in which asynchronous system calls like select() are used. The AMPED architecture is similar to SPED in that it typically runs a single thread of execution. However, instead of performing asynchronous I/O using descriptors through select(), AMPED passes I/O operations to helper processes. As a result, if blocking for I/O ever takes place, only helper processes will have to sit idle, and not the server's process.

A more recent hybrid model is the Staged Event-Driven Architecture (SEDA) [19]. In SEDA, the entire work-flow of processing requests is re-structured into a sequence of stages, which makes it very similar to a simple pipeline. The main motivation behind introducing SEDA is to provide massive concurrency by allowing pools of threads to handle specific sets of tasks at the same time.

D. Carrera et al. [4] propose a solution for the case in which an idle client continues to hog a server thread, which is an undesirable consequence of allowing persistent connections in HTTP/1.1 for multi-threaded servers. To resolve this issue, they introduce a hybrid model in which an event-dispatcher is used to identify sockets with readable contents and assign them to a server thread, which then reads and processes the request.

D. Pariag et al. [17] propose the Symmetric Multi-Process Event-Driven (SYMPED) architecture. The SYMPED model consists of multiple SPED instances working together to increase the level of concurrency. Whenever one of these instances blocks for disk accesses, other instances can take over processing client requests.
2.2. Proposals for Replacement Libraries

The second class of proposals started from the fact that available asynchronous system calls are not efficient, and suggested that they should be replaced. For instance, the way select() works requires it to block when used on disk I/O [16] [8]. Proposals in this area focus on providing replacements for these limited libraries. Elmeleegy et al. [8] proposed Lazy Asynchronous I/O (LAIO), an asynchronous I/O interface to better support non-blocking I/O that would be more appropriate for event-driven programming. LAIO basically provides a non-blocking counterpart for each blocking system call. It handles blocking I/O operations for the application sitting on top, while the application is allowed to move on with processing other requests.

Capriccio [1] is a scalable thread package that has the ability to scale up to 100,000 threads. This package was designed to resolve the scalability issue of the multi-threaded architecture. The high scalability of this solution was achieved through the use of epoll(), an asynchronous I/O interface that has proven to perform better than both the select() and the poll() interfaces with the right optimizations [9].

Lei et al. [12] propose MEANS, a micro-thread software architecture that consists of two thread layers sitting between the application and the operating system. An application that makes use of MEANS assigns work to MEANS micro-threads, which assign tasks in an event-based scenario to Pthreads interacting directly with the operating system.
2.3. Proposals for Modifying Other Architectural Components

The last class of proposals focused on enhancing the way the original models operate, while allowing the same I/O scenarios to take place. Chandra et al. [5] and Brecht et al. [3] suggest modifying the way a Web server accepts new connections. In the case of SPED, instead of accepting only a single new connection every time the server checks for incoming connections, they suggest accepting multiple connections [5]. This method increases the rate of accepting new connections, and increases the concurrency of the server by providing more work that is ready to be processed at any instant.

Larus et al. [11] suggest enhancing Web server performance by increasing locality and minimizing cache misses. They proposed a server model in which different requests are analyzed to identify similar computations. The order in which these requests are processed is then altered to allow similar computations to be processed consecutively. Executing similar computations as a group increases locality, and consequently improves performance.
3. I/O MODELS

Whenever an I/O operation needs to be performed by a Web server, the operating system actually takes care of it. This is done for many reasons, including maintaining a layer of security through which only privileged applications are granted access to certain files. While the I/O operation is being carried out, a server may either be blocked waiting for it to complete, or be free to process other requests. This depends entirely on the type of I/O the server initiated.
3.1. Synchronous I/O Operations

Synchronous I/O can be blocking or non-blocking to the calling process [10]. While both are performed through the same read and write system calls, the non-blocking variant requires the O_NONBLOCK option to be set when the open() system call is issued. The main issue with the synchronous non-blocking model is that it requires the calling process to issue numerous calls to check the status of the requested I/O. As a result, this model is known to be extremely inefficient [10].
The synchronous blocking model, on the other hand, is widely used in multi-threaded and multi-process Web servers, where the presence of multiple instances of the server makes the undesirable blocking less significant. There are, however, major issues with this I/O model. Particularly for multi-threaded Web servers, allowing threads to block largely limits the server's scalability, leading to degraded performance with low throughput and high response time. In addition, allowing server threads to block introduces idle time and makes for inefficient utilization of valuable resources.
3.2. Asynchronous I/O Operations

Just as with synchronous I/O, asynchronous I/O can be either blocking or non-blocking [10]. A well-known example of the asynchronous blocking I/O scheme is the select() system call [7]. Through this system call, the calling process can add new asynchronous I/O requests and take the ones that are complete for processing. The select() system call keeps a list of file descriptors for every request it is processing. Every time a select() is issued, the calling process is expected to block for a period of time to allow some I/O to complete.
With the asynchronous non-blocking system calls, the calling process returns immediately after initiating an I/O request. This class of I/O operations is performed through system calls present in the Asynchronous I/O (AIO) library, including aio_read() and aio_write(). Once an AIO operation is initiated, it gets carried out to completion for the calling process, which then needs to use some notification mechanism to identify completed I/O requests. Two of the widely used notification mechanisms are signals and polling. With signals, the kernel sends a signal to the calling process once its requested I/O operation is complete. With polling, on the other hand, a dedicated thread is typically used to "poll" the status of a list of I/O requests passed to it by the calling process.
While in synchronous blocking I/O the calling process has a limit on how many requests it can be processing at a given time, asynchronous I/O enables the calling process to accept much more work. This is largely due to the fact that a worker thread is relieved from having to handle a client request to completion. This provides a boost to the server's scalability, and consequently enables the server to sustain higher throughput with lower response time. In addition, utilization of server threads is kept high since they are devoted to processing rather than sleeping over I/O. There are, however, issues with the asynchronous non-blocking approach. There are situations where blocking is preferable. Under lower workloads, as well as for shorter I/O requests, blocking becomes a more convenient alternative and is expected to perform better. In addition, the need for notification mechanisms both complicates the rather simple I/O scenario and introduces an overhead that can have significant drawbacks. Last but not least, there are limits to how many AIO requests a kernel allows at a given time [6].
4. THE NEED FOR SELF-ADAPTABILITY

Existing Web servers follow a fixed I/O scheme at all times, regardless of the fact that different I/O operations are of varying cost, and that they are performed under different load conditions. For instance, if we look at a server that allows blocking I/O to take place, then we know that concurrent I/O requests would cause several threads to block until the I/O is complete. While this is acceptable under lower load conditions, it becomes undesirable as the server gets overloaded, since it would highly limit the scalability of the server. Inefficient utilization is also an issue here, since the server is allowed to sit idle over I/O while the request queue is building up.

The basic alternative of allowing only non-blocking I/O to take place has limitations too. While this may significantly enhance utilization and improve performance, it introduces the overhead of notification that might not be well justified under low load. For instance, polling is a notification technique in which an I/O request, initiated by a server's thread, is regularly checked for completion by a polling thread. When the server is under lower load conditions, this I/O scenario becomes quite overwhelming. As a matter of fact, it would be more practical and reasonable to allow threads to block in such a case rather than introduce the overhead of asynchronous I/O.

This brings to attention the need for a Web server to adapt to different work scenarios under different load conditions in order to ensure the best possible performance. More precisely, the use of blocking I/O is very practical whenever incoming requests are within a Web server's capacity. Beyond the server's capacity, the use of non-blocking I/O increases scalability, giving better chances for higher performance, and utilizes server threads more fully and appropriately.
5. THE SELF-ADAPTING WEB SERVER ARCHITECTURE

The self-adapting Web server architecture is similar to the multi-threaded Web server model, with the distinction that it allows for two types of I/O: synchronous blocking and asynchronous non-blocking operations. Synchronous I/O takes less effort from the server itself, but limits both utilization and scalability. Asynchronous I/O comes with its own overhead, but is very useful at times when we need the best scalability and utilization of server resources. The criteria of adaptability can follow more than one pattern, as long as they include well-defined and easily detectable conditions. For example, in our implementation we use the presence of queued requests as an indicator of overload.

Figure 3. The Self-Adapting Web Server Work Scenario
In our implementation of the self-adapting server model, we used the following pools of threads, explained in [15]: a pool of 64 worker threads, and a pool of a single polling thread for detecting completed asynchronous I/O requests. While the use of asynchronous I/O is not very common in multi-threaded servers, it has many advantages. With synchronous I/O, the maximum number of requests that can be serviced at any time equals the number of threads. Using asynchronous I/O, however, worker threads are able to initiate I/O requests and move on to servicing other client requests, which greatly improves scalability. The default I/O scheme for our server is synchronous blocking, and the server only switches to using asynchronous I/O as load increases (see Figure 3).

As can be seen in Figure 3, client requests are expected to get assigned to worker threads almost instantly under lower load conditions, without having to wait for long periods of time in the queue. As the rate of incoming requests increases, fewer worker threads are free and we reach a point at which all worker threads are busy processing client requests. At this point, incoming requests start piling up in the queue of the thread pool; the server notices that and instructs worker threads to perform the next I/O requests as asynchronous non-blocking operations. This continues to be the case until all queued requests are assigned to worker threads, at which point the server may resume the default working scenario of using the synchronous I/O scheme.
Figure 4 shows a flowchart of the algorithm used to build our server.

Figure 4. Adaptability Algorithm in the Self-Adapting Web Server Architecture

The upper-most block in the diagram, labeled (a), represents the work done by the main thread. This thread is basically the server's dispatcher, which binds to the listening port and continuously assigns incoming requests to the queue of the thread pool. The block in the middle of the diagram, labeled (b), shows the work done by a worker thread. First, the worker thread dequeues a work item from the queue and starts processing it. When it reaches the point where it needs to perform I/O, the worker thread checks for the presence of outstanding requests in the queue; if there are none, it performs a blocking I/O operation. If the queue is not empty, however, that is a good indicator that all other worker threads are busy and that the server is about to be, or already is, overloaded. As a result, the worker thread initiates an asynchronous I/O request and enqueues a work item associated with it for the polling thread. That moves us to the third block in the diagram, labeled (c), which represents the work done by the polling thread. Once the polling thread dequeues this work item, it is able to identify the I/O request it is in charge of and check its status. If the I/O is found to be complete, the polling thread responds back to the client; otherwise, it enqueues the item back for later.
5.1. Implementation

The self-adapting Web server architecture employs both synchronous and asynchronous system calls for performing I/O. For synchronous I/O, it uses the system's read() and write(). For asynchronous I/O, it uses the Asynchronous I/O (AIO) set of library calls, which is included in the 2.6 Linux kernel. The main reason we elected to use this library is that it includes true non-blocking calls like aio_read(), which our worker threads use to initiate an asynchronous I/O operation before moving on to servicing another request.

The server was implemented in C, and the code was compiled using GCC.
6. PERFORMANCE EVALUATION

In order to evaluate the performance of the self-adapting Web server model, we compare it to two non-adaptive servers we developed. The three Web servers we developed differ only in the way I/O is performed, while everything else is kept the same. In addition, we compare our model to a similarly configured version of the Apache server, which represents the state of the art in Web servers. The goal of comparing against Apache is to provide a sense of how our servers perform relative to a fully optimized, production-ready server under the same workload.
6.1. Performance Metrics

Web server performance is usually measured in terms of both throughput and response time. The throughput of a Web server is given by:

Throughput = R / T    (1)

where R represents the number of successfully completed requests, and T represents the total time that was needed to complete them. Throughput provides a measure of how many requests a Web server can successfully complete in a unit of time. The other metric, response time, represents the time between the submission of a request and the beginning of the response, and is often measured in milliseconds.
6.2. Testing Environment

Besides the self-adapting Web server, which we explained in Section 5, we implemented a multi-threaded server that is blocking-I/O (BIO) based. This server is composed of a main thread and a pool of worker threads. The main thread binds itself to the listening port and then works as a dispatcher, assigning incoming client requests to worker threads, which are responsible for processing all requests to completion. This processing includes parsing the HTTP header, validating the request, performing blocking I/O requests to obtain the requested Web files, and sending the responses back to clients.
We also implemented a multi-threaded server that is asynchronous-I/O (AIO) based. In addition to the main thread and the pool of worker threads present in the BIO-based server, the AIO-based server has a dedicated polling thread whose role is to verify completion of initiated I/O requests. In this server, the main thread works similarly to the one in the BIO-based server, but the way worker threads work differs. The rule of thumb here is that no worker thread is allowed to perform I/O operations. As a result, when an I/O operation is needed, a worker thread only initiates it and is then freed up to process another client request. Of course, some other thread has to follow up on the I/O operation that has just been initiated, and this is the job of the polling thread. More precisely, worker threads use the aio_read() call to initiate an I/O read operation. Since this call is non-blocking, the thread is expected to return from it immediately. Just before it is allowed to go back to the pool, the worker thread enqueues a request for the polling thread to track the initiated I/O operation. After that, the polling thread dequeues the request and checks the status of its associated I/O operation. If the I/O is complete, the polling thread responds back to the client. Otherwise, it enqueues the request back to be re-checked later.
Apache is a well-known Web server that follows the original multi-threaded model, with blocking I/O operations performed as needed. In this experiment, Apache was reco
