throbber
Homayoun
`
`Reference 5
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2117, p. 1
`
`

`

`IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 8, AUGUST 1991
`
`963
`
`Design and Analysis of Master/Slave Multiprocessors
`
`Albert G. Greenberg, Member, IEEE, and Paul E. Wright
`
`system
`
`(kernel)
`
`user
`
`Fig. 1. Process modes.
`
`Abstract— In certain computer operating systems, including
`Unix,1 a process alternates between two modes: user mode and
`system mode. The process enters system mode on executing
`a system call, say, to do I/O, and returns to user mode on
`completion of the call. In master/slave multiprocessors, certain
`system calls, which we refer to as kernel calls, can execute only
`on the master processor. All other work can execute on the
`master processor or on any slave processor. As a result, the kernel
`calls from different processes are serialized, which (barring more
`fundamental changes to the operating system) is necessary since
`kernel calls from different processes are not independent; for
`example, the calls might update the same table.
`A simple model of master/slave processors is presented, along
`with two simple, practical scheduling algorithms. An approximate
`analysis of the model yields simple formulas for performance
`measures in terms of the hardware and workload parameters, and
`gives insight into the power and the limitations of master/slave
`systems. In particular, we obtain formulas for the maximal
`processing power (throughput) of the system, a quantity that
`remains bounded as the number of slave processors increases.
`Our analysis is also applicable to symmetric multiprocessors,
`where performance considerations such as cache performance,
`may dictate asymmetric assignment of system tasks to the pro­
`cessors.
`Index Terms—Master/slave, multiprocessors, performance eval­
`uation, scheduling.
`
`I. Introduction
`TN certain computer operating systems, including Unix, a
`1 process alternates between two modes: system mode and
`user mode. System mode is entered when the process makes
`a call to the operating system, say, to do I/O. A simple
`way to support several processes running concurrently on a
`multiprocessor, with minimal changes to the operating system
`itself, is to designate one processor as the master and the others
`as the slaves [2], [5]. A subset of the system calls, which we
`refer to as the kernel calls (See Fig. 1), can execute only on the
`master. When a process running on a slave processor makes a
`kernel call, the slave returns the process to the master, rather
`than service the call. As a result, the kernel calls are serialized,
`which (barring more fundamental changes to the operating
`system) is necessary since kernel calls from different processes
`are not independent; for example, the calls might update the
`same table. On the other hand, while not executing kernel calls,
`processes are independent, and can execute on any processor.
`We present a simple stochastic model of master/slave mul­
`tiprocessors, and two simple, practical scheduling algorithms.
`An approximate analysis of the model yields simple formulas
`for performance measures in terms of hardware and workload
`1 Unix® is a registered trademark of AT&T Bell Laboratories.
`Manuscript received April 4, 1989; revised October 17, 1990.
`The authors are with AT&T Bell Laboratories, Murray Hill, NJ 07974.
`IEEE Log Number 9101665.
`
`parameters. Monte Carlo simulation results show that the
`approximations are accurate. The results give insight into the
`power and the limitations of master/slave systems. In partic­
`ular, we obtain formulas for the maximal processing power
`(throughput) of the system. These parameters are possible to
`control in system design, though the control may entail difficult
`recoding of some of the kernel calls. Our results quantify the
`benefits of the recoding. Conversely, the results show that in
`many cases, without such recoding, it is fruitless to add more
`slave processors to the system.
`Goble and Marsh [5] appear to have been the first to
`argue that the master/slave approach offers a simple and
`cheap way to extend significantly the power of an exist­
`ing uniprocessor, in their case the VAX 11/780. A more
`recent example is the AT&T 3B2 1000 model 80, a Unix
`master/slave multiprocessor. Further motivation for the study
`of asymmetric multiprocessors stems somewhat surprisingly
`from symmetric multiprocessors, where all processors have
`the same capabilities. In practice, it is often desirable and
`sometimes necessary for performance reasons, to partition the
`operating system functions for these machines so that some
`functions are restricted to execute only on certain processors.
`Doing so can improve the processors’ cache performance,
`which is critical to overall system performance. If all calls
`to a service are restricted to execute on only one processor,
`the corresponding code and data may be partly captured and
`held in this processor’s cache, causing the code to run faster
`and generate less traffic between the processors and main
`memory. For example, assigning one processor to handle
`interrupts from a given peripheral device not only simplifies
`the interrupt handling routine, but also may improve its
`performance because the appropriate code and data are more
`likely to persist in cache between interrupts. The model
`considered here, however, makes it clear that this approach
`is not without pitfalls.
`Our analytic approach is well-known in probabilistic mod­
`eling as fixed point analysis [3], [4], [6], [9], [14] or the
`fixed population mean method [16]. Often, a fixed point
`approach culminates in a numerical search for solutions, with
`no guarantee that the solutions are unique. In this paper, the
`approach yields unique, simple closed form solutions.
`We believe that this work reveals the essential poten­
`tials and limitations imposed by scheduling constraints on a
`0018-9340/91/0800-0963$01.00 © 1991 IEEE
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2117, p. 2
`
`

`

`964
`
`master/slave multiprocessor architecture. However, the model
`presented here, trades off some realism for simplicity:
`• We assume that the multiprocessor supports shared mem­
`ory, and do not model bandwidth limitations between
`processors and memory. In practice, this bandwidth limits
`the number of processors the multiprocessor can support
`at a high level of utilization per processor. Our model and
`its analysis expose additional limitations (if any) arising
`from scheduling constraints and workload parameters.
`• We assume that the central random variables of interest,
`such as job lengths, are exponentially distributed and
`that there is just one class of job in the system. In
`particular, we assume that each kernel model demand a
`job makes is exponentially distributed. A cursory glance
`at measurement data collected during the development
`of the AT&T 3B2 1000 Model 80 shows that this expo­
`nential assumption is not unreasonable, although a closer
`examination reveals that the tail does indeed contain
`slightly too much mass.
`In an influential paper, Leland and Ott published mea­
`surement data on Unix job characteristics [12]. Leland and
`Ott considered a Unix job’s total CPU service demand
`and found that if the probability distribution function that
`best fit their data was of the form F(x) = l—rx~c, and an
`exponential distribution gave a much poorer fit. Krueger
`and Livny [11] later found that a close fit to the Leland
`and Ott data is obtained by using a hyperexponential
`distribution, specifically, a mixture of three exponential
`distributions. This is not surprising since the Leland and
`Ott workload data showed distinct job classes: CPU hogs,
`disk hogs, and the remaining short jobs.
`In short, while it is difficult to fit a single exponential
`distribution to a multiclass Unix workload, a mixture
`of exponentials gives an adequate fit. In this paper, we
`consider an analytic closed model where there is a single
`class of jobs that are never replaced so the total CPU
`demand is infinite. However, as in [12] and [11] if it
`is deemed crucial to model a multiclass workload, we
`recommend adapting the analysis to hyperexponential dis­
`tributions. Our analysis is useful in the design phase to see
`how workload parameters impact system performance.
`Though the analytic results presented here do not address
`multiclass workloads, the model described in Section II
`is easily extended to capture multiple classes (replacing
`the exponential distributions with hyperexponential dis­
`tributions), which can then be studied via simulation and
`via our analytic approach as described in Section VII.
`• The exponential assumptions lead to simple formulas
`that provide useful guides to practical design of master/
`slave systems. The formulas identify saturation condi­
`tions, which must be confronted to obtain a balanced
`design. In particular, the formulas give the number of
`slave processors that can be usefully supported, as a
`function of the workload parameters.
`The scheduling algorithms considered here arose during
`the design and implementation of the AT&T 3B2/1000
`model 80, a Unix master/slave multiprocessor. Measure­
`ment of the model’s parameters revealed that some kernel
`
`IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 8, AUGUST 1991
`
`call recoding would be essential. This gave focus to
`the development, and significant performance gains were
`noted during the recoding, which our model and analysis
`helped to explain.
`Some of the formulas derived follow just from flow
`balance conditions, and do not depend on the exponential
`assumptions. In Section VI, we discuss a simple way,
`suggested by our analysis, to extract a key performance
`statistic essentially via measurement alone. Moreover,
`the analysis can be adapted to handle other distribu­
`tions (including deterministic distributions), at the cost
`of replacing the formulas with numerical solutions.
`• Last, we believe that our modeling approach and analysis
`can be generalized to study scenarios where there is no
`one master processor, but a system function may be
`restricted to execute only on one processor, or only on
`a subset of the processors. This would lead to better
`understanding of asymmetric uses of multiprocessors with
`symmetric capabilities, alluded to above.
`
`II. The Model
`We assume that there is one master processor, P >1 slave
`processors, and a fixed number of jobs in the system, N >1.
`The slave processors run continuously at unit speed. Let
`speed of master processor
`P = speed of slave processor
`The assignment of jobs to processors depends on the sched­
`uling algorithm.
`Our model of a job is simple. We lump system mode work
`that can be executed on a slave processor (reentrant system
`calls) with real user mode work and refer to all such work as
`user mode work. Henceforth, no mention is made of system
`mode. In our model, a job alternates between user and kernel
`mode. On entering user mode, the job demands to be served
`in user mode for a random period of time. We call this period
`of time a holding time. After obtaining that service, during
`sojourns on master or slave processors, the job switches to
`kernel mode and demands service in that mode for another
`random period of time. This demand can be satisfied only
`during sojourns on the master. After the demand is satisfied,
`the job switches back to user mode. Note that the term holding
`time refers to the random service demand for service in a
`given mode, not the time interval during which this service
`is delivered.
`The key parameters of our job model are the holding times
`in kernel and user mode. The holding time in a given mode
`is defined as a random amount of time that a job requires
`service in this mode before switching to the other mode. Thus,
`a holding time is a demand for CPU service, which is satisfied
`over a number of sojourns on the processors. We assume that
`the holding times are independent, exponentially distributed
`random variables, with means
`Tu = L( length of user mode period)
`
`and
`
`Tk = //(length of kernel mode period).
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2117, p. 3
`
`

`

`GREENBERG AND WRIGHT: MASTER/SLAVE MULTIPROCESSORS
`
`We assume that each time that a job is started on a processor,
`a context switch cost is paid. Specifically, the job demands
`service for a random period with mean
`Tc = /^(length of context switch period)
`before demanding the service associated with its mode. We
`assume the context switch periods are independent and ex­
`ponentially distributed. We emphasize that this is a modeling
`assumption only and is done for the purposes of making the
`analysis tractable. Obviously, the simple model of a job studied
`here is an idealization, but it contains sufficient structure to
`expose scheduling difficulties intrinsic to master/slave systems.
`Interval Timeouts: An integral part of all timesharing sys­
`tems, such as Unix, is an interval timer, which periodically
`generates an interrupt signal which is simultaneously sent to
`all processors [2], The interval timeout period is the time
`quantum used for timesharing. In our model, we assume
`that the sequence of interval timeout periods are independent,
`exponentially distributed random variables, with mean
`Tj = £ (length of interval timeout period).
`In any real system, the interval timeout period has determin­
`istic, rather than exponentially distributed length. However,
`owing to the random mixing arising from our other stochastic
`assumptions, the results do not appear to be very sensitive to
`the shape of this distribution. We have conducted experiments
`in which both the context switch cost and the interval timeout
`are deterministic. These experiments indicate that the trends
`predicted by the exponential assumptions are correct. In ad­
`dition, the values of the observed performance measures are
`in close agreement with those predicted by our analysis. This
`issue is taken up again in Section V.
`Scheduling Opportunities: There are exactly two classes of
`events, which we term scheduling opportunities, which can
`trigger a processor to interrupt one job to run another:
`• when its currently running job switches from user to
`kernel mode or from kernel to user mode, and
`• when the system’s interval timer goes off.
`In Unix systems, a switch from user to kernel mode is signaled
`by an exception or illegal instruction trap, which switches
`control to an interrupt handler. The interrupt handler can take
`scheduling actions just before and just after starting the kernel
`mode service.
`Parameter Ranges: Reasonable parameters ranges are Tc =
`0.4 ms, Tj = 10 ms, Tk
`0.1 to 2 ms, and Tu — 1 to
`100 ms. In practice, the interval timeout period Tj is difficult
`to change, because a large number of periodic system tasks are
`set to be performed on fixed multiples of this period. The wide
`range for Tu reflects the range in workloads. Roughly, high
`Tu workloads consist of computationally intensive jobs, and
`low Tu workloads consist of interactive or I/O intensive jobs.
`To achieve a high level of performance, we would like Tk
`to be small and Tu to be large. In most systems, Tk is the
`cost of a single kernel call, and is difficult to reduce. However,
`Tu can be increased by recoding kernel calls so that they can
`execute on the slave processors. (Recall that Tu lumps real
`user mode work with system calls executable on the slave
`
`965
`
`processors.) Typically, recoding a kernel call is difficult since
`data structures and algorithms have to be reworked so that
`several instances of the call can proceed concurrently. Our
`analysis shows how large Tu must be increased to reach a
`given level of performance.
`
`3)
`
`III. Scheduling Algorithms
`In this section we present two scheduling algorithms: kickto
`and sampler. Roughly, kickto moves jobs from master to slave
`as fast as possible. The idea is to prevent starvation of the
`slaves, though if jobs switch between user and kernel modes
`rapidly kickto may waste an excessive number of cycles due
`to context switching. The idea behind the sampler algorithm
`is to reduce this danger significantly by moving a job from
`master to slave only if the master finds the job in user mode
`on interval timeout. The drawback for sampler is the potential
`for underutilizing the slave processors.
`The kickto and sampler algorithms have the following
`properties in common.
`Priority is given to jobs in kernel mode. That is, when
`1)
`the master is in position to start a new job, it always
`chooses one in kernel mode if there is one. Though
`this makes good sense from a performance viewpoint,
`it is desirable for more basic reasons. As described in
`[2], jobs in kernel mode are likely to hold scant system
`resources, so Unix systems assign such jobs high priority
`to free those resources as soon as possible.
`2) Jobs not running on some processor wait in one of two
`queues: either the master queue or the slave queue. Jobs
`in the master queue are all in kernel mode and jobs in
`the slave queue are all in user mode. A slave processor
`can take jobs from the slave queue only. The master
`processor can take jobs from either queue.
`When a job running on a slave processor switches to
`kernel mode, the processor stops running the job and
`places it on the master queue. Next, if the slave queue
`is not empty, the processor takes a job from the slave
`queue and runs it. Otherwise, the processor idles until
`a job arrives in the slave queue, whereupon it takes the
`job and runs it.
`Whenever the master processor is in position to select
`a job to run, if the master queue is not empty, then
`the master processor chooses a job from that queue. If
`the master queue is empty but the slave queue is not,
`then the master processor chooses (steals) a job from
`the slave queue.
`At every interval timeout, every processor stops its
`currently running job, and attempts to obtain another,
`whereupon the processor first holds for a context switch
`period and then runs the job. Thus, in our model the
`context switch cost is charged even if it turns out that
`the processor obtains the same job as it had before the
`interval timeout.
`We insist on service interruption and context switch­
`ing on every interval timeout in order to model the
`overhead of timesharing, though this may be slightly
`pessimistic. In practice, the timeout may trigger system
`
`4)
`
`5)
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2117, p. 4
`
`

`

`IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 8, AUGUST 1991
`
`IV. Performance Analysis
`In this section, we present an approximate steady-state anal­
`ysis of the model, for a workload determined by parameters
`Tu and TK. The analysis provides simple formulas for the
`processor utilizations and the average queue lengths. (Indeed,
`the queue length distribution and higher queue length mo­
`ments are readily computable.) The formulas reveal important
`differences between sampler and kickto, which we bring out
`numerically in Section V. Moreover, the formulas allow us
`to determine precisely in terms of the parameters where the
`bottleneck lies: whether it is the master queue or the slave
`queue that saturates as the number of jobs increases. In
`particular, we find exactly how large the number of slave
`processors P must be, with all other parameters held fixed,
`before the system becomes “master-limited,” meaning the
`master queue is overloaded and adding more slave processors
`is fruitless.
`In order to streamline the statement of the results and avoid
`unduly complicated formulas, we shall first state the results
`obtained assuming Tc = 0 (Section IV-A), then describe how
`to scale the parameters and results to account for context
`switching (Section IV-C). The error made by this scaling is
`essentially that of replacing a two stage Erlang distribution (the
`first stage models holding for an exponential period with mean
`Tc and the second holding for another exponential period
`with mean depending on Tu, Tk, Tj) with an exponential
`distribution having the same mean. There is no difficulty in
`handling the two stage Erlang distribution in our analysis, but
`it leads to more complicated formulas without compensating
`benefit.
`Our analysis is approximate and consists of analyzing the
`master and slave subsystems separately as if each receives
`an external stream of jobs at unknown Poisson rate A, which
`depart after receiving service. We solve for the average number
`of jobs QS(A) and Qm{X) in the slave and master subsystems,
`observed at a random moment in (continuous) time in steady
`state. As to be expected, QJ. X) and <5m(A) increase as A > 0
`increases and diverge at finite (saturation) values of A. We
`determine A as the smallest positive root Ajv of the “fixed
`point” equation
`
`©(A) + Qm(A) — N,
`
`which states that the average number of jobs in the two
`(approximate) subsystems must equal the given number of
`jobs N in the real system. We also derive simple closed
`forms for the equilibrium distribution of the number of jobs in
`each subsystem as functions of A a-. Thus, having determined
`Ajv, we can derive simple closed forms for essentially any
`performance measure.
`At steady state, the (unknown) rate at which jobs flow from
`master to slaves must equal the rate in the reverse direction.
`This motivates analyzing each subsystem as if each receives
`jobs at the same rate A. However, it does not follow from
`the assumptions of Section II that the flow is Poisson and is
`independent of the detailed state of the other subsystem. This
`is an approximation.
`
`Slave
`queue
`
`c I I I U U
`
`S
`
`0
`
`966
`
`Master queue
`
`K
`
`KK
`

`
`s
`Fig. 2. Flow of jobs in a master/slave multiprocessor.
`
`maintenance tasks that cause an overhead to be felt even
`when the result of the timeout is to reschedule the job
`that was running before the timeout.
`The flow of jobs in the system is illustrated in Fig. 2. The
`mechanism for selecting one job from a queue holding at least
`one job need not be modeled, since our performance estimates
`deal only with the lengths of the queues.
`
`A. Kickto
`In kickto, whenever the currently running job on the master
`processor goes into user mode, if the master queue is not
`empty, then the master processor stops the job, places it in
`the slave queue and runs another, selected from the master
`queue. Thus, the master processor kicks jobs out to the slave
`queue as soon as possible. If the master queue is empty when
`the currently running job enters user mode, then the master
`processor continues to run the job. If a job subsequently
`migrates from a slave processor to the master processor (to
`execute in kernel mode) while the user mode job is still
`running on the master, the user mode is preempted and placed
`in slave queue and the (newly arrived) kernel mode job is run
`on the master.
`
`B. Sampler
`If jobs switch modes rapidly (Tu and Tk are on the order
`of Tc), then under kickto the master processor is swamped
`by context switching. The following algorithm, sampler, tries
`to overcome this problem by not moving jobs to the slave
`queue as readily. Under sampler, the master processor takes
`scheduling actions only on interval timeouts. At an interval
`timeout, if the master queue is not empty and the master
`processor finds its currently running job in user mode, then
`it stops the job, places it in the slave queue, and runs another
`selected from the master queue. Thus, the master processor
`samples the mode of its current job at all interval timeouts.
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2117, p. 5
`
`

`

`GREENBERG AND WRIGHT: MASTER/SLAVE MULTIPROCESSORS
`
`The close agreement between the analytic results obtained
`via these approximations and simulation results show that this
`error is small. (Of course, the simulation results are for the
`model of Section II, without the approximations introduced
`here.)
`
`A. Analysis I: No Context Switch Cost
`The utilization of a processor is the steady-state average
`proportion of time the processor spends servicing user or
`kernel model demands, as opposed to idling. Letting Um
`denote the utilization of the master processor and Us the
`utilization of a given slave processor, define the system’s
`processing power to be
`W = Um + PUS
`which is proportional to the steady-state rate at which the
`system executes instructions. Let Qm and Qs denote the
`number of jobs (queued and in service) at the master and at
`the slaves, respectively.
`Suppose that there is no context switch cost, so Tc = 0.
`When the master queue is empty and the master moves its
`currently running job to the slave queue, it steals the job
`back and restarts it immediately without any context switch
`overhead. It is as if the job never left service. It follows that
`the master processor is never idle, so
`Um = l.
`A simple drift argument (given below for sampler) shows
`that
`
`(4.1)
`
`X* =
`
`if using sampler,
`pTu
`TvTK+nTiiTu+TK)'
`if using kickto
`Tk
`is the maximal throughput of the master subsystem. It follows
`from (4.1) that A* is always smaller for sampler than for kickto.
`A renewal theoretic argument shows that A* = [i/Tk holds for
`kickto regardless of the distributions of the random variables.
`The justification of (4.1) for sampler is as follows. Assume
`that the master queue contains an infinite number of jobs. Thus,
`the processor is never starved for work, and outputs jobs at
`the highest possible rate in this case. Let us regard the master
`as being in either user mode or kernel mode according to the
`mode of the job currently being executed. Under sampler, this
`“state” of the master processor evolves approximately as a
`Markov chain with two states: kernel and user. Let tt” and 7rfc
`denote the equilibrium distribution of this chain. Since the job
`in service goes from user to kernel mode on service completion
`(rate n/Ty) or interval timeout (rate l/I)), and from kernel
`to user mode on service completion (rate ^/Tk), the balance
`equation is
`
`7TU I
`= 77^
`^
`\Tu
`Tk
`Tj
`which has the normalized solution, i.e., tt11 + nk = 1,
`b-TuTi
`^.U __
`TT =
`TuTk + pTi(Tu + Tk)
`
`(4.2)
`
`(4.3)
`
`967
`
`(4.4)
`
`7Tfc =
`
`Tk{Tu +
`TuTk + nTi{Tu + Tk)
`Jobs depart only if caught in user mode on an interval timeout.
`Thus, the maximal throughput is txu/Tj. Unraveled, this is the
`expression for A*.
`The detailed analysis of the slave subsystem is classical, and
`is given first. The analysis of the master subsystem depends
`on the scheduling algorithm. We treat sampler first and then
`kickto.
`1) Slave Subsystem in Isolation: Assuming that jobs arrive
`to the slave queue at Poisson rate A and depart after receiving
`service (at exponential rate l/Tu), the slave subsystem is a
`classical M/M/P system [10]. Letting Xs denote the number
`of jobs present at equilibrium, we obtain [10]
`
`Pr[Xs = k\ =
`
`I{XTu,P)(^)%
`
`0 < fe < P,
`k> P.
`
`(4.5)
`
`The mean population in the system is
`(\TV)P I(XTu,P)
`(P - 1)!
`(P- XTuf
`
`Qs(X) = E[Xs] = XTu 1 +
`
`(4.6)
`
`where I(X, P) is given by (4.28). It is intuitively obvious and
`is easily demonstrated that Qm(A) increases with A > 0.
`2) Master Subsystem in Isolation—Sampler: In this section,
`we analyze a Markovian model of the master processor in
`isolation, assuming the scheduling algorithm is sampler.
`We describe the state of the master processor by two
`variables: Xm, the total number of jobs present at the master
`(queued and in service), and an “internal” state Ym, which
`assumes the values “u” or “k” according to whether the job
`presently executing on the master is in user or kernel mode.
`The state space consists of all pairs (Xm,Ym) where Ym is
`either “u” or “k” and Xrn is a (strictly) positive integer. (Under
`sampler, the master is never idle.) We assume jobs arrive in
`kernel mode at Poisson rate A. Interval timeouts also “arrive”
`as a Poisson stream, but at rate 1/Tj. Since we assume there is
`no context switch cost here {Tc = 0), a job may be regarded as
`a two-state Markov chain, alternating between user and kernel
`modes. Taking into account the speed of the master processor,
`the mean holding times in user and kernel modes for a job
`executing on the master are 7},- ///, and Ty /fi, respectively.
`The transition diagram for the Markov chain (2fm,ym) is
`illustrated in Fig. 3. The states (Xm,u) constitute the top row,
`while the states {Xm,k) form the bottom row. Transitions
`from right to left (rate A) correspond to job arrivals, vertical
`transitions (rates h/Tk and p/Tu) to mode switches of the
`currently running job, and diagonal transitions (rate 1/T/) to
`job transfers on interval timeout. An analytical proof of the
`criterion A < A*, for the positive recurrence of the Markov
`chain (Xm,Ym), is given in the Appendix.
`Now, let 7r“, nf denote the equilibrium probability that there
`are i jobs at the master and its internal state is “u” or “k,”
`respectively. Let TTi be the two-dimensional row vector with
`entries tt, = [tt’A Trf], i > 1. The balance equations for the
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2117, p. 6
`
`

`

`968
`
`k :
`Xm :
`
`o A_
`
`*
`
`^7
`I
`
`IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 8, AUGUST 1991
`
`nr
`
`A_
`
`^7
`I
`
`X_
`
`I
`
`X_
`
`ft
`
`ft
`
`^7
`
`T
`
`• • •
`
`etc.
`• • •
`
`2
`3
`5
`1
`4
`Fig. 3. State transition diagram for master in analysis of sampler.
`
`master determined by Fig. 3 may be written in the form
`0 = 7T;_1 A + TTiB + 7ri+1C,
`i > 2,
`
`(4.7)
`
`The mean number of jobs on the master is found from
`Qm(A) = E[Xm]
`oc
`y^TTt • i
`i=l
`= TTl
`
`oo
`
`i=l
`= TTi(I — R)~21
`(4.16)
`where 1 denotes the two-dimensional column vector of all
`ones. It follows after some algebra that
`X2TvT2k - XiiTyTk + h2{Tv + Tk)
`^{Tu + T^il-X/Xft
`
`•
`
`(4-17)
`
`Qm{X) =
`
`It is intuitively evident that Qm (A) is an increasing function
`of 0 < A < A*. A proof is given in the Appendix.
`3) Master Subsystem in Isolation—Kickto: In this section,
`we analyze a Markovian model of the master processor in
`isolation, assuming the scheduling algorithm is kickto.
`Jobs in kernel mode are assumed to arrive at Poisson rate
`A. The master processor is never idle. If the number of jobs
`present is two or more, then the job in service must be in kernel
`mode, and so receives service at exponential rate (jl/Tk. If
`exactly one job is present, then that job may be either in kernel
`mode receiving service at exponential rate h/Tk or in user
`mode receiving service at exponential rate fi/Tu- If exactly
`one job is present, the current service is user mode, and a
`new job arrives (necessarily in kernel mode), then the current
`job is transferred out and the new job enters kernel mode
`service. We describe the state of the system in equilibrium by
`the nonnegative random variable
`if one job present and serving user mode,
`0,
`if /,; > 1 jobs present and serving kernel
`Ym = \k,
`mode.
`The Markov chain Ym evolves as a simple birth/death process,
`where if k > 1, the transition Ym : A; —» fc + 1 occurs at rate
`A, the transition Yrn : k k — 1 occurs at rate h/Tk, and if
`k = 0, Ym : 0 —> 1 occurs at rate A + ii/Tj;. It is clear that
`the system is stable if and only if
`A < A* =
`
`(4.18)
`
`Tk
`The transition diagram is illustrated in Fig. 4.
`The number of jobs on the master (in user or kernel mode),
`is given by the random variable
`Xm — tH(ix\Ym ^ 1}.
`
`(4.19)
`
`B =
`
`D =
`
`~
`T,
`
`_E_
`Tu
`
`JL
`Tu
`
`Tk .
`
`0 = 7T i ID + 7r2C,
`(4.8)
`where matrices A, B, C, and D account for the transition
`rates into and out of Xm = i,
`A 0
`-A -
`A =
`2.
`0 A
`Tk
`-X-rft
`c- 0
`T,
`Tu
`0 ’
`Tk
`These equations may be solved by setting
`, J > 1
`Tti = TTiiZ
`where R satisfies the matrix equation
`0 = A + RB + R2C.
`(4.10)
`The vector tti is found by solving the boundary equation
`0 = tti [£> + RC].
`(4.11)
`It turns out this matrix geometric solution [15] may be given
`explicitly. Furthermore, (4.11) expresses tti as the stationary
`vector of a two-state Markov chain. We defer the details of
`the solution to the Appendix and instead just state the final
`answer here:
`
`i-l
`
`(4.9)
`
`R=XTt
`
`1
`1
`
`f XT + 2k.
`+ Tu
`+
`+ pTi
`( XT
`)Ik.
`Ik
`Tu
`
`TTl = 7
`
`Tk TuY
`
`1
`
`_ TuTk
`A
`X*JTu + Tk'
`
`(4.12)
`
`(4.13)
`
`(4.14)
`
`The matrix R has two positive real eigenvalues. Let o(R)
`denote its spectral radius. It is shown in the Appendix that
`o(R) < 1 o A < A*.
`(4.15)
`The condition a(R) < 1 is necessary and sufficient for the
`summability of the sequence of vectors 77, i > 1.
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2117, p. 7
`
`

`

`GREENBERG AND WRIGHT: MASTER/SLAVE MULTIPROCESSORSO#o=o=o#o etc.
`
`Vrn : o
`
`2
`3
`4
`1
`Fig. 4. State transition diagram for master in analysis of kickto.
`
`We readily find that
`
`Pr[ym = k\
`
`k— 1
`
`Tu(^-XTk)
`L h(Tu+Tk)
`
`J m Tk(m-ATk)(m+AT[7)
`
`hHTu+Tk)
`
`The mean population is found to be
`
`Qm(A) = E[Xm}
`\2T2kTu - XiiTuTk + h2(Tu + Tk)
`li2{Tu + Tk) (1 — A/A*)
`
`We note that the formula for the mean population has the
`same form as in the case of sampler. Note, however, th

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket