Sequentiality and Prefetching in Database Systems

ALAN JAY SMITH

University of California-Berkeley
Sequentiality of access is an inherent characteristic of many database systems. We use this observation to develop an algorithm which selectively prefetches data blocks ahead of the point of reference. The number of blocks prefetched is chosen by using the empirical run length distribution and conditioning on the observed number of sequential block references immediately preceding reference to the current block. The optimal number of blocks to prefetch is estimated as a function of a number of "costs," including the cost of accessing a block not resident in the buffer (a miss), the cost of fetching additional data blocks at fault times, and the cost of fetching blocks that are never referenced. We estimate this latter cost, described as memory pollution, in two ways. We consider the treatment (in the replacement algorithm) of prefetched blocks, whether they are treated as referenced or not, and find that it makes very little difference. Trace data taken from an operational IMS database system is analyzed and the results are presented. We show how to determine optimal block sizes. We find that anticipatory fetching of data can lead to significant improvements in system operation.

Key Words and Phrases: prefetching, database systems, paging, buffer management, sequentiality, dynamic programming, IMS

CR Categories: 3.73, 3.74, 4.33, 4.34
1. INTRODUCTION
Database systems are designed to insert, delete, retrieve, update, and analyze information stored in the database. Since every query or modification to the database will access at least one target datum, and since the analysis associated with a query is usually trivial, the efficiency of most database systems is heavily dependent on the number of I/O operations actually required to query or modify and the overhead associated with each operation.

One method used in some systems to reduce the frequency of I/O operations is to maintain in a main memory buffer pool a number of the blocks of the database. Data accesses satisfied by blocks found in this buffer will take place much more quickly and usually with much less computational overhead.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
The author was supported during the bulk of this research as a visitor at the IBM San Jose Research Laboratory. Partial support was later provided by the National Science Foundation under Grant MCS-75-06768. Some computer time was furnished by the Energy Research and Development Administration under Contract E(04-3)515.
Author's address: Department of Electrical Engineering and Computer Sciences and the Electronics Research Laboratory, University of California-Berkeley, Berkeley, CA 94720.
© 1978 ACM 0362-5915/78/0900-0223 $00.75
ACM Transactions on Database Systems, Vol. 3, No. 3, September 1978, Pages 223-247.
IMS (Information Management System/360) [10-12] uses this strategy, as we discuss later. The INGRES database system [9] makes use of the I/O buffer pool maintained by UNIX [19] for this same purpose. There is considerable scope for optimizing the operation of the buffer pool, principally by controlling the selection of the blocks to be kept in the pool and the times at which blocks are fetched.
A. Sequentiality

A characteristic of the database system that we study in this paper, and we believe a characteristic of many other systems, is sequentiality of access. Many queries require scans of an entire database in order to compute an aggregate. An example might be: "Find the average salary of all residents of New York City," which would probably be calculated by scanning sequentially over the appropriate section of the database. In some systems a simple lookup evokes a sequential scan of the entire database, if the database is not indexed on the key, or a partial scan, if the index is not precise. For example, a query about "John Jones" might be answered by using an index to find the beginning of the records for names starting with "J," and then searching forward. "Range" queries, in which the database is searched for records with keys in a range, will also result in sequential accesses, either to the records themselves or at least to an index, if the index exists. If the physical organization of the data differs significantly from the logical one, then little sequentiality can be expected.
A consistently sequential pattern of access will allow us to easily anticipate which data blocks, segments, or tuples are likely to be accessed next and to fetch them before their use is required or requested. In systems in which anticipatory data fetching is less costly than demand fetching, we can expect a decrease in the cost of I/O activity. It is obvious, and under certain conditions has been shown formally [1], that when multiple block or anticipatory fetching is no "cheaper" per block than demand fetching, demand fetching is an optimal policy. We contend, and we shall discuss this in greater detail in a later section of this paper, that fetching segments or blocks in advance of their use, and in particular fetching several segments or blocks at once, is significantly less costly per block fetched (with our cost functions) than individual demand fetching.
B. Previous Research
[2, 8, 25-28]. Sequential prefetching of lines for cache memory has been shown to work very well [24] because the extent of sequentiality is large compared to the cache page (line) size; for most programs the sequentiality is not large compared to the main memory page size. Prefetching of I/O data streams has also worked well [22, 23] because of the frequent use of purely sequential files.
C. Information Management System/360
The previous research regarding prefetching which is discussed above has all been oriented toward the analysis and use of data describing real system operation. We shall do likewise, and in later sections of this paper we present data taken from an operational and long running Information Management System/360 (IMS), a data management system marketed by the IBM Corporation. This particular system has been in operation for a number of years in an industrial environment. The results observed are believed to be illustrative of an actual application, but they cannot be taken as typical or representative for any other IMS system. Despite this disclaimer, and the fact that our research is based on data taken from a specific installation running a specific database system, we believe that sequentiality is a characteristic common to many database systems, and that therefore the methodology developed in this paper for sequential prefetching will also be applicable to other IMS installations and to other database systems.

In order to provide some background for the analysis of our data, we describe below some of the significant features of IMS. A readable and more complete discussion can be found in a book by Date [5], which also discusses other database systems and organizations. The IMS manuals [10-12] provide much more thorough coverage.

IMS is a hierarchical database system; that is, segments (tuples) in a database are arranged logically in a tree structure such as that indicated in Figure 1.
Fig. 1. [An example IMS hierarchical database: a College root segment with dependent segments for Departments, Chairman, Room, Instructor, and Students.]
An IMS implementation may consist of several databases, each of which consists of a large number of such "trees." The roots may or may not be sorted and may or may not be indexed. In the system measured there were several databases; each had the roots indexed. Access to units within a tree is through the root. Within the tree, data is stored in a top-down, left-to-right order. That is, the following items from Figure 1 would be stored in this order: Engineering, Mechanical Engineering, Brown, M24 Basket Weaving, 39-311, 14:05, Prof. Smith, Finch C 7614 Junior, Jayhawk L 8302 Senior, Robin F 1104 Senior, M23 Floatation, ..., and so on.
Two physical storage organizations are possible, hierarchical sequential and hierarchical direct. In the former case segments are examined consecutively in the order in which they are stored in order to find some position in the tree. In the latter case, which describes the databases in this system, certain segment instances are related by two-way pointers and these pointers can be used to skip over intervening data. In either case the data is accessed using DL/1 (Data Language/1), which implements nine operations. These are: GET UNIQUE (find a specific item), GET NEXT (get the next item), GET NEXT WITHIN PARENT (get the next item under the same root), GET HOLD UNIQUE, GET HOLD NEXT, GET HOLD WITHIN PARENT, INSERT, REPLACE, and DELETE. In each of the first six cases, the command can or must be qualified to partially or completely identify the tuple in question. GET HOLD (items 4, 5, and 6) is usually used prior to INSERT, DELETE, and REPLACE. Eighty-one percent of all accesses measured (see below) proved to be for lookup only (GET) rather than modification (INSERT, REPLACE, DELETE). Average frequencies for INSERT, REPLACE, and DELETE were 11 percent, 0.7 percent, and 8 percent, respectively, with substantial variation between databases.
A search for a uniquely qualified item, such as GET UNIQUE (College(Engineering).Department(Mechanical).Chairman), will involve a search starting at the appropriate root and either scanning in sequential storage order, if the storage organization is sequential, or following pointers, if the organization is direct. Each of the segments referenced in the course of finding the target segment is called a path segment. A GET NEXT will also involve either sequential search or search following pointers. Unless the pointers lead immediately to the desired item, a significant amount of sequentiality will be evident in both cases; in following a path to the target the direction of search is always forward in the database.

A segment can be referenced only if it is in memory. For several efficiency reasons, IMS groups segments into fixed size storage units called physical blocks (henceforth called blocks). In the particular implementation in question the block size was 1690 bytes for data blocks and 3465 bytes for blocks used to hold index entries for the root nodes. The blocks are the unit of physical storage and data transmission; a request for a segment will result in the block containing the segment being transmitted, if necessary, from secondary storage to a buffer pool of blocks kept in main memory.

The search for a target segment proceeds as follows:

i. Determine the first path segment to be examined.

ii. Search the buffer pool for the block containing the segment. If the block is missing, then fetch the block (and remove a block if necessary to make room).
iii. Find the segment within the block. If this is the target segment, then perform the desired operation. Otherwise, determine the next path segment and continue at step ii.
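As an illustration of this loop, here is a minimal runnable Python sketch; it is ours, not IMS code, and it simplifies matters by taking the path as a precomputed list of segment numbers, mapping segments to blocks with a supplied function, and using LRU replacement for the buffer pool (the paper takes up replacement algorithms later).

```python
from collections import OrderedDict

def get_segment(path, block_of, pool, capacity):
    """Follow the path segments to the target (the last entry), fetching the
    containing block on each buffer pool miss; returns the number of fetches."""
    fetches = 0
    for seg in path:                    # step i: start at the first path segment
        blk = block_of(seg)
        if blk in pool:                 # step ii: search the buffer pool
            pool.move_to_end(blk)       # a hit refreshes the LRU position
        else:
            fetches += 1                # miss: fetch the block from storage
            pool[blk] = True
            if len(pool) > capacity:    # remove a block to make room
                pool.popitem(last=False)
        # step iii happens implicitly: we stop after the last path segment
    return fetches

# Segments packed 20 per block (cf. 80-byte segments in 1690-byte blocks):
pool = OrderedDict()
print(get_segment(range(60), lambda s: s // 20, pool, capacity=4))  # -> 3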
We note that segments are commonly much smaller than blocks; for our data the average segment size was 80 bytes. A large block size will result in many segments being fetched at once; because of the sequential nature of segment ...

... was used without change, except that the reloading ensured that the logical and physical database organizations coincided. The database system used was IMS/360 version 2.4, running under OS/VS2, release 1.6. The total size of the entire database was about 200 megabytes.

Our data analysis, in later sections, uses three sections of the entire seven-day block reference trace. The part labeled "full tape" is the trace for the first day. This first-day tape actually was generated in seven sections of approximately equal size; the parts of the trace referred to as "part 1" and "part 2" are just the first two segments of this day-long trace.
This data was gathered by researchers at the IBM San Jose Research Laboratory; for further discussion of IMS, the data gathering methodology, and the data itself, see the papers by Tuel and Rodriguez-Rosell [30], Rodriguez-Rosell and Hildebrand [21], Ragaz and Rodriguez-Rosell [18], Rodriguez-Rosell [20], Gaver, Lavenberg, and Price [7], Lavenberg and Shedler [14], Lewis and Shedler [16], Lavenberg and Shedler [15], and Tuel [29].
2. OPTIMIZATION OF PREFETCHING

A. The Model
As was discussed in the beginning of this paper, the efficiency of operation of a database system can be increased by increasing the fraction of data references captured by the main memory buffer. The only general method known by the author to be effective in increasing the buffer hit ratio (for a fixed size buffer) is to take advantage of sequentiality in the series of accesses, either by prefetching or by increasing the block size. This issue is discussed further in the data analysis section of this paper; for the formulation of our optimization results we rely on a model as follows:

i. The physical layout of the database is the linear projection (as defined earlier) of the logical structure of the database. Segments in the database will be assumed to be numbered consecutively from 1 to S. Blocks in the database
(physical groupings of adjacent segments) are numbered consecutively from 1 to B.

ii. The reduced block reference string is the concatenation of a sequence of runs, each run having a length that is independently and identically distributed with probability mass function $\ell(k)$, $k = 1, 2, \ldots$.
The reduced block reference string (RBRS) is the sequence of block references from which all immediate rereferences have been deleted. Thus if the original block reference string was $b_i, b_i, b_j, b_j, b_j, b_k, b_k, \ldots$, then the RBRS would be $b_i, b_j, b_k, \ldots$. It isn't known whether knowledge of immediate rereferencing provides useful information for either prefetching or replacement, but we choose not to use that information here.

A run is a sequence of reduced block references such that the blocks are numbered consecutively. More specifically, let $b_j, b_{j+1}, \ldots, b_{j+k}, b_{j+k+1}$ be an RBRS, and let $b_{j+1} \neq b_j + 1$, $b_{j+k+1} \neq b_{j+k} + 1$, and $b_i + 1 = b_{i+1}$ for $j < i < j+k$. Then the RBRS subsequence $b_{j+1}, \ldots, b_{j+k}$ is defined to be a run of length $k$. We define the predecessor of block $b_i$ to be block $b_i - 1$ and the successor to be block $b_i + 1$. From our definition, it may be seen that a block which is neither preceded in the reference string by its predecessor nor succeeded by its successor constitutes a run of length 1.¹
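To make the run construction concrete, the short Python sketch below (our illustration, not part of the paper) first deletes immediate rereferences and then splits the resulting RBRS into runs; blocks are assumed to be numbered by integers so that the successor of block $b$ is block $b+1$.

```python
def reduce_string(refs):
    """Delete immediate rereferences: the RBRS of the reference string."""
    return [b for n, b in enumerate(refs) if n == 0 or b != refs[n - 1]]

def run_lengths(rbrs):
    """Split an RBRS into maximal runs of consecutively numbered blocks;
    an isolated block forms a run of length 1."""
    lengths, run = [], 1
    for prev, cur in zip(rbrs, rbrs[1:]):
        if cur == prev + 1:          # cur is the successor of prev: run grows
            run += 1
        else:                        # run ends here
            lengths.append(run)
            run = 1
    lengths.append(run)
    return lengths

# Blocks 5, 6, 7 form a run of length 3; blocks 9 and 2 are runs of length 1:
print(run_lengths(reduce_string([5, 5, 6, 7, 7, 9, 2])))   # -> [3, 1, 1]
```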
We denote the probability distribution for the run length as follows:

$\ell(k)$ = the probability mass function for the run length; $\ell(k)$ is the probability that a run is exactly $k$ blocks long.

$L(k)$ = the cumulative mass function for the run length $= \sum_{i=1}^{k} \ell(i)$.

$L^c(k)$ = the survivor function for the run length $= 1 - L(k)$.

$L^c(j \mid k)$ = the survivor function, given that the run is of length $k$ or greater; equal to $L^c(j)/L^c(k-1)$ for $j \ge k$, and equal to 1 otherwise.

$H(k)$ = the hazard rate, equal to $\ell(k)/L^c(k-1)$; the probability that the run ends with the $k$th element given that it is at least $k$ blocks long.

$\bar{L}$ = the mean run length $= \sum_{i \ge 1} i\,\ell(i)$.
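All of these quantities can be estimated mechanically from the observed run lengths; the following sketch is our illustration, with the convention (an assumption of the sketch) that array index $k$ stands for run length $k$ and index 0 is unused.

```python
import numpy as np
from collections import Counter

def run_length_stats(lengths):
    """Empirical l(k), L(k), L^c(k), H(k), and mean run length from a
    sample of observed run lengths."""
    R = max(lengths)                       # largest observed run length
    counts = Counter(lengths)
    l = np.array([counts.get(k, 0) for k in range(R + 1)], dtype=float)
    l /= l.sum()                           # l(k): fraction of runs of length k
    L = np.cumsum(l)                       # cumulative mass function L(k)
    Lc = 1.0 - L                           # survivor function L^c(k)
    H = np.zeros(R + 1)
    for k in range(1, R + 1):              # hazard rate H(k) = l(k)/L^c(k-1)
        if Lc[k - 1] > 0:
            H[k] = l[k] / Lc[k - 1]
    Lbar = float(np.arange(R + 1) @ l)     # mean run length
    return l, L, Lc, H, Lbar
```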
B. The Costs of Sequential Prefetching

Our probabilistic model for the reference string as a concatenation of runs of independent and identically distributed length suggests that we condition on the previously observed run length to decide how many blocks ahead to prefetch. To do so, we must determine the utility of successful prefetches and the problems associated with unsuccessful ones. We therefore identify several costs in executing I/O operations: CPU time is necessary to initiate I/O operations and to update all appropriate tables. The transfer of data requires memory cycles and thus interferes with CPU operation. Transfer time, and often seek and latency time, depending on the secondary storage device, prevent data from becoming immediately available.
¹ The function of our definition of a run is to allow us to estimate the utility of prefetching by assuming that blocks are only used in runs. In fact, blocks may be referenced out of order, and a prefetched block might be used, but not as part of the run for which it was fetched. Thus we could define a generalized run such that block $b_i$ is the $k$th element of a generalized run if the predecessor of block $b_i$ is in memory at the time block $b_i$ is referenced and if block $b_i - 1$ is the $(k-1)$-st element of a generalized run. Such a generalization is beyond the scope of this study, both because of its complexity and because of its dependence on the size of the memory.
The CPU in this case must either remain idle while the information is fetched, or it must switch to another task, which again involves overhead. Every time a block is fetched it may displace another block that is about to be rereferenced. Thus, excessive fetching of unused blocks may actually increase the miss ratio.

We shall group all of the above costs into three strategy costs and one effect cost. Possible strategies, which involve strategy costs, are the following: (a) fetch a block on demand (as needed), (b) fetch a block in advance but not simultaneously with a demand fetch, and (c) fetch additional sequentially adjacent blocks ("tag-along" blocks) when a block is fetched according to strategy (a) or (b). We identify these costs as follows:

DFC (Demand Fetch Cost): the cost of fetching one block when it is needed immediately.

PFC (Pre-Fetch Cost): the cost of fetching one block when it is not needed immediately and when it is not a "tag-along" block.

TAC (Tag-Along Cost): the cost for each additional block transferred from secondary storage when the initial block was fetched by strategy (a) or (b).

We also identify the following effect cost:

BFC (Bad Fetch Cost): the cost, in additional fetches, resulting from bringing in a block that is not actually used. This latter effect is also known as memory pollution.

In determining and evaluating our cost function, we will make the following assumptions in addition to those of our model:

iii. DFC, PFC, TAC, and BFC include all important costs, and further, the total cost can be expressed as a linear combination of these individual costs.

iv. Blocks that are prefetched and used while still in memory do not contribute to memory pollution.

v. The effect of memory pollution is independent of the block replacement algorithm and the memory capacity.

vi. There is nothing other than run length that contributes to effective prediction.

We will discuss the validity of some of these assumptions in later sections of this paper.
Let us assume some strategy for prefetching. Associated with this strategy is a set of numbers $E(DF)$, $E(PF)$, $E(TA)$, and $E(BF)$, where these are respectively the expected number of demand fetches, prefetches, tag-along blocks, and bad fetches per run. The expected cost per run $C$ under this strategy is then

$$C = \mathrm{DFC} \cdot E(DF) + \mathrm{PFC} \cdot E(PF) + \mathrm{TAC} \cdot E(TA) + \mathrm{BFC} \cdot E(BF). \qquad (1)$$

From our assumptions above, specifically from ii and vi, to optimize system operation it is sufficient to minimize the expected cost per run by choosing the best prefetching strategy. We find that strategy below.
C. Prefetch Strategy Optimization

We shall consider two classes of prefetch policies; the second will be a generalization of the first.
Class 1 prefetch policies permit any number $j$ of additional sequential blocks to be prefetched at a time when a demand fetch is required. No fetches are permitted at other times. In many systems, PFC is comparable to DFC, so this restriction follows naturally.

Class 2 prefetch policies permit any number $j$ of additional sequential blocks to be prefetched beyond the current reference at any time, whether or not a miss occurred on this reference. From our set of costs, it is clear that there is no advantage to prefetching if, at the time of a reference to block $b_k$, block $b_k + 1$ is already in memory. Therefore it is assumed that prefetches are initiated only when the current reference causes a fault or the block beyond the current reference is not in the buffer.
1. Class 1 Prefetch Optimization. We define the remaining cost for a run, $C(k)$, and the minimum expected remaining cost for a run, $C_{\min}(k)$, as follows: the remaining cost $C(k)$ for a run is the expected cost of fetching all remaining blocks of the current run, given that (exactly) $k - 1$ blocks have already been fetched and that the run is $k$ or more blocks in total length. The minimum expected remaining cost is taken over all possible class 1 fetch policies.

The value of $C_{\min}(k)$ may be simply computed: it is the cost of fetching the $k$th block in the run ($\mathrm{DFC}$), plus the cost of $j$ additional blocks ($j \cdot \mathrm{TAC}$, $j \ge 0$), plus the cost of the remainder of the run ($L^c(k+j \mid k) \cdot C_{\min}(k+j+1)$), plus the pollution cost ($\mathrm{BFC} \cdot \sum_{i=0}^{j-1} \ell(k+i)(j-i)/L^c(k-1)$). This may be seen by inspection (when the terms are collected) to be:

$$C_{\min}(k) = \mathrm{DFC} + \min_{j \ge 0} \left[ j\,\mathrm{TAC} + L^c(k+j \mid k)\, C_{\min}(k+j+1) + \frac{\mathrm{BFC}}{L^c(k-1)} \sum_{i=0}^{j-1} \ell(k+i)\,(j-i) \right] \qquad (2)$$

The mean minimum cost per run is then $C_{\min}(1)$, and the mean cost per (reduced) reference is $C_{\min}(1)/\bar{L}$. (The cost per actual reference can be obtained by dividing the cost per reduced reference by the ratio of real references to reduced references.)
If the run length is bounded, $C_{\min}$ may be computed straightforwardly. Let $R_{\max}$ be the largest possible run length. Then $C_{\min}(R_{\max}) = \mathrm{DFC}$. $C_{\min}(k)$, $k < R_{\max}$, is a function only of $C_{\min}(j)$, $R_{\max} \ge j > k$, and other known parameters, and the computation is thus simple and proceeds from $k = R_{\max}$ down to $k = 1$. Since we shall estimate the run length distribution $L(k)$ from the observed sample $\ell(k)$, the run length is always bounded. We hypothesize that for "well-behaved" distributions $L(k)$, arbitrary truncation of $L(k)$ produces no ill effects. Since $L^c(k+j \mid k)$, $j \ge 1$, is always less than one, the effect of errors in $C_{\min}(j+k+1)$ can be seen to die out.
The computation of $C_{\min}$ also provides the optimal fetch policy. At the time of a fault to the $k$th block in the run, it is optimal to fetch $a(k)$ additional blocks, where $a(k)$ is the value of $j$ that minimizes the expression above in eq. (2). We note that after this computation has been performed, many values of $C_{\min}(k)$ and $a(k)$ are superfluous, since for many values of $k$ there will never be a demand fetch to the $k$th block in the run. For example, let the optimum values of $a$ be respectively $a = \{1, 1, 3, 3, 2, 4, 4\}$ for $k = 1$ to 7.
Then a fetch to the first block in a run will result in its successor being fetched. Therefore, there can never be a fault to the second item in a run. Similarly, a fault to the third block in a run fetches also the fourth, fifth, and sixth blocks. Thus it is sufficient to specify the fetch policy only at the possible fault points.
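Because eq. (2) expresses $C_{\min}(k)$ in terms of $C_{\min}$ at larger arguments only, the whole table, and with it the policy $a(k)$, falls out of a single backward pass. The following Python sketch is our illustration of that dynamic program; the cost values echo the nominal figures discussed in Section 3, and the run length pmf is an invented example, not data from the paper.

```python
import numpy as np

def class1_policy(l, DFC=1.0, TAC=0.2, BFC=0.5):
    """Backward dynamic program for eq. (2): returns C_min(k) and the
    optimal number a(k) of tag-along blocks on a fault to block k of a run.
    l[k] = probability that a run is exactly k blocks long (l[0] must be 0)."""
    R = len(l) - 1                      # R_max, the largest possible run length
    Lc = 1.0 - np.cumsum(l)             # survivor function L^c(k)
    C = np.zeros(R + 2)                 # C[k] = C_min(k)
    a = np.zeros(R + 1, dtype=int)
    for k in range(R, 0, -1):           # proceed from k = R_max down to k = 1
        if Lc[k - 1] <= 0.0:            # runs of length >= k never occur
            continue
        costs = []
        for j in range(R - k + 1):      # j tag-along blocks: blocks k .. k+j
            poll = sum(l[k + i] * (j - i) for i in range(j)) / Lc[k - 1]
            cont = Lc[k + j] / Lc[k - 1] * C[k + j + 1]
            costs.append(j * TAC + cont + BFC * poll)
        a[k] = int(np.argmin(costs))
        C[k] = DFC + costs[a[k]]
    return C, a

# Invented run length pmf, truncated at R_max = 8:
l = np.array([0.0, 0.45, 0.25, 0.12, 0.08, 0.05, 0.03, 0.01, 0.01])
C, a = class1_policy(l)
Lbar = float(np.arange(len(l)) @ l)     # mean run length
print(C[1], C[1] / Lbar, a[1:])         # cost per run, per reduced reference, a(k)
```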
2. Class 2 Fetch Policies. In the case that fetching at other than fault times is permitted, we must define a slightly different cost function. Let $Cf_{\min}(k)$ be the minimum remaining future cost of a run, where this is defined to be the cost of accessing all remaining blocks of the run, given that the run is $k$ or more blocks in length and that exactly $k$ blocks have already been fetched. (Note the difference from the preceding definition.) We are led to this different definition because at the time that we reference block $k$ in the run, we must decide whether to bring in block $k + 1$ at this time, or to wait to see if it is referenced first. The cost breaks down into two parts, depending on this decision, and may be seen to be:
$$Cf_{\min}(k) = \min\left\{ \mathrm{PFC} + \min_{j \ge 0}\left[ j\,\mathrm{TAC} + L^c(k+j \mid k)\, Cf_{\min}(k+j+1) + \frac{\mathrm{BFC}}{L^c(k-1)} \sum_{i=0}^{j} \ell(k+i)\,(j+1-i) \right], \right.$$
$$\left. L^c(k \mid k) \left( \mathrm{DFC} + \min_{j \ge 0}\left[ j\,\mathrm{TAC} + L^c(k+j \mid k+1)\, Cf_{\min}(k+j+1) + \frac{\mathrm{BFC}}{L^c(k)} \sum_{i=0}^{j-1} \ell(k+i+1)\,(j-i) \right] \right) \right\} \qquad (3)$$
Again, we obtain the optimal fetch policy as a consequence of our computation of $Cf_{\min}$. Let $\beta(k)$ be the value of $j$ that minimizes the first term in eq. (3) above, and let $\gamma(k)$ be the value of $j$ that minimizes the second term. If the first term is less, it is optimal to fetch a number of additional blocks ($\beta(k) + 1$ blocks) even though it is not a fault time; if the second term is less, it is optimal to wait until the $(k+1)$-st block is referenced (if it is), at which time $\gamma(k)$ additional blocks should be fetched.

The mean minimum cost per run is $\mathrm{DFC} + Cf_{\min}(1)$, and the mean minimum cost per (reduced) reference is $(\mathrm{DFC} + Cf_{\min}(1))/\bar{L}$.
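The class 2 recursion yields to the same backward pass; at each $k$ it also records whether it is cheaper to prefetch block $k+1$ immediately (fetching $\beta(k)+1$ blocks) or to wait and, if block $k+1$ is demanded, fetch $\gamma(k)$ tag-along blocks with it. Again this is our sketch of eq. (3), under the same assumptions as the class 1 program above.

```python
import numpy as np

def class2_policy(l, DFC=1.0, PFC=0.7, TAC=0.2, BFC=0.5):
    """Backward dynamic program for eq. (3): returns Cf_min(k) and, for each k,
    the better action: ("prefetch now", beta(k)+1) or ("wait", gamma(k)).
    l[k] = probability that a run is exactly k blocks long (l[0] must be 0)."""
    R = len(l) - 1
    Lc = 1.0 - np.cumsum(l)                  # survivor function L^c(k)
    Cf = np.zeros(R + 2)                     # Cf[k] = Cf_min(k)
    act = [None] * (R + 1)
    for k in range(R, 0, -1):
        if Lc[k - 1] <= 0.0:
            continue
        # First term: prefetch blocks k+1 .. k+j+1 now, at cost PFC + j*TAC.
        pre = []
        for j in range(R - k + 1):
            poll = sum(l[k + i] * (j + 1 - i) for i in range(j + 1)) / Lc[k - 1]
            pre.append(PFC + j * TAC + Lc[k + j] / Lc[k - 1] * Cf[k + j + 1]
                       + BFC * poll)
        # Second term: wait; with probability L^c(k|k) block k+1 is referenced
        # and demand fetched together with j tag-along blocks.
        if Lc[k] > 0.0:
            dem = []
            for j in range(R - k + 1):
                poll = sum(l[k + 1 + i] * (j - i) for i in range(j)) / Lc[k]
                dem.append(DFC + j * TAC + Lc[k + j] / Lc[k] * Cf[k + j + 1]
                           + BFC * poll)
            wait = Lc[k] / Lc[k - 1] * min(dem)
        else:
            dem, wait = [0.0], 0.0           # the run ends at k with certainty
        if min(pre) < wait:
            Cf[k] = min(pre)
            act[k] = ("prefetch now", int(np.argmin(pre)) + 1)
        else:
            Cf[k] = wait
            act[k] = ("wait", int(np.argmin(dem)))
    return Cf, act

# The mean minimum cost per run is then DFC + Cf[1].
```

With the same example pmf as above, comparing $C_{\min}(1)$ against $\mathrm{DFC} + Cf_{\min}(1)$ indicates how much the extra freedom of class 2 policies is worth for a given set of costs.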
3. COST ESTIMATION

A. Estimation of DFC, PFC, and TAC

In Section 2C we described an algorithm for computing the cost to bring into the memory buffer the blocks in one run as a function of four parameters, DFC (demand fetch cost), PFC (prefetch cost), TAC (tag-along cost), and BFC (bad fetch cost), values for which have not yet been assigned. In this section we shall discuss the choice of values for these parameters.

We will arbitrarily assign DFC a value of 1.0. All other costs will be assigned relative to this fixed value. Since our cost structure uses a relative and not an absolute scale, we have lost no generality.
The values assigned to PFC and TAC (relative to DFC) are a function of the architecture of the I/O devices, the load on the system, the quality of the system implementation, and the degree of multiprogramming, among other factors. The costs for a prefetch include the CPU time involved in constructing and initiating the channel program; the CPU idle time induced by the period during which the channel, controller, and device are unavailable; the memory interference produced by the transfer; and the possible delay in completing the fetch should the prefetched block be needed quickly. The tag-along cost is almost the same as the above, but includes the marginal additional cost for the channel program per block transferred rather than the bulk of the overhead. Each of the components of these costs will vary among machines and operating systems, with time dependent factors such as the load on the system, and with the system configuration (number and type of I/O devices). We will therefore discuss some different scenarios and some cost functions which might follow. Specific examples of cost functions will be chosen and calculations presented in Section 4, where we discuss the data we have analyzed. We note that the only way to determine a set of verifiable values for these costs is to vary the prefetching algorithm on a specific system driven repeatedly by the same benchmark and to estimate the parameters statistically from the observed behavior.
The simplest case is one in which the system is completely I/O bound; that is, the system is assumed to always be waiting for an I/O operation to complete. Consider quarter track blocks and random seeks; then the mean transfer, latency, and seek times for the IBM 2314 disk are respectively 6.25 msec, 12.5 msec, and 60 msec. For the 3330 disk, the values are 4.2 msec, 8.4 msec, and 30 msec. (In fact, random seeks are unlikely (see Smith [22, 23] for a relevant discussion), so the mean seek time should probably be less than given.) The marginal wait for a tag-along block is then 6.25/(60 + 12.5 + 6.25) relative to the demand fetch wait for the 2314, or 0.079 (0.099 for the 3330). For half track data sets, the values would be 0.147 and 0.179 for the two disks. Adding in the costs of the other overheads mentioned above, a value of 0.2 for TAC seems reasonable. The value of PFC is higher, since it includes the overhead for performing the I/O and may include more nonoverlapped transfer time. PFC is almost certainly lower than DFC since some of the I/O wait time is assumed to be overlapped with the processing of the remaining blocks in the run. A value of 0.7 seems to be a plausible value for PFC.
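The ratios above are simply the transfer time divided by the total access time for a demand fetch; a short check of the arithmetic (our illustration):

```python
def tag_along_ratio(transfer_ms, latency_ms, seek_ms):
    """Marginal wait for a tag-along block relative to a demand fetch."""
    return transfer_ms / (seek_ms + latency_ms + transfer_ms)

print(round(tag_along_ratio(6.25, 12.5, 60.0), 3))   # IBM 2314 -> 0.079
print(round(tag_along_ratio(4.2, 8.4, 30.0), 3))     # IBM 3330 -> 0.099
```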
An alternative assumption for system operation is that the CPU is fully utilized and that the demands on the CPU limit the throughput of the system. This would be reasonable in a system with many storage (I/O) devices and a great deal of memory, so that all I/O could easily be overlapped with computation. In this case the values assumed by DFC, PFC, and TAC depend on the exact time it takes to execute the appropriate code (I/O initiation, task switching, if any, etc.). It has been suggested to the author that values of 0.2 and 0.9 for TAC and PFC seem reasonable.

For other cases, intermediate between I/O bound and CPU bound, the values assigned to the cost variables will vary. It is to be hoped that, because of the uncertainty in the cost variables, the cost function will be relatively flat for varying degrees of prefetching. Computations performed by the author indicate that this is the case.
B. Estimation of BFC

It remains to estimate a value for BFC, the additional cost for fetching a block that is never used. This additional cost arises in the case that a block is removed from the buffer pool in order to make room for the new fetch and that this old block is quickly rereferenced. We assume that every block that is expelled from the buffer and then reused is fetched at a cost of 1.0; we now estimate the probability that a block so expelled is reused.

Before we describe two methods of calculating the value of BFC, we define a specific paging algorithm. In MULTICS [4] and CP-67 [3], a paging algorithm commonly referred to as the "clock" paging algorithm is used. We define here the:

GENERALIZED CLOCK PAGE REPLACEMENT ALGORITHM. With each page frame in memory we associate a count field and we arrange these count fields in a circular list. Whenever a page is referenced, the associated count field is set to i. When a page fault occurs, a pointer that circles around this circular list of page frames is observed. If the count field pointed to is zero, then the page is removed and the new page is placed in that frame. Otherwise, the count is decremented by 1, the pointer is advanced to the next count field, and the process is repeated. When a new page is placed in the page frame, the count field is set to i if the page is to be referenced (demand fetch) and it is set to j if the page has been prepaged and is not immediately referenced. We abbreviate this algorithm by writing CLOCKP(j, i). The "P" indicates that this is a prepaging algorithm (the prepaging strategy has not been specified). When no prepaging is involved, the algorithm is abbreviated CLOCK(i). The algorithm used in MULTICS and CP-67 is CLOCK(1).
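A compact implementation follows directly from the definition. The sketch below is ours (the paper defines the algorithm in prose only), and it leaves the prepaging strategy unspecified, just as the definition does:

```python
class GeneralizedClock:
    """CLOCKP(j, i): page frames with count fields arranged in a circle."""

    def __init__(self, nframes, i=1, j=0):
        self.i, self.j = i, j
        self.frames = [None] * nframes    # resident page identifiers
        self.count = [0] * nframes        # one count field per frame
        self.hand = 0                     # the circling pointer

    def reference(self, page):
        """Demand reference; returns True on a page fault."""
        if page in self.frames:
            self.count[self.frames.index(page)] = self.i   # referenced: set to i
            return False
        self._place(page, self.i)         # demand fetched page starts at i
        return True

    def prefetch(self, page):
        """Prepage; the not-yet-referenced page starts with count j."""
        if page not in self.frames:
            self._place(page, self.j)

    def _place(self, page, initial):
        while True:                       # advance until a zero count is found
            if self.count[self.hand] == 0:
                self.frames[self.hand] = page
                self.count[self.hand] = initial
                self.hand = (self.hand + 1) % len(self.frames)
                return
            self.count[self.hand] -= 1    # decrement and move on
            self.hand = (self.hand + 1) % len(self.frames)

# CLOCK(1), as in MULTICS and CP-67, is CLOCKP(j=0, i=1) with no prefetches.
c = GeneralizedClock(nframes=2, i=1)
print(c.reference(10), c.reference(11), c.reference(10))   # True True False
```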
We have chosen to define this algorithm in such a manner because the existence of the parameters i and j will facilitate certain experiments to be described.

We will also have occasion to refer to the Working Set paging algorithm [6] and the Least Recently Used (LRU) paging algorithm [17], with which we assume the reader is familiar.

We will employ two methods to estimate BFC; both methods are rather ad hoc
