`
Theory of Computing Systems
© 1998 Springer-Verlag New York Inc.
`
Verification of Cache Coherence Protocols by
Aggregation of Distributed Transactions∗
`
`S. Park and D. L. Dill
`
`Department of Computer Science, Stanford University,
`Stanford, CA 94305, USA
`spark@cs.stanford.edu
`dill@cs.stanford.edu
`
`Abstract. This paper presents a method to verify the correctness of protocols and
`distributed algorithms. The method compares a state graph of the implementation
`with a specification which is a state graph representing the desired abstract behavior.
`The steps in the specification correspond to atomic transactions, which are not atomic
`in the implementation.
`The method relies on an aggregation function, which is a type of abstraction
`function that aggregates the steps of each transaction in the implementation into a
`single atomic transaction in the specification. The key idea in defining the aggre-
`gation function is that it must complete atomic transactions which have committed
`but are not finished.
`This paper illustrates the method on a directory-based cache coherence pro-
tocol developed for the Stanford FLASH multiprocessor. The coherence protocol,
consisting of more than a hundred different kinds of implementation steps, has been
reduced to a specification with six kinds of atomic transactions. Based on the reduced
behavior, it is very easy to prove crucial properties of the protocol, including
`data consistency of cached copies at the user level. This is the first correctness proof
`verified by a theorem-prover for a cache coherence protocol of this complexity. The
`aggregation method is also used to prove that the reduced protocol satisfies a desired
`memory consistency model.
`
`⁄
`
`2-891.
`
`This research was supported by the Advanced Research Projects Agency through NASA Grant NAG-
`
`
`
`
`356
`
`1.
`
`Introduction
`
`1.1. Basic Idea
`
`S. Park and D. L. Dill
`
`Protocols for distributed systems often simulate atomic transactions in environments
`where atomic implementations are impossible. This observation can be exploited to make
`formal verification of protocols and distributed algorithms using a computer-assisted
`theorem-prover much easier than it would otherwise be [34]. Indeed, the techniques de-
`scribed below have been used to verify safety properties of significant examples: the cache
`coherence protocol for the FLASH multiprocessor which is currently being designed at
`Stanford [15], [20], a majority consensus algorithm for multiple copy databases [41],
`[18], and a distributed list protocol [9].
`The method proves that an implementation state graph is consistent with a speci-
`fication state graph that captures the abstract behavior of the protocol, in which each
`transaction appears to be atomic. The method involves constructing an abstraction func-
`tion which maps the distributed steps of each transaction to the atomic transaction in the
`specification. We call this aggregation, because the abstraction function reassembles the
`distributed transactions into atomic transactions.
`This method addresses the primary difficulty with using theorem-proving for ver-
`ification of real systems, which is the amount of human effort required to complete a
`proof, by making it easier to create appropriate abstraction functions. Although our work
`is based on using the PVS system from SRI International [33], the method is useful with
`other mechanical theorem-provers, or manual proofs.
`Although finite-state methods (e.g., [32], [10], [17], and [19]) can solve many of the
`same problems with even less effort, they are basically limited to finite-state protocols.
`Finite-state methods have been applied to non-finite-state systems in various ways [38],
`but these techniques typically require substantial pencil-and-paper reasoning to justify.
`Moreover, it is not obvious how to apply these extensions to the examples we verified
`using aggregation. Theorem-provers make sure that such manual reasoning is indeed
`correct, in addition to making available the full power of formal mathematics for proof,
`so they can routinely deal with problems that cannot yet be solved by any finite-state
`methods.
`For our method to be applicable, the description must have an identifiable set of
`transactions. Each transaction must have a unique commit point [16], from which a
`state change cannot be aborted (usually, it is the point at which a state change first
`becomes visible to the specification). The most important idea in the method is that the
`aggregation function can be defined by completing transactions that have committed but
`not yet completed. In general, the steps to complete separate transactions are independent,
`which simplifies the definition of this function. In our experience, this guideline greatly
`simplifies the definition of an appropriate aggregation function.
`The same idea of aggregating transactions can be applied to reverse engineer a spec-
`ification where none exists, because the specification with atomic transactions is usually
`consistent with the intuition of the system designer. We extract a specification model
`which performs transactions atomically at their commit steps in the implementation,
`and does nothing at other steps. The extracted specification provides an illusion that the
`transactions take effect instantaneously at the commit steps in the implementation.
`
`
`
`
`If the extracted specification is not obviously complete or correct, it can instead be
`regarded as a model of the protocol having an enormously reduced number of states. The
amount of reduction is much greater than with other reduction methods used in model checking,
`such as partial order reduction, mainly because the state variables of the reduced system
`are only those relevant to the specification, without variables such as local states and
communication buffers.
`The major contribution of this work is to reduce the effort required to prove the
`correctness of certain classes of cache coherence protocols (and possibly other types
`of distributed algorithms). The methodology requires extracting or defining transaction-
`oriented specifications. Once the specifications are in this form, aggregation provides a
`simple method for defining abstraction functions based on completing unfinished trans-
`actions across distributed processing elements. The effectiveness of aggregation is shown
`by the example we present below: FLASH is the most complex multiprocessor cache
`coherence protocol that has been formally verified using a theorem-prover.
`
`1.2. Related Work
`
`1.2.1. Verification of Cache Coherence Protocols. One widely used technique for val-
`idating cache coherence protocols is finite-state methods (e.g., model checking). Finite-
`state methods enumerate the states of the reachable state graph of the system, searching
`for states that violate a specified property [31], [5], [40], [32], [17], [19]. These methods
`suffer from the state explosion problem: the number of states for nontrivial numbers of
`processors and cache lines is very large. Another problem with model checkers is that
it is very difficult to specify correctness conditions of the protocol using notations such
as Murϕ or temporal logic. The specification is the corresponding memory model of the
protocol, so one would have to encode a full memory model in temporal logic.
`Symbolic state models proposed by Pong and Dubois [38], [37] reduce the state
`explosion problem by using symbolic states which abstract away from the exact number
`of configurations of replicated identical components by recording only whether there
are zero, one, or more than one of the replicated components. However, as in model checking,
`there still remains the problem of specifying the protocol: It is not easy to find a set
`of properties (in their notation), which completely describes the correct behavior of the
`protocols. Moreover, their method requires the user to write an abstract description of
`the protocol to be verified, which raises another verification problem: Are the abstract
`description and the actual protocol equivalent?
`Another approach to formal verification is computer-assisted theorem-proving.
`Theorem-provers make available the full power of formal mathematics for proof, so
`they can routinely deal with problems that cannot yet be solved by any finite-state meth-
`ods. However, the major problem with theorem-proving is that considerable labor is
`required. Consequently, previous theorem-proving approaches have not been able to
`verify a problem of the scale of a full multiprocessor cache coherence protocol. The
`most significant result before our work is a manual proof of “lazy caching,” a simple
`and abstract cache coherence algorithm [2], [13], [21]. It should be noted that using a
`theorem-prover typically increases the labor required to complete a proof compared with
`manual proof—however, the results are much more likely to be correct.
`
`
`
`358
`
`S. Park and D. L. Dill
`
`1.2.2. Abstraction Function. The idea of using abstraction functions to relate imple-
`mentation and specification state graphs is very widely used, especially when manual or
`automatic theorem-proving is used [30], [29], [22] (indeed, whole volumes have been
`written on the subject [8]). The idea has also been used with finite-state techniques [19],
`[11].
`Ladkin et al. [21] have used a refinement mapping [1] to verify a simple caching algo-
`rithm. Their refinement mapping hides some implementation variables, which may have
`the effect of aggregating steps if the specification-visible variables do not change. Our
`aggregation functions generalize on this idea by merging steps even when specification-
`visible variables change more than once. This happens in most cache coherence protocols
`for distributed systems. For example, in the FLASH protocol described later in this paper,
`the cache states (specification variables) can change twice during a write transaction:
`first, if another cache has a copy in exclusive state, the state must transition to invalid;
`then the requesting cache state changes from invalid to exclusive when the cache receives
`data. In this case, our aggregation function modifies the cache state and data to complete
`the transaction.
`A more limited notion of aggregation than ours is found in [24] and [25], where a
`state function undoes or completes an unfinished process. The method only aggregates
`sequential steps within a local process. The idea of an aggregated transaction has been
`used to prove a protocol for database systems [36], where aggregation is obtained also in
`a local process by showing the commutativity of actions from simple syntactic analysis.
`Ours is the first method that aggregates steps across distributed components.
`In program verification, proofs can be simplified by pretending that a statement is
`atomic if its execution contains at most one access of a shared variable. This is the so-
`called “single-action rule” [28], [12], [26]. The single-action rule is generalized in [27].
`This method classifies program statements as “left-movers” or “right-movers” depend-
`ing on their commutative properties. Using these properties, the statements are permuted
`to obtain a coarser-grained version of the program, for which safety properties can be
`checked.
`Cohen used an idea similar to aggregation to prove global progress properties by
`combining progress properties of local processes [6]. The idea of how to construct our
`aggregation function was inspired by a method of Burch and Dill for defining abstraction
`functions when verifying microprocessors [3].
`
`1.3. Contents of the Paper
`This paper is organized as follows. Section 2 defines the verification goal and Section 3
`presents the verification method. Section 4 describes the FLASH cache coherence proto-
`col in two ways: in terms of transactions and per-node-based steps. Section 5 illustrates
`the method on the FLASH protocol. Using the reduced model obtained by aggrega-
`tion, Section 6 proves that one of the two distinct modes supported by the protocol
`implements a sequential consistency memory model. Finally, Section 7 concludes and
`proposes possible lines of future research.
`
`2. Verification Objective
`
`The goal of formal verification is usually to show that two alternative descriptions of the
`same behavior are consistent. The notion of consistency varies according to the details
`
`
`
`
`of the verification problem. The aim of this section is to define the objective of formal
`verification precisely for transaction-oriented protocols. To do so, we must first describe
`our model of these protocols.
`The verification method begins with two logical descriptions: a description of the
`state graph of the implementation, and a description of the state graph of the specification.
`The implementation description contains a set of state variables, which is partitioned
`into specification variables and implementation variables. The set Q of states of the
`implementation is the set of assignments of values to state variables. The description of
`the implementation also includes a logical formula defining the relation between a state
`and its possible successors. The next state choices are represented by a set of functions F
`from states to states. Each function in F maps a given implementation state to a possible
successor state. The implementation is nondeterministic if F has more than one function.
`We could also have represented the next state choices as a relation, but the “set of
`functions” representation is much more suitable for the verification method we describe
`below. It is also easy to represent protocols in this way. For example, languages of iterated
guarded commands, such as Murphi [10] and UNITY [4], can be translated directly into
`the above representation.
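As an illustration, a small sketch (our encoding, not from the paper) of the "set of functions" representation: each iterated guarded command becomes a total function on states that applies its action when its guard holds and stutters otherwise. The rules and variable names below are invented for the example.

```python
# Sketch: representing next-state choices as a set of functions F,
# in the style of iterated guarded commands (Murphi/UNITY).
# States are dicts of state variables; a rule is a guard plus an action.

def make_rule(guard, action):
    """Turn a guarded command into a total state-to-state function:
    apply the action when the guard holds, otherwise stutter."""
    def step(q):
        return action(dict(q)) if guard(q) else dict(q)
    return step

# Two toy rules over a single state variable "x".
inc = make_rule(lambda q: q["x"] < 2, lambda q: {**q, "x": q["x"] + 1})
reset = make_rule(lambda q: q["x"] == 2, lambda q: {**q, "x": 0})

F = [inc, reset]   # the set of next-state functions; nondeterminism is
                   # the choice of which function to apply next
```

Each function in F is total, so a disabled rule simply leaves the state unchanged, matching the "set of functions" view of a nondeterministic next-state relation.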
`The description of the specification state graph is similar to the implementation
`description. A specification state is an assignment of values to the specification variables
`of the implementation (implementation variables do not appear in the specification).
`Also, every state in the specification has a transition to itself, which we call an idle
`step. The idle steps are necessary to represent implementation steps that do not change
`specification variables.
`The verification method relies on there being a set of transactions which the com-
`putation is supposed to implement. A transaction is atomic at the specification level,
`meaning that it occurs in a single state transition in the specification. However, transac-
`tions in the implementation are nonatomic; they may involve many steps that are executed
`in several different components of the implementation.
`Each transaction in the implementation must have an identifiable commit step. In-
`tuitively, when tracing through the steps of a transaction, the commit step is the im-
`plementation step that first makes an inevitable change in the specification variables.
`Implementation states that occur before the transaction or during the transaction but be-
`fore the commit step are called precommit states for that transaction. The transaction is
`complete when the last specification variable change occurs. The states after the commit
`step but before the completion of the transaction are called postcommit states for the
`transaction. A state where every committed transaction has completed is called a clean
`state.
`Formally, all of the above concepts can be derived once the postcommit states are
`known for each transaction. The precommit states for the transaction are the states that
`are not postcommit; the commit step for a transaction is the transition from a precommit
`state to a postcommit state for that transaction; and the completion step is the transition
`from a postcommit state to a precommit state. A state is clean if it is a precommit state
`for every transaction.
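These derivations can be phrased directly in terms of a postcommit predicate; the following is a minimal sketch in our own encoding (states as dicts, steps as functions, predicate names ours):

```python
# Sketch: deriving commit steps, completion steps, and clean states
# from a postcommit predicate given per transaction t.

def is_commit_step(postcommit, t, q, step):
    """Commit step for t: a transition from a precommit state
    to a postcommit state."""
    return not postcommit(t, q) and postcommit(t, step(q))

def is_completion_step(postcommit, t, q, step):
    """Completion step for t: a transition from a postcommit state
    back to a precommit state."""
    return postcommit(t, q) and not postcommit(t, step(q))

def is_clean(postcommit, transactions, q):
    """Clean state: a precommit state for every transaction."""
    return all(not postcommit(t, q) for t in transactions)
```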
`We can now describe our objective in formally verifying transaction-oriented pro-
`tocols. We suppose that the designer of a protocol understands how the implementation
`steps correspond to the transactions he or she is trying to implement. This correspon-
`dence is represented formally as a function that maps each implementation step to the
`
`
`
`360
`
`S. Park and D. L. Dill
`
`transaction that it implements. We call this the transaction mapping M. Specifically, the
`transaction mapping maps each commit step to the appropriate atomic transaction; steps
`that are not commit steps for any transaction map to the idle specification step.
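A transaction mapping can be sketched as a table from implementation steps to specification steps, defaulting to the idle step; the step and transaction names below are illustrative, not from the FLASH protocol:

```python
# Sketch of a transaction mapping M (our encoding): commit steps map
# to their atomic specification transactions, every other step maps
# to the idle specification step.

def idle(s):
    return dict(s)   # the idle specification step changes nothing

def make_transaction_mapping(commits):
    """commits: dict from an implementation step's name to the atomic
    specification transaction that the step commits."""
    return lambda step_name: commits.get(step_name, idle)

# Hypothetical example: only "put_reply" commits a read transaction.
read_txn = lambda s: {**s, "cache": "shared"}
M = make_transaction_mapping({"put_reply": read_txn})
```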
There is a natural correspondence between an implementation state q and the spec-
ification state proj(q) that is obtained by projecting away the implementation variables
(and leaving the specification variables untouched). When q is clean, proj(q) is the spec-
ification state that q implements. However, when q is not clean, proj(q) is not necessarily
meaningful, and the specification state implemented by q is nonobvious.
An execution of the implementation consists of a clean initial implementation state
q and a finite sequence σ of implementation steps from F. The terminal implementation
state is [σ](q), where [σ] is the functional composition of the elements of σ in sequence.
Corresponding to the implementation execution is an execution of the specification.
The initial state of the execution of the specification is proj(q), and the sequence of
specification steps is M(σ) (the sequence formed by applying the transaction mapping
M elementwise to σ). The terminal state of the specification sequence is [M(σ)](proj(q))
(the functional composition of the specification steps, applied to the initial specification
state).
`Our verification goal is to show that if the terminal implementation state is clean,
`projecting it onto the specification variables yields the terminal specification state. In
`other words,
    ∀σ ∈ F*, ∀q ∈ Q: Clean(q) ∧ Clean([σ](q)) ⇒ proj([σ](q)) = [M(σ)](proj(q)).        (1)
`This definition of the verification objective may raise several questions. First, why
`do we only require that clean states map to specification states? Extending the definition
to unclean states seems to result in more or less the definition of an aggregation function that we
`describe below. The goal of this definition is to establish a simpler and more basic notion
`of correctness that can be used to justify the soundness of the aggregation method.
`Second, what if an implementation sequence does not terminate in a clean state? If we
`verify the above property for all finite sequences, there will either be an extended sequence
`ending in a clean state where we can check the result, or there is no such sequence. Since
`it is always possible to stop initiating transactions, the latter condition indicates there is
`no way to complete the unfinished transactions (e.g., because of deadlock or livelock).
`In either case, a transaction is not executed incorrectly, although it may fail to complete.
`We have limited our attention to safety properties in this paper, so failure to complete
`transactions is beyond our scope of concerns. (Although we have not done it, we believe
`that a more general treatment of liveness as well as safety would not be difficult.)
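The verification objective can be exercised on a toy model; the sketch below (entirely our construction, far simpler than a real protocol) has one transaction that commits by incrementing the specification variable and completes by clearing an implementation flag, and checks the objective over all short runs:

```python
from itertools import product

# Toy model: spec variable "v", implementation variable "pending".
def commit(q):
    if q["pending"]:
        return dict(q)                      # disabled: stutter
    return {"v": q["v"] + 1, "pending": True}

def finish(q):
    return {**q, "pending": False}          # completes the transaction

proj = lambda q: {"v": q["v"]}
clean = lambda q: not q["pending"]
spec_inc = lambda s: {"v": s["v"] + 1}      # the atomic spec transaction

q0 = {"v": 0, "pending": False}
for seq in product([commit, finish], repeat=4):
    q, s = q0, proj(q0)
    for f in seq:
        qn = f(q)
        if clean(q) and not clean(qn):      # this step is a commit step,
            s = spec_inc(s)                 # so it maps to spec_inc;
        q = qn                              # all other steps map to idle
    if clean(q):                            # check the objective only at
        assert proj(q) == s                 # clean terminal states
```

The check of whether a step is a commit step is done per transition (precommit to postcommit), which matches the state-dependent definition of commit steps above.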
`
`(1)
`
`3. The Verification Method
`
`In this section we define aggregation functions and show that the existence of an ag-
`gregation function is a sufficient condition implying property (1) above. An aggregation
`function consists of two parts: a completion function which changes the state as though
`the transaction had completed, and a projection which hides the implementation vari-
`ables, leaving only the specification variables.
`
`
`
`
`(2)
`
`(3)
`
`An aggregation function aggr maps implementation states to specification states. In
`addition, aggr must satisfy two conditions. First, it must have a commutative property:
`8q 2 Q;
`8N 2 F:
`aggr.N .q// D M.N /.aggr.q//:
`This resembles the basic definition of an abstraction function, so we consider an aggre-
`gation function to be a particular type of abstraction function.
`The second property is that an aggregation function must be the same as projection
`for clean states:
`Clean.q/ ) aggr.q/ D proj.q/:
`8q 2 Q:
`A clean state is one where all transactions are complete. Therefore, the completion
`function in aggr should have no effect, leaving only the projection.
Existence of an aggregation function implies the verification objective (1). Because
the aggregation function has the commutative property (2), it is easy to prove by induction
that

    ∀σ ∈ F*, ∀q ∈ Q: aggr([σ](q)) = [M(σ)](aggr(q)).
`This and property (3) directly imply verification objective (1).
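On the toy one-transaction model used earlier in our sketches (our construction, not FLASH), an aggregation function and spot checks of properties (2) and (3) look like this:

```python
# Toy aggregation function: complete any committed-but-unfinished
# transaction, then project away the implementation variables.

def commit(q):
    if q["pending"]:
        return dict(q)
    return {"v": q["v"] + 1, "pending": True}

def finish(q):
    return {**q, "pending": False}

proj = lambda q: {"v": q["v"]}
clean = lambda q: not q["pending"]
spec_inc = lambda s: {"v": s["v"] + 1}
idle = lambda s: dict(s)

def aggr(q):
    # Completing the transaction here just clears "pending": the
    # specification-visible effect already happened at the commit step.
    return proj(finish(q))

# Property (3): on clean states aggr coincides with proj.
q = {"v": 3, "pending": False}
assert aggr(q) == proj(q)

# Property (2) checked from a clean state: commit maps to spec_inc
# and finish maps to the idle step.
assert aggr(commit(q)) == spec_inc(aggr(q))
assert aggr(finish(q)) == idle(aggr(q))
```

In a real protocol the completion part of aggr is far richer, but the shape is the same: finish what has committed, then project.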
Once a purported aggregation function has been defined, the user must prove that it
meets the commutative requirement (2). The proof can be done completely automatically
or consists of a sequence of standard steps.1 The initial ∀q and ∀N can be eliminated
automatically by substituting new symbolic constants throughout (when we do this in
this presentation, we will not change the name of the quantified variable). This yields a
subgoal of the form

    (N ∈ F) ⇒ aggr(N(q)) = M(N)(aggr(q)).        (4)
The number of subgoals is equal to the number of transition functions in the
implementation. In most cases, the required specification step M(N) is the idle step;
indeed, the only nonidle step is that which corresponds to the commit step in the
implementation. We have no global strategy for proving these theorems, although most
are very simple.
`The above discussion omits an important point, which is that not all states are
`worthy of consideration. Theorem (2) will generally not hold for some absurd states that
`cannot actually occur during a computation. Hence, it is usually necessary to provide
`an invariant predicate, which characterizes a superset of all the reachable states. If the
invariant is Inv, Theorem (2) can then be weakened to

    ∀q ∈ Q, ∀N ∈ F: Inv(q) ⇒ aggr(N(q)) = M(N)(aggr(q)).        (5)

In other words, aggr only needs to commute when q satisfies Inv. Use of an invariant
`incurs some additional proof obligations. First, we must prove that the initial states of
`the protocol satisfy Inv, and, second, that the implementation transition functions all
`preserve Inv.
`
`(4)
`
`(5)
`
`1 We base this comment on our use of the PVS theorem-prover, but we believe the same basic method
`would be used with others.
`
`
`
`362
`
`S. Park and D. L. Dill
`
`4. FLASH Cache Coherence Protocol
`
`The aggregation method has been used to verify the cache coherence protocol used in
`the Stanford FLASH multiprocessor [20], [15]. This section informally describes that
`protocol.
`The cache coherence protocol is directory-based so that it can support a large number
`of distributed processing nodes. Each cache line-sized block in memory is associated
`with a directory header which keeps information about the line. For a memory line, the
`node on which that piece of memory is physically located is called the home; the other
`nodes are called remote. The home maintains all the information about memory lines in
`its main memory in the corresponding directory headers.
`The system consists of a set of nodes, each of which contains a processor, caches, and
`a portion of the global memory. The distributed nodes communicate using asynchronous
messages through a point-to-point network. The state of a cached copy is either invalid,
shared (readable), or exclusive (readable and writable).
`
`4.1.
`
`Informal Description of the Protocol
`
`If a read miss occurs in a processor, the corresponding node sends out a GET request to the
`home (this step is not necessary if the requesting processor is in the home). Receiving the
`GET request, the home consults the directory corresponding to the memory line to decide
`what action the home should take. If the line is pending, meaning that another request
`is already being processed, the home sends a NAK (negative acknowledgment) to the
`requesting node. If the directory indicates there is a dirty copy in a remote node, then the
`home forwards the GET to that node. Otherwise, the home grants the request by sending
`a PUT to the requesting node and updates the directory properly. When the requesting
`node receives a PUT reply, which returns the requested memory line, the processor sets
`its cache state to shared and proceeds to read.
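The home's decision on an incoming GET, as described above, can be sketched as a small dispatch function (the dict encoding and return convention are ours, not FLASH code):

```python
# Sketch of the home processing a GET request for one memory line.

def home_on_get(dir_entry, requester):
    """Return the (message, destination) the home sends in response."""
    if dir_entry["pending"]:
        return ("NAK", requester)               # line busy: refuse
    if dir_entry["dirty"]:
        dir_entry["pending"] = True             # a third node must answer
        return ("GET", dir_entry["head_pointer"])
    dir_entry["sharer_list"].add(requester)     # update the directory
    return ("PUT", requester)                   # grant from main memory
```

The "update the directory properly" step is abbreviated here to recording the new sharer; the real handler maintains several more directory fields.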
`For a write miss, the corresponding node sends out a GETX request to the home.
`Receiving the GETX request, the home consults the directory. If the line is pending,
`the home sends a NAK to the requesting node. If the directory indicates there is a dirty
`copy in a third node, then the home forwards the GETX to that node. If the directory
`indicates there are shared copies of the memory line in other nodes, the home sends
`INVs (invalidations) to those nodes. At this point, the protocol depends on which of two
`modes the multiprocessor is running in: EAGER or DELAYED. In EAGER mode, the home
`grants the request by sending a PUTX to the requesting node; in DELAYED mode, this
`grant is deferred until all the invalidation acknowledgments are received by the home. If
`there are no shared copies, the home sends a PUTX to the requesting node and updates
`the directory properly. When the requesting node receives a PUTX reply which returns
`an exclusive copy of the requested memory line, the processor sets its cache state to
`exclusive and proceeds to write.
`During the read miss transaction, an operation called a sharing write-back is neces-
`sary in the following “three hop” case. This occurs when a remote processor in node R1
`needs a shared copy of a memory line, an exclusive copy of which is in another remote
`node R2. When the GET request from R1 arrives at the home H, the home consults the
`directory to find that the line is dirty in R2. Then H forwards the GET to R2 with the
`
`
`
`
`Fig. 1. Processing of a read miss (a GET request) in the FLASH protocol.
`
`source of the message faked as R1 instead of H. When R2 receives the forwarded GET,
`the processor sets its copy to shared state and issues a PUT to R1. Unfortunately, the
`directory in H does not have R1 on its sharer list yet and the main memory does not have
`an updated copy when the cached line is in the shared state. The solution is for R2 to
`issue an SWB (sharing write-back) conveying the dirty data to H with the source faked
`as R1. When H receives this message, it writes the data back to main memory and puts
`R1 on the sharer list. Figure 1 shows the processing of a read miss in the protocol.
`When a remote node receives an INV, it invalidates its copy and then sends an
`acknowledgment to the home. There is a subtle case with an invalidation. A processor
which is waiting for a PUT reply may, if that reply is delayed, get an INV for the shared
copy of the memory line before the copy itself arrives. In such a case, the
`requested line is marked as invalidated, and the PUT reply is ignored when it arrives.
`A valid cache line may be replaced to accommodate other memory lines. A shared
`copy is replaced by issuing a replacement hint to the home, which removes the remote
`from its sharers list. An exclusive copy is written back to main memory by a WB (write-
`back) request to the home. Receiving the WB, the home updates the line in main memory
`and the directory properly.
`
`4.2. Detailed Description of the Protocol
`
`Each cache line-sized block in main memory is associated with a directory header which
`keeps information about the line. The directory header consists of several Boolean flags:
`Local, Dirty, Pending, Head Valid, and List; a pointer to another node Head Pointer; a
`
`
`
`364
`
`S. Park and D. L. Dill
`
`set of pointers Sharer List;2 and the number of sharers in Real Pointers. The Local bit
`indicates if the local processor (in the home) contains a cached copy of the line in either
`shared or exclusive state. The Dirty bit is set if the home thinks that there is a dirty copy of
`the line in the system. The Pending bit is set if the current request for the memory line is
`being processed by a third node. The Head Valid bit indicates whether the Head Pointer
`contains a valid pointer to a node. The Head Pointer entry is simply a cache pointer that
`is stored in the directory header as an optimization. It keeps a pointer to a remote cache
`with a dirty copy if there is one, or one of the nodes with a shared copy. The List bit
`indicates whether Sharer List contains one or more pointers. Sharer List is represented
`abstractly as a set of pointers to the nodes that have a shared copy of the memory line.
`Real Pointers contains a count of the number of sharers in the Sharer List. This count
`excludes the Head Pointer and is mainly used to count invalidation acknowledgments.
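The directory header fields listed above can be summarized as a record; the field names follow the text, but this encoding is ours, not FLASH's actual data layout:

```python
from dataclasses import dataclass, field

# Sketch of one directory header, per cache line-sized memory block.
@dataclass
class DirectoryHeader:
    local: bool = False          # home's processor holds a cached copy
    dirty: bool = False          # a dirty copy exists in the system
    pending: bool = False        # a request is being processed elsewhere
    head_valid: bool = False     # head_pointer holds a valid node id
    list_bit: bool = False       # the List bit: sharer_list is nonempty
    head_pointer: int = 0        # dirty remote, or one sharing node
    sharer_list: set = field(default_factory=set)  # nodes with shared copies
    real_pointers: int = 0       # sharer count, excluding head_pointer
```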
`The FLASH protocol consists of a set of rules which are called handlers. Each
`handler is prefixed with NI (network interface) or PI (processor interface) to indicate
`where the requests are generated from. PI handlers are initiated by a requesting processor
`and NI handlers are initiated by a message from the network. The additional notation
`“Local” or “Remote” indicates whether the processing node is the home of the requested
`memory address or not. In the following, some handlers of the protocol for processing
`a read miss are listed; the rest of the handlers are shown in the Appendix.
• PI.Local.Get: this handler describes actions of the home when the local processor
needs a shared copy of a memory line. If Pending,3 the local processor is NAKed.
Otherwise, if Dirty, the home sends a GET request to Head Pointer and Pending is
set. Otherwise, the data in main memory is copied into the local cache (in shared
state) and Local is set.
• NI.Remote.Get: this handler describes actions of a remote node receiving a GET
request. If the cached data in the remote node is in exclusive state, it is changed
to shared and the node sends a PUT reply to the source of the GET (and also an SWB
to the home if the source is not the home). Otherwise, the node sends a NAK to
the source and a NAKC (NAK clear) to the home.
• NI.Local.Put: this handler processes a PUT reply to the home. Local is set, Dirty
and Pending are reset, and the shared copy is put into the local cache.
• NI.SharingWriteback: this handler describes actions of the home receiving an
SWB. Dirty and Pending are reset, List is set, Real Pointers is incremented, the
source is added to Sharer List, and the data is written back into main memory.
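As a concrete sketch, the NI.SharingWriteback handler described above can be written against a dict-encoded directory header and memory (the encoding is ours, not FLASH code):

```python
# Sketch of NI.SharingWriteback: the home absorbs a sharing write-back.

def ni_sharing_writeback(directory, memory, addr, source, data):
    """Reset Dirty and Pending, set List, add the source to the sharer
    list, increment Real_Pointers, and write the data back to memory."""
    d = directory[addr]
    d["dirty"] = False
    d["pending"] = False
    d["list"] = True
    d["sharer_list"].add(source)
    d["real_pointers"] += 1
    memory[addr] = data
```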
`
`5. Verification by Aggregating Distributed Transactions
`
`Using the aggregation method, we have formally verified the protocol at the level of
its formal description [35]. The protocol, consisting of more than a hundred different
implementation steps, has been reduced to a model with only six kinds of atomic
transactions. Based on the reduced atomic behavior, it is very easy to reason about the
protocol, checking safety properties and data consistency of cached copies.
In the following, we illustrate how the protocol is reduced to an atomic model by an
aggregation function. The detailed proofs are confirmed by a theorem-prover, and some
techniques to simplify the proof are presented.

2 Actually, the set of pointers is implemented as a linked list for sharers (within the home) by dynamic
pointer allocation [39].
3 That is, the Pending bit is set in the directory.
`
`5.1. Extracting Reduced Model of the Protocol
`
`Verification requires two descriptions of the same behavior: an implementation and a
`specification. Sometimes, there is an a priori specification as in the memory model
`verification in the next section. However, in most practical instances, there is only an
`implementation. In such cases, we extract a reduced model of the implementation using
`aggregation. The reduced model concisely captures the behavior of the implementation
`and serves as a specification.
`Recall that to use the aggregation method, we first decide which state variables
`should be considered specification variables. In cache coherence protocols, the consis-
`tency of multiple copies of a memory line is a function of the values and states of cached
`copies, and the corresponding value in main memory. Therefore, the specification vari-
`ables should be the state variables representing the data and states of cached copies and
`the data in main memory.
`We construct a reduced model of the protocol, which we use for a specification. The
`procedure is to trace through a transaction: (1) concatenating the implementation steps,
`(2) simplifying by substituting values forw