`
`
`In re Patent of: Morton et al.
`
`U.S. Patent No.: 7,296,121
Issue Date: Nov. 13, 2007
`Appl. Serial No.: 10/966,161
`Filing Date: Oct. 15, 2004
`Title: REDUCING PROBE TRAFFIC IN MULTIPROCESSOR SYSTEMS
`
`
`
`
`
`
`
`
`
`
`
`
`Case Nos.: IPR2015-00159
`
`
`IPR2015-00163
`
`
`
`
`
`
`
`OPPOSITION DECLARATION OF DR. ROBERT HORST
`
1. I have reviewed the “Patent Owner Motion to Amend” in IPR2015-00159,

the “Patent Owner Motion to Amend” in IPR2015-00163 and the
`
`“Declaration of Vojin Oklobdzija, Ph.D. in Support of Patent Owner’s Motion to
`
`Amend,” each filed on August 11, 2015. I also considered the references cited
`
`herein, including, for example: Michael John Sebastian Smith, APPLICATION-
`
`SPECIFIC INTEGRATED CIRCUITS (1997) (“Smith”) (Ex. 1008); Deposition
`
`Transcript of Dr. Vojin G. Oklobdzija Vol. 1, November 23, 2015 (Ex. 1026);
`
`Deposition Transcript of Dr. Vojin G. Oklobdzija Vol. 2, November 24, 2015 (Ex.
`
1027); David E. Culler et al., PARALLEL COMPUTER ARCHITECTURE: A

HARDWARE/SOFTWARE APPROACH (1st Ed.) (1998) (Ex. 1028); “InfiniBand Architecture
`
`Specification Volume 1 Release 1.0.a” (June 19, 2001) (Ex. 1029); and James
`
`Laudon and Daniel Lenoski, Proceedings of the 24th Annual International
`
`Symposium on Computer Architecture, “The SGI Origin: A ccNUMA Highly
`
`Scalable Server” (1997) (Ex. 1030). In my declaration, I am applying the
`
`Page 1 of 18
`
`Sony Corporation v. Memory
`Integrity, LLC
`IPR2015-00158
`EXHIBIT
`Sony-1020
`
`
`
`Attorney Docket No.: 39521-0007IP1
`U.S. Patent No. 7,296,121
`standards and legal principles that I applied when drafting the declaration entitled
`
`“Declaration of Dr. Robert Horst” dated October 28, 2014, which were outlined in
`
`paragraphs 8 and 40-61 of that document. Based on these principles and my
`
`expertise in the relevant technology, I provide the following description of prior art
`
`relevant to the amended claims.
`
I. The Culler Book and Laudon (Relevant to Claims 26-28)

2. The Culler Book includes a case study of the SGI Origin architecture.
`
`See Ex. 1028, p. 596. Similarly, Laudon is titled “The SGI Origin: A ccNUMA
`
`Highly Scalable Server” and describes the same SGI Origin architecture as
`
`described in the Culler Book’s case study. Ex. 1030, p. 1. Both the Culler Book
`
`and Laudon share similar system diagrams, particularly with regard to the Hub
`
`chip, which acts as the probe filtering unit in the SGI Origin architecture and is
`
`shown identically in FIG. 8.21 of the Culler Book and FIG. 6 of Laudon. See Ex.
`
`1028, p. 616; Ex. 1030, p. 245. Accordingly, it would have been obvious to a
`
`person of ordinary skill in the art to combine the teachings of the Culler Book and
`
`Laudon, as the combination of these references would have provided a more
`
`complete and confirmatory teaching of the SGI Origin architecture.
`
3. The Culler Book describes that “[t]he Origin system is composed of a
`
`number of processing nodes connected by switch-based interconnection network.”
`
`Ex. 1028, p. 597. As shown in FIG. 3 of Laudon (reproduced below), “[t]he
`
interconnection network has a hypercube topology,” which is one form of the “scalable
`
`point-to-point interconnection network[s]” on which the directory-based coherence
`
`protocols of Chapter 8 of the Culler Book are based. Ex. 1028, pp. 553, 597, 615;
`
`Ex. 1030, p. 243; see also Ex. 1026, 132:24-134:1 (admitting that a switch-based
`
`network is a “point-to-point architecture”). According to the Culler Book, “[e]very
`
`processing node contains two MIPS R10000 processors, each with first- and
`
`second-level caches, a fraction of the total main memory of the machine, an I/O
`
`interface, and a single communication assist or cache coherence controller, called
`
`the Hub, that implements the coherence protocol.” Ex. 1028, p. 597. Though the
`
`Culler Book describes each processing node as containing two processors, Laudon
`
`describes that “[e]ach node consists of one or two R10000 processors,” and the
`
`Culler Book even “assumes for simplicity that each node contains only one
`
`processor” when discussing the cache coherence protocol. Ex. 1030, p. 241; Ex.
`
`1028, p. 597. Accordingly, it would have been obvious to a POSITA to implement
`
`an SGI Origin machine in which each of multiple nodes has only a single
`
`processor, and the following section applies such a configuration to claims 26-34.
`
`
[Figure: FIG. 3 of Laudon, annotated “Point-to-Point Link Between Hub Chips (i.e., a Point-to-Point Architecture)”]
`
`
`Ex. 1030, p. 243 (showing point-to-point links between Hub chips, though, in the
`
`proposed combination, each “R” is a Hub chip and would only be associated with a
`
`single processor).
`
`4. Moreover, an SGI Origin system may contain “up to 512 nodes.” Ex.
`
`1028, p. 612. The following discussion assumes a combination of the Culler Book
`
`and Laudon with four single-processor nodes. However, the proposed combination
`
would be equally applicable to systems containing more or fewer than four nodes.
`
`Following is an adaptation of FIG. 8.15 of the Culler Book that illustrates the
`
`proposed combination and annotates the relevant components.
`
`
[Figure: adaptation of FIG. 8.15 of the Culler Book. Annotations: MIPS R10000 Processors (i.e., Plurality of Processing Nodes); Local Node; Home Node; Home Hub (i.e., PFU); Owner Node (labeled twice); Hypercube (i.e., point-to-point architecture)]
`
`
`
5. As described in the Culler Book, each of the processors in the SGI
`
`Origin architecture is associated with “first- and second-level caches.” Ex. 1028,
`
`pp. 597, 612. The caches of an SGI Origin machine “use[] the same MESI states
`
`as used in Chapter 5” of the Culler Book, which are “modified (M) or dirty,
`
`exclusive-clean (E), shared (S), and invalid (I).” Ex. 1028, pp. 598, 299. In the
`
`directory referenced by the Hub chip, these MESI states are represented by one of
`
`seven directory states. Ex. 1028, p. 598. For example, the “shared” directory state
`
`indicates “zero or more read-only cached copies whose whereabouts are indicated
`
`by [a] presence bit vector.” Id. In this “shared” directory state, the presence bit
`
`vector represents in which processors a memory line is stored in the shared (S)
`
`cache state and in which processors a memory line is in the not-present state or the
`
`invalid (I) state. On the other hand, “[a]n exclusive directory state means the block
`
`may be in either dirty or (clean) exclusive state in the cache (i.e., either the M or E
`
states of the MESI protocol).” Ex. 1028, p. 598. Thus, the information in the

directory referenced by the Hub chip is representative of each of the four MESI
`
`states.
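
By way of illustration only, the relationship between the directory states and the MESI cache states described above can be sketched in Python. This is a simplified model under stated assumptions, not the Origin implementation: it covers only the three directory states quoted from the Culler Book (unowned, shared, exclusive), it represents the presence bit vector as an integer, and the names DirState, DirectoryEntry, and possible_cache_states are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Mesi(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class DirState(Enum):
    # Only the three directory states quoted in the text are modeled here.
    UNOWNED = "unowned"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


@dataclass
class DirectoryEntry:
    state: DirState = DirState.UNOWNED
    presence_bits: int = 0        # bit i set => processor i holds a read-only copy
    owner: Optional[int] = None   # assumed owner identifier for the exclusive state


def possible_cache_states(entry: DirectoryEntry, proc: int) -> Set[Mesi]:
    """Return the MESI states processor `proc` may hold for this memory line."""
    if entry.state is DirState.EXCLUSIVE and entry.owner == proc:
        # An exclusive directory state covers both the M and E cache states.
        return {Mesi.MODIFIED, Mesi.EXCLUSIVE}
    if entry.state is DirState.SHARED and (entry.presence_bits >> proc) & 1:
        return {Mesi.SHARED}
    # Not present or invalid.
    return {Mesi.INVALID}
```

In this sketch, a shared entry with presence bits 0b0101 reports the shared (S) state for processors 0 and 2 and the invalid (I) state for every other processor.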
`
6. Additionally, each of the processors in the SGI Origin architecture
`
`supports “accesses that are under the control of the coherence protocol.” Ex. 1028,
`
`p. 607, n. 3. In addition, each “processor also supports memory operations that are
`
`not visible to the coherence protocol, called noncoherent memory operations, for
`
`which the system does not guarantee any ordering.” Id. These noncoherent
`
`operations include “uncached memory operations, I/O operations, and special
`
synchronization support.” Ex. 1028, p. 604. For example, a processor can use
`
`uncached references to a special I/O address space to “reference any physical I/O
`
`device in the machine.” Ex. 1028, p. 604.
`
`7. Moreover, “[a]ll cache misses, whether to local or remote memory, go
`
`through the Hub (which implements the coherence protocol), as do all uncached
`
`operations.” Ex. 1028, p. 612. In other words, as shown in the following
`
`annotated version of FIG. 8.21 of the Culler Book, both a processor and a Hub in
`
`the SGI Origin architecture are capable of communicating coherent messages (e.g.,
`
`cache misses) and noncoherent messages (e.g., uncached I/O operations) to other
`
components in the machine (e.g., each other), meaning that each contains a

“coherent protocol interface” and a “non-coherent protocol interface” under Patent

Owner’s own construction of these terms. See Motion to Amend, p. 16. In fact, it
`
`was common to support noncoherent accesses for I/O. When writing to an I/O
`
`device, the application in the processor must gain exclusive access to the entire I/O
`
`buffer. If the I/O buffer is implemented with a coherent cache, there is no benefit
`
`from having individual cache lines owned by different processors. Networks
`
`supporting both coherent and noncoherent accesses included HyperTransport as
`
`documented by the first reference under “Other Publications” on the face of the
`
’121 patent: HyperTransport™ Link Specification Revision 1.03,

HyperTransport™ Consortium, Oct. 10, 2001, Copyright © 2001 HyperTransport
`
`Technology Consortium, pages 38-40.
`
`
[Figure: annotated FIG. 8.21 of the Culler Book, with the following annotations]

Links that couple Hub to processor, which supports both coherent and non-coherent messages. See Ex. 1028, p. 607, n. 3.

Hub chip’s interface to processors (PI), which “implements the coherence protocol” and through which “all uncached operations” (i.e., non-coherent messages) go. See Ex. 1028, p. 612.
`
`
`
8. The following adaptation of FIG. 8.15 of the Culler Book illustrates
`
`the communication of two read requests (i.e., Request A and B) in the
`
`implementation of the SGI Origin system of the proposed combination. This flow
`
`is described in greater detail below.
`
`
`
`
[Figure: adaptation of FIG. 8.15 of the Culler Book. Annotations: Probes Correspond to Memory Lines (i.e., Read Requests); Selected Ones of the Processing Nodes (i.e., Owners); Probe Filtering Unit (i.e., Home Hub); Request A; Request B]
`
`
`
9. “The Hub chip is the heart of the machine.” Ex. 1028, p. 612.
`
`Laudon describes that the Hub chip in the SGI Origin architecture is an application
`
`specific integrated circuit (ASIC), which is a type of integrated circuit. Ex. 1030,
`
`p. 245. The Hub chip “implements the coherence protocol.” Id. The Hub chip
`
`“must . . . coordinate the activities and dependences of all the different types of
`
`transactions that flow through it from different components and implement the
`
`necessary pathways and control.” Ex. 1028, p. 614. “The Hub is divided into four
`
`major interfaces, one for each type of external entity that it connects together: the
`
`processor interface or PI, the memory/directory interface or MI, the network
`
`interface or NI, and the I/O interface or II (see Figure 8.21).” Ex. 1028, p. 615.
`
`10. Each of the interfaces of the Hub chip “communicate with one another
`
`through an on-chip crossbar switch.” Id. “A key property of the design is for each
`
`interface to shield its external entity from the details of other interfaces and entities
`
`(and vice versa).” Id. One example of this shielding is that, during read requests,
`
`the directory interface “treats a cache at the home just like any other cache; the
`
`only difference is that a ‘message’ between a home directory and a cache at home
`
`does not translate to a network transaction.” Ex. 1028, p. 599. This means that the
`
processor and its associated cache in the home node will not receive a read request
`
`sent to the home node’s Hub chip, unless the memory/directory interface of the
`
`Hub chip determines that the cache of the processor in the home node owns the
`
`requested cache line.
`
`11. For clarity, it is worth considering how a read request issued by a
`
`processor in a local/requesting node for a memory line associated with a separate
`
`home node flows through the proposed SGI Origin system when the requested
`
`memory line is owned in the Modified (M) or Exclusive (E) state by a processor in
`
`a separate owner node. According to the Culler Book, when “a processor issues a
`
`read that misses in its cache hierarchy . . . [, t]he address of the miss is examined
`
`by the local Hub to determine the home node, and a read request transaction is sent
`
`to the home node to look up the directory entry.” Ex. 1028, p. 599. This is
`
illustrated in the following annotation of FIG. 8.21 of the Culler Book, where A is
`
`the read request. This message flow corresponds to the yellow and red arrows
`
from the local/requesting processor to the Hub chip of the home node in the above
`
`adaptation of FIG. 8.15 of the Culler Book.
`
[Figure: annotated FIG. 8.21 of the Culler Book (Local/Requesting Hub). Annotation: A]
`
`
`
`12. According to the Culler Book, “[a]t the home, the data for the block is
`
`accessed speculatively in parallel with looking up the directory entry.” Ex. 1028,
`
`p. 599. In other words, the memory/directory interface of the Hub chip of the
`
`home node also accesses the main memory of the home node to speculatively
`
`retrieve the requested data from main memory in addition to looking up the
`
`directory entry.
`
`13. The memory/directory interface of the Hub chip of the home node
`
`contains a directory interface, which “contains the logic and tables that determine
`
`what protocol actions to take and hence implement the coherence protocol.” Ex.
`
`1028, p. 617. The directory stores directory information, including states, for each
`
`memory block stored in the memory of the node. See Ex. 1028, pp. 598, 609.
`
14. “At the directory, a block may be in one of seven states” including
`
`“unowned, or no cached copies in the system; shared, that is, zero or more read-
`
`only cached copies whose whereabouts are indicated by the presence vector; and
`
`exclusive, or one read-write cached copy in the system . . . .” Ex. 1028, p. 598
`
`(emphasis in original). “An exclusive directory state means the block may be in
`
`either dirty or (clean) exclusive state in the cache (i.e., either the M or E states of
`
the MESI protocol).” Id. When a requested memory block is in the exclusive
`
`directory state (i.e., either the M or E states of the MESI protocol) and the home is
`
`not the owner of the block, “the valid data for the block must be obtained from the
`
`owner and must find its way to the requestor as well as to the home (since the state
`
`will change to shared).” Ex. 1028, p. 599.
`
15. “The Origin protocol uses reply forwarding; the request is forwarded
`
`to the owner, which replies directly to the requestor, sending a revision message to
`
`the home” node. Id. In other words, the same request that was received by the
`
`Hub chip of the home node is forwarded to the owner processor. Importantly, in
`
`the example described here where the requested memory line is owned by a
`
`processor in a different node, neither the processor in the home node nor its
`
associated cache receives the read request, because the interface of the Hub chip
`
`shields its external entity from the details of other interfaces and entities.
`
`16. Moreover, the Culler Book describes that, “[i]f a block is in an
`
`exclusive state (i.e., modified or exclusive) in a processor cache, then the rest of
`
`the directory entry is not a bit vector with one bit turned on but rather contains an
`
`explicit pointer to the specific processor (not node).” Ex. 1028, p. 609. The
`
`memory/directory interface of the home Hub chip uses this pointer to address the
`
`read request to the owner node, and the network interface of the home Hub chip
`
`uses the address to forward the read request only to the specifically addressed
`
`processor. See Ex. 1028, pp. 617-18.
`
`17. The handling of a read request by the home Hub chip is illustrated in
`
the following annotation of FIG. 8.21 of the Culler Book, where A-in is the read
`
`request received from the local processor, A-out illustrates that “the request is
`
`forwarded to the owner,” and Spec is the speculative memory read returned to the
`
`Hub chip of the local node. In the above adaptation of FIG. 8.15 of the Culler
`
`Book, this message flow corresponds to the yellow and red arrows from the Hub
`
`chip of the home node to the processor indicated in the directory information to be
`
`the owner of the memory line.
`
`
[Figure: annotated FIG. 8.21 of the Culler Book (Remote Home Hub). Annotations: A-in; A-out; Spec; “Does not receive request”]



18. According to the Culler Book, when the owner processor receives the

request, the Hub chip associated with the owner ensures that a reply is sent to the
`
`local requesting processor and a revision message is sent to the Hub chip of the
`
`home node so that it can update the state information in the directory. See Ex.
`
`1028, pp. 599-600, 617-18. Because a read request in the Origin system elicits a
`
`response from the owner processor to maintain cache coherency in the system
`
`(e.g., the revision message sent to the home), the read request is a probe.
`
`19. As described above, the memory/directory interface of the home
`
`node’s Hub chip determines the owner processor with reference to the directory
`
`information stored in the directory and forwards the read request only to that owner
`
`processor. Accordingly, the Hub chip of the home node (probe filtering unit) is
`
`operable to receive read requests (i.e., probes corresponding to memory lines) from
`
`any of the processors of the system (i.e., the processing nodes) and to transmit the
`
`read requests only to selected ones of the processors that own the requested data
`
`(i.e., only to selected ones of the processing nodes) with reference to directory
`
`information (i.e., probe filtering information) representative of states associated
`
`with selected ones of the cache memories.
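
For illustration only, the filtering behavior described above can be sketched in Python and contrasted with an unfiltered broadcast. This is a minimal sketch under stated assumptions, not the Origin implementation: the names Entry, probe_targets, and broadcast_targets are hypothetical, processors are identified by small integers, and directory state updates (e.g., the transition to shared after the owner replies) are deliberately not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    state: str = "unowned"        # "unowned", "shared", or "exclusive"
    sharers: int = 0              # presence bit vector, used in the shared state
    owner: Optional[int] = None   # explicit pointer, used in the exclusive state


def probe_targets(entry: Entry, requester: int) -> List[int]:
    """Processors to which the home Hub forwards a read request (sketch)."""
    if entry.state == "exclusive" and entry.owner is not None \
            and entry.owner != requester:
        # Forward the request only to the processor named by the owner pointer.
        return [entry.owner]
    # Unowned or shared: in this sketch no processor cache is probed.
    return []


def broadcast_targets(num_procs: int, requester: int) -> List[int]:
    """Unfiltered alternative, for comparison: probe every other processor."""
    return [p for p in range(num_procs) if p != requester]
```

With four single-processor nodes and a line owned by processor 2, probe_targets returns only processor 2 for a request from processor 0, whereas broadcast_targets returns processors 1, 2, and 3.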
`
II. The Culler Book in view of Laudon and Smith (Relevant to Claims 29-34)

20. As I described previously, the Hub chip of the home node described
`
`by the Culler Book and Laudon is a probe filtering unit. Laudon describes that the
`
`Hub chip is an application specific integrated circuit (ASIC). Ex. 1030, p. 245.
`
`The Culler Book and Laudon do not specifically describe the process of designing
`
`and building an ASIC such as the Hub chip. However, Smith describes the process
`
`for designing and building an application-specific integrated circuit (ASIC). See
`
`Ex. 1008, p. 1.
`
`21. According to Smith, “Figure 1.10 shows the sequence of steps to
`
`design an ASIC; we call this a design flow.” Ex. 1008, p. 16. The steps included
`
`in FIG. 1.10 are as follows:
`
`1. Design entry. Enter the design into an ASIC design system, either
`using a hardware description language (HDL) or schematic entry.
`2. Logic synthesis. Use an HDL (VHDL or Verilog) and a logic
`synthesis tool to produce a netlist - a description of the logic cells and
`their connections.
`3. System partitioning. Divide a large system into ASIC-sized pieces.
`4. Prelayout simulation. Check to see if the design functions correctly.
`5. Floorplanning. Arrange the blocks of the netlist on the chip.
`6. Placement. Decide the locations of cells in a block.
`7. Routing. Make the connections between cells and blocks.
`8. Extraction. Determine the resistance and capacitance of the
`interconnect.
`9. Postlayout simulation. Check to see the design still works with the
`added loads of the interconnect.
`Id. at pp. 16-18. Accordingly, as part of designing an ASIC, such as the Hub chip
`
of the home node in the SGI Origin architecture described by the Culler Book and

Laudon, Smith describes using design software to lay out the design and simulate it.

See id. One of ordinary skill in the art would understand that this design software

and the design it produces are data structures stored on one or more computer-

readable media.
`
`22. Accordingly, the Culler Book in view of Laudon and Smith discloses
`
`that, as part of designing an ASIC, at least one computer-readable medium has data
`
`structures stored therein representative of the Hub chip of the home node.
`
`According to Smith, the data structures comprise a simulatable representation of
`
`the Hub chip of the home node, and the simulatable representation comprises a
`
`netlist. See Ex. 1008, pp. 17-18. Additionally, Smith describes that the data
`
`structures comprise a code description of the Hub chip of the home node and that
`
`the code description corresponds to a hardware description language. See id.
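
As a purely illustrative aid, the notion of a netlist as a stored, simulatable data structure can be sketched in Python. This is a toy example under stated assumptions and does not represent the Hub chip or any tool described by Smith: the cell names (AND2, OR2, INV), the net names, and the simulate function are hypothetical, and the cells are assumed to be listed in an order in which every input is computed before it is used.

```python
from typing import Callable, Dict, List, Tuple

# A netlist here is a list of logic cells: (cell type, input nets, output net).
Netlist = List[Tuple[str, List[str], str]]

# Hypothetical cell library mapping each cell type to its logic function.
CELL_FUNCS: Dict[str, Callable[..., int]] = {
    "AND2": lambda a, b: a & b,
    "OR2": lambda a, b: a | b,
    "INV": lambda a: 1 - a,
}


def simulate(netlist: Netlist, inputs: Dict[str, int]) -> Dict[str, int]:
    """Evaluate every cell once, in listed order, and return all net values."""
    nets = dict(inputs)
    for cell, in_nets, out_net in netlist:
        nets[out_net] = CELL_FUNCS[cell](*(nets[n] for n in in_nets))
    return nets


# Toy logic (hypothetical): forward = request AND (NOT home_is_owner)
toy_netlist: Netlist = [
    ("INV", ["home_is_owner"], "n1"),
    ("AND2", ["request", "n1"], "forward"),
]
```

Calling simulate(toy_netlist, {"request": 1, "home_is_owner": 0}) yields a value of 1 on the "forward" net in this toy model.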
`
`23. Moreover, Smith describes that masks are the tooling used to
`
`manufacture an ASIC. See Ex. 1008, p. 28. Thus, because the Hub chip of the
`
home node in the SGI Origin architecture described in the Culler Book and Laudon
`
`was implemented as an ASIC, as described above, Smith teaches that there must be
`
`a set of semiconductor processing masks representative of at least a portion of the
`
`memory controller.
`
24. It would have been obvious to a POSITA to combine the teachings of

Smith with the combination of the Culler Book in view of Laudon, because the Hub
`
`chip of the SGI Origin system is described as an ASIC, and the Smith reference
`
`describes general procedures for designing, implementing, and testing an ASIC.
`
`III. Conclusion
`
`I hereby declare that all statements made herein of my own knowledge are
`
`true and that all statements made on information and belief are believed to be true;
`
`and further that these statements were made with the knowledge that willful false
`
`statements and the like so made are punishable by fine or imprisonment, or both,
`
`under Section 1001 of Title 18 of the United States Code.
`
`
`Signature:
`
`Robert Horst, PhD
`
Date: [handwritten; illegible in source]
`