Exhibit B-4: “Architectural Adaptation for Application-Specific Locality Optimizations,” Xingbin Zhang et al., International Conference on Computer Design VLSI in Computers and Processors, 1997 (“Zhang”)
`
`As described in the following claim chart, the asserted claims of the ’867 patent are invalid in view of “Architectural Adaptation for
`Application-Specific Locality Optimizations,” Xingbin Zhang et al., International Conference on Computer Design VLSI in
`Computers and Processors, 1997 (“Zhang”).
`
`The citations presented herein are exemplary and not exclusive; each prior art reference as a whole discloses each and every limitation
`of the claims. A citation to a figure or figure reference numeral incorporates by reference the discussion and/or explication of such
`figure or feature/component referenced by the reference numeral. Further, the mapping in this chart is based on Amazon’s present
`understanding of Plaintiffs’ interpretation of the asserted claims of the patent-in-suit as reflected in Plaintiffs’ infringement
`contentions. Nothing in the chart should be regarded as necessarily reflecting how the prior art references would apply to claim
`elements of the asserted patent under a proper interpretation of the claims. Disclosures cited for dependent claims incorporate by
`reference the disclosure included herein for the corresponding independent claim.
`
`’867 patent claim 1
`
`Zhang
`
`A reconfigurable processor that instantiates an
`algorithm as hardware comprising:
`
`Zhang discloses a reconfigurable processor that instantiates an algorithm as
`hardware. For example:
`
`Zhang discloses an architecture with small blocks of programmable logic (e.g.,
`a reconfigurable processor) that can match an application, e.g., performing
`sparse matrix computations (e.g., instantiate an algorithm as hardware). Page
`151 (“We propose an architecture that integrates small blocks of programmable
`logic into key elements of a baseline architecture, including processing elements,
`components of the memory hierarchy, and the scalable interconnect, to provide
`architectural adaptation - the customization of architectural mechanisms and
`policies to match an application. … Using sparse matrix computations as examples,
`our results show that customization for application-specific optimizations can bring
`significant performance improvement (10X reduction in miss rates, 100X reduction
`in data traffic), and that an application-driven machine customization provides a
`promising approach to achieve robust, high performance.”)
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2022, p. 1
`
`
`
`
`
`
`a first memory having a first characteristic
`memory bandwidth and/or memory
`utilization;
`
`Zhang discloses a first memory having a first characteristic memory bandwidth
`and/or memory utilization. For example:
`
`Zhang discloses cache memories, including an L1 cache which has a transfer
`rate of 16B/5 cycles (e.g., a characteristic memory bandwidth). Page 152 (“As
`an example of its flexibility, MORPH could be used to implement either a cache-
`coherent machine, a non-cache coherent machine, or even clusters of cache
`coherent machines connected by put/get or message passing. In this paper, we
`focus on architectural adaptation in the memory system for locality optimizations
`such as latency tolerance.”); page 153 (“Figure 4 shows the prefetcher
`implementation using programmable logic integrated with the L1 cache.”)
`
`2
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2022, p. 2
`
`
`
`Exhibit B-4: “Architectural Adaptation for Application-Specific Locality Optimizations,” Xingbin Zhang et al., International
`Conference on Computer Design VLSI in Computers and Processors, 1997 (“Zhang”)
`
`
`
`
`
`3
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2022, p. 3
`
`
`
`Exhibit B-4: “Architectural Adaptation for Application-Specific Locality Optimizations,” Xingbin Zhang et al., International
`Conference on Computer Design VLSI in Computers and Processors, 1997 (“Zhang”)
`
`
`
`
`
`4
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2022, p. 4
`
`
`
`Exhibit B-4: “Architectural Adaptation for Application-Specific Locality Optimizations,” Xingbin Zhang et al., International
`Conference on Computer Design VLSI in Computers and Processors, 1997 (“Zhang”)
`
`
`
`and a data prefetch unit coupled to the first
`memory, wherein the data prefetch unit
`retrieves only computational data required by
`the algorithm from a second memory of
`second characteristic memory bandwidth
`and/or memory utilization and places the
`retrieved computational data in the first
`memory wherein the data prefetch unit
`operates independent of and in parallel with
`logic blocks using the computional [sic] data,
`and wherein at least the first memory and data
`prefetch unit are configured to conform to
`needs of the algorithm, and the data prefetch
`unit is configured to match format and
`location of data in the second memory.
`
`
`Zhang discloses a data prefetch unit coupled to the first memory, wherein the data
`prefetch unit retrieves only computational data required by the algorithm from a
`second memory of second characteristic memory bandwidth and/or memory
`utilization and places the retrieved computational data in the first memory wherein
`the data prefetch unit operates independent of and in parallel with logic blocks
`using the computational data, and wherein at least the first memory and data
`prefetch unit are configured to conform to needs of the algorithm, and the data
`prefetch unit is configured to match format and location of data in the second
`memory. For example:
`
`Zhang discloses a prefetcher (e.g., a data prefetch unit) implemented by
`programmable logic and integrated with the L1 cache (e.g., coupled to the first
memory). Page 153 (“Figure 4 shows the prefetcher implementation using programmable
`logic integrated with the L1 cache. The prefetcher requires two pieces of
`application-specific information: the address ranges and the memory layout of the
`target data structures. The address range is needed to indicate memory bounds
`where prefetching is likely to be useful. This is application dependent, which we
`determined by inspecting the application program, but can easily be supplied by the
`compiler. The program sets up the required information and can enable or disable
`prefetching at any point of the program. Once the prefetcher is enabled, however, it
`determines what and when to prefetch by checking the virtual addresses of cache
`lookups to check whether a matrix element is being accessed.”)
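The address-range trigger quoted above can be sketched as follows. This is a minimal illustrative model, not Zhang’s implementation: the class, field, and method names are assumptions, and the element stride stands in for the “memory layout of the target data structures” that the program supplies.

```python
# Illustrative model of Zhang's range-checked prefetch trigger: the program
# configures the target structure's address bounds and element stride, then
# the prefetcher watches cache-lookup addresses and issues a prefetch for
# the next element whenever a lookup falls inside the configured range.
# (All names here are hypothetical, for exposition only.)

class RangePrefetcher:
    def __init__(self, base, limit, stride):
        # Application-supplied configuration (address range + layout).
        self.base, self.limit, self.stride = base, limit, stride
        self.enabled = False          # program can enable/disable at any point
        self.issued = []              # prefetch addresses handed to memory

    def on_cache_lookup(self, addr):
        if not self.enabled:
            return
        # Trigger only inside the configured bounds, i.e. when a matrix
        # element (rather than arbitrary data) is being accessed.
        if self.base <= addr < self.limit:
            nxt = addr + self.stride
            if nxt < self.limit:
                self.issued.append(nxt)
```

Under this sketch, enabling the prefetcher and touching an in-range address causes the following element to be prefetched, while out-of-range lookups are ignored.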
`
`
Zhang discloses that the prefetcher accesses data that will be useful to the
application program (e.g., only computational data required by the algorithm)
`from an L2 cache (e.g., the second memory). Page 153 (“The prefetcher requires
`two pieces of application-specific information: the address ranges and the memory
`layout of the target data structures. The address range is needed to indicate memory
`bounds where prefetching is likely to be useful. This is application dependent,
`which we determined by inspecting the application program, but can easily be
`supplied by the compiler. The program sets up the required information and can
`enable or disable prefetching at any point of the program. Once the prefetcher is
`enabled, however, it determines what and when to prefetch by checking the virtual
`addresses of cache lookups to check whether a matrix element is being accessed.”);
`page 154 (“Our second case study uses a sparse matrix-matrix multiply routine as
`an example to show architectural adaptation to improve data reuse and reduce data
`traffic between the memory unit and the processor. The architectural customization
`aims to send only used fields of matrix elements during a given computation to
`reduce bandwidth requirement using dynamic scatter and gather. … The cache, in
`order to avoid conflict misses, is split into two parts: one small part acting as a
`standard cache for other requests and one part for the prefetched matrix elements
`only. The latter part has an application-specific management policy, and can be
`distinguished by mapping it to a reserved address space.”)
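The dynamic gather quoted above, which forwards only the used fields of each matrix element, can be sketched as follows. This is a hedged model under assumed data shapes: the dictionary-based element layout, field names, and function name are hypothetical and do not come from Zhang.

```python
# Illustrative model of Zhang's packing/gathering: sparse-matrix elements
# are linked-list nodes carrying several fields, but only the fields the
# current computation uses are chased, packed, and forwarded, reducing
# traffic between the memory unit and the processor.
# (Element layout and names are hypothetical.)

def gather_used_fields(elements, head, used_fields):
    """Chase the linked list starting at `head`, packing only `used_fields`
    of each element into a contiguous sequence."""
    packed = []
    idx = head
    while idx is not None:
        elem = elements[idx]
        # Forward only the fields the computation actually uses.
        packed.append(tuple(elem[f] for f in used_fields))
        idx = elem["next"]
    return packed
```

In this sketch, gathering only the `value` field of each element models the bandwidth reduction Zhang reports from sending “only used fields of matrix elements during a given computation.”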
`
`Zhang discloses that the L2 cache (e.g., the second memory) has a transfer rate
`of 8B/5 cycles (e.g., a characteristic memory bandwidth).
`
Zhang discloses that the prefetcher and L1 cache have an application-specific
management policy (e.g., are configured to conform to the needs of the algorithm)
`and that the prefetcher uses pointer chasing and packing/gathering to retrieve
`data from the second memory (e.g., matches the format and location of data in
`the second memory). Page 154 (“Our second case study uses a sparse matrix-
`matrix multiply routine as an example to show architectural adaptation to improve
`data reuse and reduce data traffic between the memory unit and the processor. The
`architectural customization aims to send only used fields of matrix elements during
`a given computation to reduce bandwidth requirement using dynamic scatter and
`gather. … The cache, in order to avoid conflict misses, is split into two parts: one
`small part acting as a standard cache for other requests and one part for the
`prefetched matrix elements only. The latter part has an application-specific
`management policy, and can be distinguished by mapping it to a reserved address
`space. The two main ideas are prefetching of whole rows or columns using pointer
`chasing in the memory module and packing/gathering of only the used fields of the
`matrix element structure. When the root pointer of a column or row is accessed, the
`gather logic in the main memory module chases the row or column pointer to
retrieve different matrix elements and forwards them directly to the cache.
…Because the data
`gathering changes the storage mapping of matrix elements, in order not to change
`the program code, a translate logic in the cache is required to present ‘virtual’
`linked list structures to the processor. When the processor accesses the start of a
`row or column linked list, a prefetch for the entire row or column is initiated.
`Because the target location in the cache for the linked list is known, instead of
`returning the actual pointer to the first element, the translate logic returns an
`address in the reserved address space corresponding to the location of the first
`element in the explicitly managed cache region. In addition, when the processor
`accesses the next pointer field, the request is also detected by the translate logic,
`and an address is synthesized dynamically to access the next element in this cache
`region.”)
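The translate logic quoted above can be sketched as follows. This is a minimal model under stated assumptions: the reserved base address, packed element size, and function names are all hypothetical; Zhang does not give these values.

```python
# Illustrative model of Zhang's translate logic: gathered elements sit
# contiguously in an explicitly managed cache region, so the "pointer"
# returned to the processor is a synthesized address in a reserved address
# space, and a next-pointer access yields the address of the following
# packed element rather than a real heap pointer.
# (Constants and names below are assumptions for exposition.)

RESERVED_BASE = 0x8000_0000   # assumed start of the reserved address space
ELEM_SIZE = 8                 # assumed size of one packed element

def translate_head(region_slot):
    # Instead of the actual pointer to the first element, return an address
    # in the reserved space corresponding to its packed location.
    return RESERVED_BASE + region_slot * ELEM_SIZE

def translate_next(addr, count):
    # Synthesize the address of the next packed element dynamically,
    # or signal end-of-list when the packed row/column is exhausted.
    index = (addr - RESERVED_BASE) // ELEM_SIZE
    return addr + ELEM_SIZE if index + 1 < count else None
```

The design point this models is that, because the target cache locations are known in advance, “virtual” linked-list pointers can be computed arithmetically without changing the program code that walks the list.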
`
`’867 patent claim 3
`
`Zhang
`
`The reconfigurable processor of claim 1,
`wherein the data prefetch unit receives
`processed data from on-processor memory
`and writes the processed data to an external
`off-processor memory.
`
`Zhang discloses the data prefetch unit receives processed data from on-processor
`memory and writes the processed data to an external off-processor memory.
`
`In addition, or alternatively, this limitation is disclosed in at least the following
`references (see corresponding chart for disclosure in each reference): Galbi, Phelps,
`Lange, Douglass, Paulraj, Mitra, and Baxter. At least for the reasons articulated in
`the accompanying pleading, one skilled in the art would have been motivated to
`combine each of these references with the present reference at the time the asserted
`patent was filed.
`
`’867 patent claim 4
`
`Zhang
`
`The reconfigurable processor of claim 1,
`wherein the data prefetch unit comprises at
`least one register from the reconfigurable
`processor.
`
`Zhang discloses the data prefetch unit comprises at least one register from the
`reconfigurable processor.
`
`In addition, or alternatively, this limitation is disclosed in at least the following
`references (see corresponding chart for disclosure in each reference): Phelps,
`Lange, Douglass, Paulraj, and Baxter. At least for the reasons articulated in the
`accompanying pleading, one skilled in the art would have been motivated to
`combine each of these references with the present reference at the time the asserted
`patent was filed.
`
`
`
`10
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2022, p. 10
`
`