UNITED STATES PATENT AND TRADEMARK OFFICE

BEFORE THE PATENT TRIAL AND APPEAL BOARD

INTEL CORPORATION,
Petitioner,

v.

FG SRC LLC,
Patent Owner.

IPR2020-01449
Patent No. 7,149,867

PATENT OWNER FG SRC LLC’S RESPONSE TO
PETITION FOR INTER PARTES REVIEW OF U.S. PATENT NO. 7,149,867

TABLE OF CONTENTS

I.    INTRODUCTION ................................................................ 1

II.   RELATED PROCEEDINGS ................................................. 1

III.  SRC BACKGROUND .......................................................... 1

IV.   TECHNOLOGY BACKGROUND ........................................ 2
      A. Reconfigurable Processors and FPGAs ....................... 2
      B. Memory Hierarchies ...................................................... 3
      C. Prefetching .................................................................... 4

V.    THE ’867 PATENT ............................................................. 5
      A. The Invention Of The ’867 Patent ................................ 6
      B. Prefetching .................................................................... 9
      C. The ’867 Patent Discloses The Exact Technology Of Chien,
         Zhang, And Gupta In Its “Relevant Background” Discussion ... 10

VI.   THE ASSERTED PRIOR ART REFERENCES .................. 12
      A. Chien (Ex. 1005) ......................................................... 13
      B. Zhang (Ex. 1003) ........................................................ 15
      C. Gupta (Ex. 1004) ........................................................ 18

VII.  PETITIONER HAS FAILED TO MEET ITS BURDEN IN
      ESTABLISHING ZHANG, GUPTA, AND CHIEN AS PRINTED
      PUBLICATIONS ............................................................... 19
      A. Legal Standards for Establishing Printed Publication ... 19
      B. Petitioner Has Not Met The Standards For Establishing Zhang,
         Gupta and Chien as Printed Publications .................. 21
         1. Petitioner’s Conference Distribution Theory ......... 22
         2. Petitioner’s IEEE Xplore Website Theory ............. 24
         3. Petitioner’s Online Library Records Theory ......... 25
         4. Each of Petitioner’s Grounds Therefore Fails ...... 29

VIII. PATENT OWNER’S CLAIM CONSTRUCTIONS ............. 29
      A. Agreed Terms .............................................................. 29
      B. Terms to be construed ................................................ 31
         1. “retrieves only computational data required by the algorithm
            from a second memory… and places the retrieved
            computational data in the first memory” ................ 31
         2. “read and write only data required for computations by the
            algorithm between the data prefetch unit and the common
            memory” .................................................................. 33

IX.   PETITIONER HAS FAILED TO DEMONSTRATE A REASONABLE
      LIKELIHOOD OF PREVAILING AS TO ANY CHALLENGED
      CLAIM .............................................................................. 33
      A. Ground 1: Claims 1-2, 4-8 And 13-19 Are Not Obvious Over
         Zhang And Gupta ........................................................ 33
         1. The combination does not render obvious a “reconfigurable
            processor that instantiates an algorithm as hardware.” ... 33
         2. The combination does not render obvious a “data prefetch
            unit.” ........................................................................ 38
         3. The combination does not render obvious a data prefetch unit
            “wherein the data prefetch unit retrieves only computational
            data required by the algorithm.” ............................. 40
         4. The combination does not render obvious a first memory and
            a data prefetch unit “wherein at least the first memory and
            data prefetch unit are configured to conform to needs of the
            algorithm.” ............................................................... 44
         5. The combination does not render obvious a data prefetch unit
            “configured to match format and location of data in the
            second memory.” .................................................... 49
         6. The combination does not render obvious a memory
            controller that “transmits only portions of data desired by the
            data prefetch unit and discards other portions of data prior to
            transmission of the data to the data prefetch unit.” ... 50
         7. The combination does not render obvious a “reconfigurable
            processor” as required by claim 13 ....................... 51
         8. The combination does not render obvious a reconfigurable
            processor “wherein the computational unit and the data
            access unit, and the data prefetch unit are configured to
            conform to needs of an algorithm implemented on the
            computational unit and transfer only data necessary for
            computations by the computational unit” as required by
            claim 13 .................................................................. 52
      B. Ground 2: Claims 3 And 9-12 Are Not Obvious Over Zhang,
         Gupta, And Chien ........................................................ 53
         1. The combination does not render obvious a “reconfigurable
            processor[] that can instantiate an algorithm as
            hardware.” ............................................................... 53
         2. The combination does not render obvious a “data prefetch
            unit.” ........................................................................ 54
         3. The combination does not render obvious “a data prefetch
            unit to read and write only data required for computations by
            the algorithm.” ......................................................... 54
         4. The combination does not render obvious a data prefetch unit
            configured to “match format and location of data in the
            common memory.” ................................................. 55
         5. The combination does not render obvious a memory
            controller that “transmits to the prefetch unit only data
            desired by the data prefetch unit as required by the
            algorithm.” ............................................................... 56
         6. The combination does not render obvious a “reconfigurable
            processor [that] also includes a computational unit” as
            required by claim 11 ............................................... 57

X.    SECONDARY CONSIDERATIONS OF NON-OBVIOUSNESS ... 57

XI.   CONCLUSION .................................................................. 59
TABLE OF AUTHORITIES

CASES:

Acceleration Bay, LLC v. Activision Blizzard, Inc.,
   908 F.3d 765 (Fed. Cir. 2018) ............................... 20, 21, 24, 25

Acme Scale Co. v. LTS Scale Co., LLC,
   615 F. App’x 673 (Fed. Cir. 2015) ..................................... 14

Elan Pharm., Inc. v. Mayo Found. for Med. Educ. & Research,
   346 F.3d 1051 (Fed. Cir. 2003) .......................................... 14

Intelligent Bio-Systems, Inc. v. Illumina Cambridge Ltd.,
   821 F.3d 1359 (Fed. Cir. 2016) .......................................... 28

Jazz Pharm., Inc. v. Amneal Pharm., LLC,
   895 F.3d 1347 (Fed. Cir. 2018) .......................................... 19

Phillips v. AWH Corp.,
   415 F.3d 1303 (Fed. Cir. 2005) (en banc) ......................... 29

Samsung Elec. Co. v. Infobridge Pte. Ltd.,
   929 F.3d 1363 (Fed. Cir. 2019) .......................................... 19

SRI Int’l, Inc. v. Internet Sec. Sys., Inc.,
   511 F.3d 1186 (Fed. Cir. 2008) .......................................... 19

Voter Verified, Inc. v. Premier Election Solutions, Inc.,
   698 F.3d 1374 (Fed. Cir. 2012) .......................................... 24

ADMINISTRATIVE ORDERS:

Hulu, LLC v. Sound View Innovations, LLC,
   Case IPR2018-01039, Paper 29 (PTAB POP Dec. 20, 2019) ... 19, 27, 28

STATUTES:

35 U.S.C. § 103 ..................................................................... 1
I. INTRODUCTION

Patent Owner FG SRC LLC (hereinafter “SRC” or “Patent Owner”) respectfully submits this Patent Owner Response to the Petition for Inter Partes Review dated August 10, 2020 (“Petition”) of U.S. Patent No. 7,149,867 (Ex. 1001, “’867 patent”) filed by Intel Corporation (“Intel” or “Petitioner”). Petitioner asserts that claims 1-19 of the ’867 patent are unpatentable based solely on 35 U.S.C. § 103:

Ground 1 – Claims 1-2, 4-8, and 13-19 are unpatentable as obvious over Zhang in view of Gupta as understood by one of ordinary skill in the art.

Ground 2 – Claims 3 and 9-12 are unpatentable as obvious over Zhang in view of Gupta and Chien as understood by one of ordinary skill in the art.

This Patent Owner Response is timely filed based on the Parties’ Joint Stipulation to Revise Scheduling Order of June 18, 2021. See Paper 32.

II. RELATED PROCEEDINGS

Related Proceedings are listed in Paper 1, at 2.

III. SRC BACKGROUND

Patent Owner’s predecessor, SRC Computers, was founded in 1996 by Jon Huppenthal, Jim Guzy, and Seymour Robert Cray (hence SRC). Ex. 2005, ¶¶ 36-37; Ex. 2002. Mr. Cray—widely considered to be the father of supercomputing—designed a series of computers that for decades were the fastest in the world. Ex. 2002; Ex. 2005, ¶ 40. SRC’s patent portfolio is a direct result of this work.

IV. TECHNOLOGY BACKGROUND

A. Reconfigurable Processors and FPGAs

The ’867 patent relates to the use of reconfigurable processors, such as those made using FPGAs. Ex. 1001, 1:16-24, 5:26-29. An FPGA is a reprogrammable integrated circuit that contains an array of programmable logic blocks and memory elements connected via programmable interconnect. Ex. 2028, ¶ 45; Ex. 2010, ¶ 14. A user can program an FPGA to perform a specific function by configuring the logic blocks and interconnect. This enables the user to create a hardware-accelerated implementation of an algorithm by programming the FPGA in a manner that efficiently executes the algorithm. Id. In other words, with a reconfigurable processor such as an FPGA, the hardware can adapt to the algorithm. An FPGA is configured by loading a file called a bitstream into the FPGA. Id. Reconfigurable processors instantiate hardware to directly perform the required task, and do not rely on “instructions” the way a general-purpose CPU does. For this reason, the term “instruction” does not have a plain and ordinary meaning with respect to reconfigurable processors. Id.

In contrast, a CPU executes an algorithm by performing a sequence of instructions that implement the algorithm. Id.; Ex. 2010, ¶ 15. A different algorithm can be implemented on the CPU by changing the instruction sequence. Id. The CPU is flexible; it can implement almost any algorithm. Id. But because the CPU hardware is fixed, it cannot be customized to the needs of any particular algorithm the way an FPGA can. Id. These customizations allow FPGA implementations to be orders of magnitude more efficient than implementing the same algorithm as software on a CPU. Id.
B. Memory Hierarchies

Computing systems including CPUs and FPGAs typically employ a memory hierarchy, which combines different types of memories in an attempt to ensure that data required for computation is available as needed. Id., ¶ 18. There is a general trade-off between memory size and bandwidth. Id. In general, larger memories have lower bandwidth, i.e., they can store a lot of data but the rate at which they can transfer this data (bits/second) is low. Id. Smaller memories have much higher bandwidth. Id. Thus, memory systems commonly use hierarchies of progressively faster (higher bandwidth) but smaller memories. Id.

The ’867 patent discusses memory throughout the claims and specification. For example, claim 1 recites moving data from a “second memory” to a “first memory” within a memory hierarchy. This is akin to the concepts described above, in which data can be moved from a slower, larger memory (e.g., a second memory) to a quicker, smaller memory (e.g., a first memory). Id., ¶¶ 20-21.
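For illustration only, the two-level size/bandwidth trade-off described above can be modeled in a few lines of Python. This sketch is not drawn from the record or the ’867 patent; the class and attribute names (TwoLevelMemory, slow_accesses, etc.) are invented for the example. It simply shows that once a word has been staged in the small first memory, later accesses avoid trips to the large second memory.

```python
# Minimal two-level memory model: a small, fast "first memory" in front of
# a large, slow "second memory". All names here are illustrative.

class TwoLevelMemory:
    def __init__(self, first_capacity):
        self.first = {}                   # small, fast first memory (e.g., on-chip)
        self.first_capacity = first_capacity
        self.second = {}                  # large, slow second memory (e.g., off-chip)
        self.slow_accesses = 0            # trips to the second memory

    def store(self, addr, value):
        self.second[addr] = value         # data originates in the large memory

    def move_to_first(self, addr):
        """Move one word from the second memory into the first memory."""
        if len(self.first) >= self.first_capacity:
            self.first.pop(next(iter(self.first)))   # evict the oldest entry
        self.slow_accesses += 1
        self.first[addr] = self.second[addr]

    def load(self, addr):
        if addr in self.first:            # fast path: data already staged
            return self.first[addr]
        self.move_to_first(addr)          # slow path: fetch from second memory
        return self.first[addr]

mem = TwoLevelMemory(first_capacity=4)
for a in range(8):
    mem.store(a, a * 10)

mem.load(3)                  # first access: one slow trip
mem.load(3)                  # second access: served from the first memory
print(mem.slow_accesses)     # -> 1
```

The point of the sketch is only the asymmetry it exhibits: repeated use of the same data incurs a single slow transfer, which is the locality that the hierarchy exists to exploit.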
The invention of the ’867 patent specifically “relates to implementing explicit memory hierarchies in reconfigurable processors that make efficient use of off-board, on-board, on-chip storage and available algorithm locality. These explicit memory hierarchies avoid many of the tradeoffs and complexities found in the traditional [implicit] memory hierarchies of microprocessors.” Ex. 1001, 1:18-24.

Implicit memory devices encompass a family of processing elements that are all implicitly controlled and typically are made up of fixed logic that is not altered by the user. These devices execute software-directed instructions on a step-by-step basis in fixed logic having predetermined interconnections and functionality. Ex. 2028, ¶ 112.

Explicit memory devices, on the other hand, are Direct Execution Logic (DEL) and comprise a family of components that is explicitly controlled and is typically reconfigurable. This set of elements enables a program to establish an optimized interconnection among the selected functional units in order to implement a desired computational, prefetch, and/or data access functionality for maximizing the parallelism inherent in the particular code. Id., ¶ 29.
C. Prefetching

A simple (unoptimized) memory system would have a processor that requests data only when it is required for computation. Ex. 2010, ¶ 22. This can be problematic, especially if the data resides in off-chip memory, which has a large latency, i.e., a large number of cycles (e.g., hundreds or more) to retrieve the data. Id. This forces the computational unit to stall, or wait, while the data is being loaded. Id. This problem is addressed by a “prefetch unit,” which fetches data before it is needed by the processor. Ex. 2028, ¶ 47.

Prefetching initiates a request for data before that data is required. In the ideal case, the prefetched data arrives no later than when it is required. Ex. 2028, ¶ 47; Ex., ¶ 26. Generally speaking, there are two ways of prefetching data: (1) dynamically and (2) statically. Id. Dynamic prefetching attempts to guess what future data is required by looking at past data access requests. Id. For example, a dynamic prefetch unit may see a request for some data and prefetch the next N data elements located spatially nearby the initial data (with the hope that the algorithm will request this data in the future). Id. Static prefetching techniques insert explicit prefetch instructions into the computer system, e.g., a compiler will analyze the algorithm and insert prefetch requests before the data is computed upon. Id. There are many types of prefetching techniques, and customizing the prefetching technique to the algorithm can provide significant overall performance benefits. Id.
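The dynamic, next-N strategy described above can be sketched in a few lines. This is an illustrative model only, not part of the record: the class name SequentialPrefetcher and the counters are invented for the example. It shows why such a prefetcher works well for sequential sweeps (only the first access stalls) and poorly for strided or irregular access patterns.

```python
# Illustrative next-N ("sequential") dynamic prefetcher. On each demand
# access, it guesses that the next n spatially adjacent elements will be
# needed soon and stages them ahead of time. Names are invented here.

class SequentialPrefetcher:
    def __init__(self, n):
        self.n = n                    # prefetch depth: next n elements
        self.prefetched = set()       # addresses already staged near the processor
        self.demand_fetches = 0       # fetches the processor had to wait for

    def access(self, addr):
        if addr not in self.prefetched:
            self.demand_fetches += 1  # processor stalls for this one
        # guess: the next n addresses will be requested in the future
        for a in range(addr + 1, addr + 1 + self.n):
            self.prefetched.add(a)

pf = SequentialPrefetcher(n=4)
for addr in range(16):                # sequential sweep, as in a streaming loop
    pf.access(addr)

print(pf.demand_fetches)              # only the very first access misses -> 1
```

A static scheme would instead have the compiler insert the prefetch requests at fixed points derived from analyzing the algorithm, rather than guessing from the access history at run time.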
V. THE ’867 PATENT

The field of the invention of the ’867 patent is reconfigurable hardware. Ex. 1001, 1:16-18. “More specifically, the invention relates to implementing explicit memory hierarchies in reconfigurable processors that make efficient use of off-board, on-board, on-chip storage and available algorithm locality. These explicit memory hierarchies avoid many of the tradeoffs and complexities found in the traditional [implicit] memory hierarchies of microprocessors.”
A. The Invention Of The ’867 Patent.

To improve upon the limitations of the prior art, the ’867 patent discloses a flexible yet efficient, fully reconfigurable hardware system consisting of computational units, data prefetch units, data access units, and memory. Ex. 1001, Abstract. The parts are fully reconfigurable, meaning they can be configured as needed for a particular algorithm. Id. Once properly configured, the data prefetch unit retrieves data from a memory and supplies the data through a data access unit to the computational units in a way optimally adapted to the needs of the particular algorithm. Id.

Computer programs are a collection of algorithms that interact to implement desired functionality. Ex. 1001, 6:32-34. In the prior art, use of static computing hardware resources (e.g., a conventional microprocessor) required the computer program to be adapted to run on the particular hardware platform. Ex. 1001, 6:40-42. “In this manner, the computer program is adapted to conform to the limitations of the static hardware platform.” Id., 6:42-43.

The ’867 patent effectively flips the paradigm and allows software written in a human-readable high-level language to be compiled into direct execution logic (DEL). The reconfigurable processor is then configured with the DEL. Ex. 1001, 6:52-54. “In this manner, the hardware resources are essentially adapted to conform to the program rather than the program being adapted to conform to the hardware resources.” Ex. 1001, 6:54-57.

Figure 1 represents a reconfigurable processor (RP). Ex. 1001, Fig. 1.

The ’867 patent further recognized that “[h]igh memory bandwidth efficiency is achieved when only data required for computation is moved within the memory hierarchy.” Ex. 1001, 7:23-25. The ’867 patent further teaches use of a data prefetch unit that “operates independently of other functional units . . . . This independence of operation permits hiding the latency associated with obtaining data for use in computation.” Ex. 1001, 7:36-42. An RP using a reconfigurable data prefetch unit is illustrated in Figure 4. Ex. 1001, Fig. 4 (emphasis added).

The data prefetch unit is reconfigurable and can thus be adapted to the particular needs of a given program:

    An important feature of the present invention is that many types of
    data prefetch units can be defined so that the prefetch hardware can be
    configured to conform to the needs of the algorithms currently
    implemented by the computational logic. The specific characteristics
    of the prefetch can be matched with the needs of the computational
    logic and the format and location of data in the memory hierarchy.

Ex. 1001, 7:49-55. Similarly, the memory hierarchy is reconfigurable and can also be adapted to the particular needs of a given program:

    Unlike conventional static hardware platforms, however, the memory
    hierarchy provided in a RP 100 is reconfigurable. In accordance with
    the present invention, through the use of data access units and
    associated memory hierarchy components, computational demands
    and memory bandwidth can be matched.

Ex. 1001, 7:17-22 (emphasis added). The advantages achieved by the paradigm shift of the ’867 patent are tremendous and enable a programmable memory mechanism with up to 100% bandwidth efficiency and 100% bandwidth utilization:

    The scaleable, programmable memory mechanisms enabled by the
    present invention are available to exploit available algorithm locality
    and thereby achieve up to 100% bandwidth efficiency. In addition,
    the scaleable computational resources can be leveraged to attain
    100% bandwidth utilization.

Ex. 1001, 12:18-29 (emphasis added). The ’867 patent was truly a pioneer patent, well ahead of its time.
B. Prefetching

Prefetching is a key concept in the patent, as every claim requires a “data prefetch unit” which prefetches data from a second or common memory and “operates independent of and in parallel with logic blocks using the [computational data].” The purpose of prefetching data is to ensure that it is available when it is needed, in order to reduce latency. Ex. 2028, ¶ 67. Thus, the data prefetch unit must be configured to know in advance what data to prefetch. Id.

The data prefetch unit specifically seeks to reduce the overhead involved in prefetching data by avoiding transferring unnecessary data between memories, i.e., the prefetch unit copies only the data which are to be used in upcoming computations. E.g., Ex. 1001, Claim 1. The patent is clear that the data prefetch unit moves computational data between two memories in a memory hierarchy. E.g., Ex. 1001, Claim 1. The data prefetch unit “conforms to the needs of the algorithm” to improve the performance of the reconfigurable processor and the overall computing system.
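The "only data required for computations" concept can be made concrete with a short, purely illustrative sketch. Nothing below is drawn from the record: the record layout, the field names (value, row, col, debug_tag), and the helper prefetch_only_required are all invented for the example. It shows a prefetch step that moves only the fields an algorithm uses, rather than whole fixed-width records.

```python
# Hypothetical illustration: records in a larger "second memory" carry a
# field the computation never uses; the prefetch step copies only the
# fields the algorithm needs into the "first memory". Layout is invented.

second_memory = [
    {"value": 1.5,  "row": 0, "col": 2, "debug_tag": "a"},
    {"value": 2.0,  "row": 1, "col": 0, "debug_tag": "b"},
    {"value": 3.25, "row": 1, "col": 3, "debug_tag": "c"},
]

def prefetch_only_required(records, required_fields):
    """Stage only the fields the computation needs."""
    return [{f: r[f] for f in required_fields} for r in records]

# A multiply kernel needs value/row/col but never the debug tag, so the
# prefetch step never moves that field across the memory interface.
first_memory = prefetch_only_required(second_memory, ("value", "row", "col"))
print(first_memory[0])   # -> {'value': 1.5, 'row': 0, 'col': 2}
```

The contrast with a conventional cache is that a cache line of fixed width would carry every byte of each record, used or not, across the memory interface.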
C. The ’867 Patent Discloses The Exact Technology Of Chien, Zhang, And Gupta In Its “Relevant Background” Discussion.

Chien, Zhang and Gupta present nothing more than prior art that was overcome in the prosecution of the ’867 patent. Ex. 2028, ¶ 51. The ’867 patent particularly focused on the challenge of designing memory hierarchies that could keep up with ever-increasing processor speeds. Ex. 1001, 1:32-35.

    Two measures of the gap between the microprocessor and memory
    hierarchy are bandwidth efficiency and bandwidth utilization.
    Bandwidth efficiency refers to the ability to exploit available locality
    in a program or algorithm. In the ideal situation, when there is
    maximum bandwidth efficiency, all available locality is utilized.
    Bandwidth utilization refers to the amount of memory bandwidth
    that is utilized during a calculation. Maximum bandwidth utilization
    occurs when all available memory bandwidth is utilized.

Ex. 1001, 1:34-39 (emphasis added). “There has been significant effort spent on the development of memory hierarchies that can maintain high bandwidth efficiency and utilization with faster microprocessors.” Id. This is the area that Chien, Zhang, and Gupta address—the customization of cache algorithms for microprocessor use. Ex. 2028, ¶¶ 36, 73. Reconfigurable logic is contemplated only in small portions of the cache architecture. Id.

At the relevant time, cache design for multipurpose microprocessors capable of executing a wide variety of programs was particularly challenging in the relevant art. Ex. 1001, 3:30-32. Cache designers tried to derive the behavior of an “average” program constructed from several actual programs that ran on the microprocessor. Id., 3:32-35. This resulted in the cache being optimized for the average program, but no actual program behaves exactly like the average program. Id., 3:35-36. The inventors of the ’867 patent thus recognized the need for memory hierarchies that have data storage and retrieval characteristics that are optimized for the actual programs executed by a processor. Ex. 1001, 3:49-51.

Indeed, the background section of the ’867 patent discusses the exact type of work done in Chien, Zhang, and Gupta: adjustments to the width of the cache line in a conventional microprocessor architecture. Ex. 2028, ¶ 121. In all three cited references, the cache line width could be changed to be either 32 or 64 bytes. Ex. 2028, ¶¶ 62, 83, 86, 121. The ’867 patent discloses the exact variations of cache architecture that Chien, Zhang, and Gupta attempted:

    Caches designed with wide cache lines perform well with programs
    that have a high degree of spatial locality, but generally have poor
    gather/scatter performance. Likewise, caches with short cache lines
    have good gather/scatter performance, but loose efficiency executing
    programs with high spatial locality because of the additional runs to
    the main memory.

Ex. 1001, 2:1-13. These kinds of tweaks are implemented in Chien, for example, which explores limited application-driven customizability, specifically regarding “optimizing cache granularity for performance.” Ex. 1005, at 13. The ’867 patent further discusses the technology used in the second case study in Zhang: “Newer cache designs reduce the frequency of cache misses by trying to predict in advance the data that the microprocessor will request.” Ex. 1001, 3:16-18. This is exactly what Zhang’s second example describes: a “predictive” prefetch of a matrix element that will “likely” be used in the future. Ex. 2028, ¶ 123. As this is performed for a cache used by an algorithm running on a microprocessor (or CPU), this is definitively what is described in the ’867 patent as “newer cache designs” in the microprocessor realm. Id. Unlike Intel’s three references, the ’867 patent improves upon that prior art by using a fully reconfigurable approach, including a reconfigurable first memory able to match the requirements of the data being cached. Id., ¶ 124. In Zhang, for example, each matrix element requires 40 bytes of storage, but the centralized L1 cache is only configurable to either 32 or 64 bytes. The ’867 patent, on the other hand, teaches reconfiguring the first memory (the L1 cache in Zhang) as multiple FIFO streams of the required width and depth in close proximity to multiple computational units performing matrix multiplication calculations in parallel, one whole row/column at a time. Id., ¶¶ 88, 121.
VI. THE ASSERTED PRIOR ART REFERENCES

The three prior art references cited in Intel’s petition are related. The initial concept work is described in Chien. Zhang builds on this concept work by performing simulation studies for one of the architectural adaptations envisioned in Chien. Ex. 2028, ¶ 98. Gupta then builds on this work and proposes a prototype board for the type of architectural adaptation simulated in Zhang. Id., ¶ 100.
A. Chien (Ex. 1005)

Chien’s Multiprocessor with Reconfigurable Parallel Hardware (MORPH) concept is a theoretical discussion of a flexible 100-TeraOp architecture with 8192 processing nodes and memory elements embedded in a scalable interconnect. Ex. 1005, at 7, 10. Each of these processing nodes also includes cache memory (L1, L2) that can be shared among CPUs. Id. Chien recognizes the benefits of reconfigurable logic to an extent, but nevertheless, the use of reconfigurable FPGA logic in MORPH is intended only for small local blocks of customization. Id., at 7. Indeed, MORPH teaches away from the use of reconfigurable logic for application-specific functional units or computational logic. Id., at 9 (“MORPH’s design exploits small blocks of reprogrammable interconnect logic to achieve application customizability rather than application-specific functional units or compute elements.” (emphasis added)). Unlike the ’867 patent, Chien therefore relies on conventional, non-reconfigurable CPUs. The MORPH concept only envisions reconfigurable logic in a few places—such as the cache memory subsystem. Id., at 10 (“Our solution is a machine architecture which leverages a small amount of programmable logic in several key places to implement a flexible hardware composition structure.”). In Chien, the algorithm is not instantiated as hardware (unlike a reconfigurable processor). The only contemplated use of reconfigurable logic in Chien (as well as in the follow-up work of Zhang and Gupta) is to implement reconfigurable logic in “support” algorithms, making the actual topics of research easier to vary—to increase the ease of researching several configurations of cache optimization techniques for microprocessors on the same hardware.

Although Chien offers broad areas of research possibilities, it does not provide enabling detail for any of them. Chien was written “in the early stages of design, [such that] many of the detailed design issues [were] intentionally left open.” Id., at 8 (emphasis added). Since the theory of Chien cannot be reproduced without undue experimentation, it is not an enabling prior art reference. Elan Pharm., Inc. v. Mayo Found. for Med. Educ. & Research, 346 F.3d 1051, 1054, 68 USPQ2d 1373, 1376 (Fed. Cir. 2003); Acme Scale Co. v. LTS Scale Co., LLC, 615 F. App’x 673, 678 (Fed. Cir. 2015); see also Ex. 2028, ¶ 39.

Most relevant to the technology of the ’867 patent is Chien’s discussion of application-driven customizability, specifically regarding “optimizing cache granularity for performance.” Ex. 1005, at 13. However, Chien’s proposed cache optimizations differ significantly from those of the ’867 patent. Specifically, the MORPH concept is limited to proposing changes in cache sizes (L1, L2) or expanding victim cache buffers. Id., at 13. Once reconfigured to the optimal size for the application, Chien’s cache works in a conventional way. Id. When new data is required in the cache, the whole cache row is read. Id. Chien does not discuss retrieving only computational data required by the particular application, nor adapting a memory to the data requirements of the algorithm. The institution decision relies on Zhang’s disclosure that it “aims to send only used fields of matrix elements during a given computation” and that “transferring all data (e.g., non-zero matrix elements) because all data is needed or required effectively retrieves only computational data required by the algorithm.” Paper 13 at 56, 57. However, Dr. Mangione-Smith confirms that even the optimized storage scheme does not disclose loading “only data required for the algorithm” because the very last cache row is highly unlikely to match the exact size of the data, yet is always transferred as a whole. Ex. 2028, ¶ 74. The ’867 patent, on the other hand, teaches reconfiguring the first memory itself to “conform to the needs of the algorithm.” Ex. 1001, 12:51-52.
B. Zhang (Ex. 1003)

Zhang builds on Chien’s theoretical discussion of integrating small blocks of programmable logic into key elements of the MORPH architecture and provides a concept drawing for his general idea. Ex. 1003, pgs. 13-14. Like Chien, Zhang is not an enabling disclosure because its simulation is not sufficient to enable one of ordinary skill in the art to build a working hardware system without undue experimentation. Ex. 2028, ¶ 62. Zhang uses programmable logic (an FPGA) as a means to deliver data for use by the CPU. The goal of the “small blocks of programmable logic implemented into key elements of a baseline architecture” is movement of data between memory hierarchies to reduce latency at the point of data consumption. Unlike in the ’867 patent, however, the final consumer is a conventional CPU, not a reconfigurable processor. Zhang explicitly mentions that the main application remains in software (executed on the CPU) and programmable logic is used only for hardware adaptations (which remain static for the duration of a specific application run), thus teaching away from the invention of the ’867 patent. Ex. 1003, at 14. The institution decision argues that Zhang merely does not disclose the invention, as opposed to teaching away from it. Paper 13 at 45. However, Dr. Mangione-Smith explains and confirms that Zhang explicitly teaches away from implementing any part of the application itself, or any of the algorithms that comprise the application, in hardware. Ex. 2028, ¶ 70.

Zhang’s examples of prefetching technology are exactly what is described in the background of the ’867 patent: “Newer cache designs reduce the frequency of cache misses by trying to predict in advance the data that the microprocessor will request.” Ex. 1001, 3:16-18. The architectural hardware adaptations made possible by incorporating only the small amount of programmable logic in the cache management enable easier switching between known cache management techniques and related broadening of research possibilities with a single researc
