`
`Reference 25
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2137, p. 1
`
`
`
`RECONFIGURABLE COMPUTING
`THE THEORY AND PRACTICE OF FPGA-BASED COMPUTATION
`
`SYSTEMS ON SILICON
`
`The Morgan Kaufmann Series in Systems on Silicon
`Series Editor: Wayne Wolf, Georgia Institute of Technology
`
`The Designer's Guide to VHDL, Second Edition
`Peter J. Ashenden
`
`The System Designer's Guide to VHDL-AMS
`Peter J. Ashenden, Gregory D. Peterson, and Darrell A. Teegarden
`
`Modeling Embedded Systems and SoCs
`Axel Jantsch
`
`ASIC and FPGA Verification: A Guide to Component Modeling
`Richard Munden
`
`Multiprocessor Systems-on-Chips
`Edited by Ahmed Amine Jerraya and Wayne Wolf
`
`Functional Verification
`Bruce Wile, John Goss, and Wolfgang Roesner
`
`Customizable and Configurable Embedded Processors
`Edited by Paolo Ienne and Rainer Leupers
`
`Networks-on-Chips: Technology and Tools
`Edited by Giovanni De Micheli and Luca Benini
`
`VLSI Test Principles & Architectures
`Edited by Laung-Terng Wang, Cheng-Wen Wu, and Xiaoqing Wen
`
`Designing SoCs with Configured Processors
`Steve Leibson
`
`ESL Design and Verification
`Grant Martin, Andrew Piziali, and Brian Bailey
`
`Aspect-Oriented Programming with e
`David Robinson
`
`Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
`Edited by Scott Hauck and Andre DeHon
`
`Coming Soon . . .
`
`System-on-Chip Test Architectures
`Edited by Laung-Terng Wang, Charles Stroud, and Nur Touba
`
`Verification Techniques for System-Level Design
`Masahiro Fujita, Indradeep Ghosh, and Mukul Prasad
`
`
`RECONFIGURABLE
`COMPUTING
`THE THEORY AND PRACTICE
`OF FPGA-BASED COMPUTATION
`
`Edited by
`
`Scott Hauck and Andre DeHon
`
`AMSTERDAM • BOSTON • HEIDELBERG • LONDON
`NEW DELHI • NEW YORK • OXFORD • PARIS • SAN DIEGO
`SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
`
`ELSEVIER
`Morgan Kaufmann Publishers is an imprint of Elsevier
`
`Reconfigurable Computing
`Hauck and DeHon
`
`MORGAN KAUFMANN PUBLISHERS
`An imprint of Elsevier
`30 Corporate Drive, Suite 400, Burlington, MA 01803-4255
`
`Copyright © 2008 by Elsevier Inc.
`
`Original ISBN: 978-0-12-370522-8
`All rights reserved.
`
`No part of this publication may be reproduced or transmitted in any form or
`by any means, electronic or mechanical, including photocopy, recording, or
`any information storage and retrieval system, without permission in writing
`from the publisher.
`
`First Printed in India 2011
`
`Indian Reprint ISBN: 978-93-80931-86-9
`
`This edition has been authorized by Elsevier for sale in the following countries:
`India, Pakistan, Nepal, Sri Lanka and Bangladesh. Sale and purchase of this book
`outside these countries is not authorized and is illegal.
`
`Published by Elsevier, a division of Reed Elsevier India Private Limited.
`
`Registered Office: 622, Indraprakash Building, 21 Barakhamba Road,
`New Delhi-110 001.
`Corporate Office: 14th floor, Building No. 10B, DLF Cyber City Phase-II, Gurgaon-
`122 002, Haryana, India.
`
`Printed and bound in India by Sanat Printers, Kundli-131 028
`
`List of Contributors
`
`Thomas W. Fry, Samsung, Global Strategy Group, Seoul, South Korea
`(Chapter 27)
`Maya B. Gokhale, Lawrence Livermore National Laboratory, Livermore,
`California (Chapter 10)
`Steven A. Guccione, Cmpware, Inc., Austin, Texas (Chapters 3 and 19)
`Scott Hauck, Department of Electrical Engineering, University of Washington,
`Seattle, Washington (Chapters 20 and 27)
`K. Scott Hemmert, Computation, Computers, Information and Mathematics
`Center, Sandia National Laboratories, Albuquerque, New Mexico
`(Chapter 31)
`Randy Huang, Tabula, Inc., Santa Clara, California (Chapter 9)
`Brad L. Hutchings, Department of Electrical and Computer Engineering,
`Brigham Young University, Provo, Utah (Chapters 12 and 21)
`Nachiket Kapre, Department of Computer Science, California Institute of
`Technology, Pasadena, California (Chapter 6)
`Andreas Koch, Department of Computer Science, Embedded Systems and
`Applications Group, Technische Universität Darmstadt, Darmstadt,
`Germany (Chapter 15)
`Miriam Leeser, Department of Electrical and Computer Engineering,
`Northeastern University, Boston, Massachusetts (Chapter 32)
`John W. Lockwood, Department of Computer Science and Engineering,
`Washington University in St. Louis, St. Louis, Missouri; and Department
`of Electrical Engineering, Stanford University, Stanford, California
`(Chapter 34)
`Wayne Luk, Department of Computing, Imperial College, London,
`United Kingdom (Chapter 22)
`Sharad Malik, Department of Electrical Engineering, Princeton University,
`Princeton, New Jersey (Chapter 29)
`Yury Markovskiy, Department of Electrical Engineering and Computer
`Sciences, University of California-Berkeley, Berkeley, California (Chapter 9)
`Margaret Martonosi, Department of Electrical Engineering, Princeton
`University, Princeton, New Jersey (Chapter 29)
`Larry McMurchie, Synplicity Corporation, Sunnyvale, California (Chapter 17)
`Brent E. Nelson, Department of Electrical and Computer Engineering,
`Brigham Young University, Provo, Utah (Chapters 12 and 21)
`Peichen Pan, Magma Design Automation, Inc., San Jose, California
`(Chapter 13)
`Oliver Pell, Department of Computing, Imperial College, London, United
`Kingdom (Chapter 22)
`Stylianos Perissakis, Department of Electrical Engineering and Computer
`Sciences, University of California-Berkeley, Berkeley, California (Chapter 9)
`
`Laura Pozzi, Faculty of Informatics, University of Lugano, Lugano,
`Switzerland (Chapter 9)
`Brian C. Richards, Department of Electrical Engineering and Computer
`Sciences, University of California-Berkeley, Berkeley, California (Chapter 8)
`Eduardo Sanchez, School of Computer and Communication Sciences, École
`Polytechnique Fédérale de Lausanne; and Reconfigurable and Embedded
`Digital Systems Institute, Haute École d'Ingénierie et de Gestion du Canton
`de Vaud, Lausanne, Switzerland (Chapter 33)
`Lesley Shannon, School of Engineering Science, Simon Fraser University,
`Burnaby, BC, Canada (Chapter 2)
`Satnam Singh, Programming Principles and Tools Group, Microsoft Research,
`Cambridge, United Kingdom (Chapter 16)
`Greg Stitt, Department of Computer Science and Engineering, University of
`California-Riverside, Riverside, California (Chapter 26)
`Russell Tessier, Department of Computer and Electrical Engineering,
`University of Massachusetts, Amherst, Massachusetts (Chapter 30)
`Keith D. Underwood, Computation, Computers, Information and
`Mathematics Center, Sandia National Laboratories, Albuquerque, New
`Mexico (Chapter 31)
`Andres Upegui, Logic Systems Laboratory, School of Computer and
`Communication Sciences, École Polytechnique Fédérale de Lausanne,
`Lausanne, Switzerland (Chapter 33)
`Frank Vahid, Department of Computer Science and Engineering, University of
`California-Riverside, Riverside, California (Chapter 26)
`John Wawrzynek, Department of Electrical Engineering and Computer
`Sciences, University of California-Berkeley, Berkeley, California (Chapters 8
`and 9)
`Nicholas Weaver, International Computer Science Institute, Berkeley,
`California (Chapter 18)
`Joseph Yeh, Lincoln Laboratory, Massachusetts Institute of Technology,
`Lexington, Massachusetts (Chapter 9)
`Peixin Zhong, Department of Electrical and Computer Engineering, Michigan
`State University, East Lansing, Michigan (Chapter 29)
`
`PREFACE
`
`In the two decades since field-programmable gate arrays (FPGAs) were
`introduced, they have radically changed the way digital logic is designed and
`deployed. By marrying the high performance of application-specific integrated
`circuits (ASICs) and the flexibility of microprocessors, FPGAs have made
`possible entirely new types of applications. This has helped FPGAs supplant both
`ASICs and digital signal processors (DSPs) in some traditional roles.
`To make the most of this unique combination of performance and flexibility,
`designers need to be aware of both hardware and software issues. Thus, an
`FPGA user must think not only about the gates needed to perform a computation
`but also about the software flow that supports the design process. The goal of
`this book is to help designers become comfortable with these issues, and thus
`be able to exploit the vast opportunities possible with reconfigurable logic.
`We have written Reconfigurable Computing as a tutorial and as a reference
`on the wide range of concepts that designers must understand to make the best
`use of FPGAs and related reconfigurable chips (including FPGA architectures,
`FPGA logic applications, and FPGA CAD tools) and the skills they must have
`for optimizing a computation. It is targeted particularly toward those who do
`not view FPGAs merely as cheap, slow ASIC gates or as a means of prototyping
`before the "real" hardware is created, but who are interested in evaluating or
`embracing the substantial advantages reprogrammable devices offer over other
`technologies.
`However, readers who focus primarily on ASIC- or CPU-based implementations
`will learn how FPGAs can be a useful addition to their normal skill set. For
`some traditional designers this book may even serve as an entry point into a
`completely new way of handling their design problems.
`Because we focus on both hardware and software systems, we expect readers
`to have a certain level of familiarity with each technology. On the hardware side,
`we assume that readers have a basic knowledge of digital logic design,
`including understanding concepts such as gates (including multiplexers, flip-flops,
`and RAM), binary number systems, and simple logic optimization. Knowledge
`of hardware description languages, such as Verilog or VHDL, is also helpful.
`We also assume that readers have basic knowledge of computer programming,
`including simple data structures and algorithms. In sum, this book is
`appropriate for most readers with a background in electrical engineering, computer
`science, or computer engineering. It can also be used as a text in an upper-level
`undergraduate or introductory graduate course within any of these disciplines.
`No one book can hope to cover every possible aspect of FPGAs exhaustively.
`Entire books could be (and have been) written about each of the concepts that
`are discussed in the individual chapters here. Our goal is to provide a good
`working knowledge of these concepts, as well as abundant references for those
`who wish to dig deeper.
`
`Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
`is divided into six major parts: hardware, programming, compilation/mapping,
`application development, case studies, and future trends. Once the
`introduction has been read, the parts can be covered in any order. Alternatively,
`readers can pick and choose which parts they wish to cover. For example, a
`reader who wants to focus on CAD for FPGAs might skip hardware and
`application development, while a reader who is interested mostly in the use of FPGAs
`might focus primarily on application development.
`Part V is made up of self-contained overviews of specific, important
`applications, which can be covered in any order or can be sprinkled throughout a
`course syllabus. The part introduction lists the chapters and concepts relevant
`to each case study and so can be used as a guide for the reader or instructor in
`selecting relevant examples.
`One final consideration is an explanation of how this book was written.
`Some books are created by a single author or a set of coauthors who must
`stretch to cover all aspects of a given topic. Alternatively, an edited text can
`bring together contributors from each of the topic areas, typically by bundling
`together standalone research papers. Our book is a bit of a hybrid. It was
`constructed from an overall outline developed by the primary authors, Scott Hauck
`and Andre DeHon. The chapters on the chosen topics were then written by noted
`experts in these areas, and were carefully edited to ensure their integration into
`a cohesive whole. Our hope is that this brings the benefits of both styles of
`traditional texts, with the reader learning from the main experts on each topic
`while still getting a well-integrated text.
`
`Acknowledgments
`While Scott and Andre handled the technical editing, this book also benefited
`from the careful help of the team at Elsevier/Morgan Kaufmann. Wayne Wolf
`first proposed the concept of this book to us. Chuck Glaser, ably assisted by
`Michele Cronin and Matthew Cater, was instrumental in resurrecting the project
`after it had languished in the concept stage for several years and in pushing it
`through to completion. Just as important were the efforts of the production
`group at Elsevier/Morgan Kaufmann who did an excellent job of copyediting,
`proofreading, integrating text and graphics, laying out, and all the hundreds
`of little details crucial to bringing a book together into a polished whole. This
`was especially true for a book like this, with such a large list of contributors.
`Specifically, Marilyn E. Rash helped drive the whole production process and
`was supported by Dianne Wood, Jodie Allen, and Steve Rath. Without their help
`there is no way this monumental task ever would have been finished. A big thank
`you to all.
`
`Scott Hauck
`Andre DeHon
`
`INTRODUCTION
`
`In the computer and electronics world, we are used to two different ways of
`performing computation: hardware and software. Computer hardware, such
`as application-specific integrated circuits (ASICs), provides highly optimized
`resources for quickly performing critical tasks, but it is permanently configured
`to only one application via a multimillion-dollar design and fabrication effort.
`Computer software provides the flexibility to change applications and perform
`a huge number of different tasks, but is orders of magnitude worse than ASIC
`implementations in terms of performance, silicon area efficiency, and power
`usage.
`Field-programmable gate arrays (FPGAs) are truly revolutionary devices that
`blend the benefits of both hardware and software. They implement circuits
`just like hardware, providing huge power, area, and performance benefits over
`software, yet can be reprogrammed cheaply and easily to implement a wide
`range of tasks. Just like computer hardware, FPGAs implement computations
`spatially, simultaneously computing millions of operations in resources
`distributed across a silicon chip. Such systems can be hundreds of times faster
`than microprocessor-based designs. However, unlike in ASICs, these
`computations are programmed into the chip, not permanently frozen by the
`manufacturing process. This means that an FPGA-based system can be programmed and
`reprogrammed many times.
`Sometimes reprogramming is merely a bug fix to correct faulty behavior, or
`it is used to add a new feature. Other times, it may be carried out to reconfigure
`a generic computation engine for a new task, or even to reconfigure a device
`during operation to allow a single piece of silicon to simultaneously do the work
`of numerous special-purpose chips.
`However, merging the benefits of both hardware and software does come at a
`price. FPGAs provide nearly all of the benefits of software flexibility and
`development models, and nearly all of the benefits of hardware efficiency, but not
`quite. Compared to a microprocessor, these devices are typically several orders
`of magnitude faster and more power efficient, but creating efficient programs for
`them is more complex. Typically, FPGAs are useful only for operations that
`process large streams of data, such as signal processing, networking, and the like.
`Compared to ASICs, they may be 5 to 25 times worse in terms of area, delay,
`and performance. However, while an ASIC design may take months to years to
`develop and have a multimillion-dollar price tag, an FPGA design might only
`take days to create and cost tens to hundreds of dollars. For systems that do
`not require the absolute highest achievable performance or power efficiency,
`the development simplicity of FPGAs and the ability to easily fix bugs and upgrade
`functionality make them a compelling design alternative. For many tasks, and
`particularly for beginning electronics designers, FPGAs are the ideal choice.
`
`FIGURE I.1 ■ An abstract view of an FPGA; logic cells are embedded in a general routing
`structure.
`
`Figure I.1 illustrates the internal workings of a field-programmable gate array,
`which is made up of logic blocks embedded in a general routing structure. This
`array of logic gates is the G and A in FPGA. The logic blocks contain
`processing elements for performing simple combinational logic, as well as flip-flops
`for implementing sequential logic. Because the logic units are often just
`simple memories, any Boolean combinational function of perhaps five or six inputs
`can be implemented in each logic block. The general routing structure allows
`arbitrary wiring, so the logical elements can be connected in the desired manner.
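
To make the idea of a logic block as a small memory concrete, the following Python sketch models a k-input lookup table (LUT). It is an illustration only: the class name `Lut`, the helper `program_lut`, and the bit ordering (first input as the most significant address bit) are assumptions made for this example and do not describe any particular device.

```python
from itertools import product


class Lut:
    """A k-input lookup table: 2**k stored bits, one per input combination."""

    def __init__(self, k, config_bits):
        if len(config_bits) != 2 ** k:
            raise ValueError("a k-input LUT needs exactly 2**k configuration bits")
        self.k = k
        self.config_bits = list(config_bits)

    def evaluate(self, inputs):
        if len(inputs) != self.k:
            raise ValueError("wrong number of inputs")
        # The inputs form an address into the small memory
        # (assumed ordering: first input is the most significant bit).
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.config_bits[address]


def program_lut(k, func):
    """Fill the LUT memory by evaluating func on every input combination."""
    bits = [int(bool(func(*combo))) for combo in product((0, 1), repeat=k)]
    return Lut(k, bits)


# The same structure implements different functions; only the stored bits differ.
xor3 = program_lut(3, lambda a, b, c: a ^ b ^ c)
majority3 = program_lut(3, lambda a, b, c: (a + b + c) >= 2)
```

Evaluating a function is a single memory read, which is why any Boolean function of k inputs fits in one such block: changing the function means rewriting the 2**k stored bits, not the structure around them.
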
`Because of this generality and flexibility, an FPGA can implement very
`complex circuits. Current devices can compute functions on the order of millions
`of basic gates, running at speeds in the hundreds of megahertz. To boost speed
`and capacity, additional, special elements can be embedded into the array, such
`as large memories, multipliers, fast-carry logic for arithmetic and logic
`functions, and even complete microprocessors. With these predefined, fixed-logic
`units, which are fabricated into the silicon, FPGAs are capable of implementing
`complete systems in a single programmable device.
`The logic and routing elements in an FPGA are controlled by programming
`points, which may be based on antifuse, Flash, or SRAM technology. For
`reconfigurable computing, SRAM-based FPGAs are the preferred option, and in fact
`are the primary style of FPGA devices in the electronics industry as a whole.
`In these devices, every routing choice and every logic function is controlled by
`a simple memory bit. With all of its memory bits programmed, by way of a
`configuration file or bitstream, an FPGA can be configured to implement the
`user's desired function. Thus, the configuration can be carried out quickly and
`without permanent fabrication steps, allowing customization at the user's
`electronics bench, or even in the final end product. This is why FPGAs are field
`programmable, and why they differ from mask-programmable devices, which
`have their functionality fixed by masks during fabrication.
`
`FIGURE I.2 ■ A typical FPGA mapping flow. (Stages shown: Source Code, Logic
`Synthesis, Technology Mapping, Placement, Routing, Bitstream Generation, Bitstream.)
`
`Because customizing an FPGA merely involves storing values to memory
`locations, similarly to compiling and then loading a program onto a computer, the
`creation of an FPGA-based circuit is a simple process of creating a bitstream to
`load into the device (see Figure I.2). Although there are tools to do this from
`software languages, schematics, and other formats, FPGA designers typically start
`with an application written in a hardware description language (HDL) such as
`Verilog or VHDL. This abstract design is optimized to fit into the FPGA's
`available logic through a series of steps: Logic synthesis converts high-level logic
`constructs and behavioral code into logic gates, followed by technology mapping to
`separate the gates into groupings that best match the FPGA's logic resources.
`Next, placement assigns the logic groupings to specific logic blocks and routing
`determines the interconnect resources that will carry the user's signals. Finally,
`bitstream generation creates a binary file that sets all of the FPGA's
`programming points to configure the logic blocks and routing resources appropriately.
`After a design has been compiled, we can program the FPGA to perform a
`specified computation simply by loading the bitstream into it. Typically either a
`host microprocessor/microcontroller downloads the bitstream to the device, or
`an EPROM programmed with the bitstream is connected to the FPGA's
`configuration port. Either way, the appropriate bitstream must be loaded every time the
`FPGA is powered up, as well as any time the user wants to change the circuitry
`when it is running. Once the FPGA is configured, it operates as a custom piece
`of digital logic.
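
The last two steps described above, generating a bitstream and loading it to configure the device, can be sketched with a deliberately tiny model. Everything here is hypothetical and chosen for illustration: the device size, the bit layout (a truth table followed by routing selects for each logic block), and the names `generate_bitstream` and `ToyFpga` are assumptions, not a real vendor format or tool.

```python
K = 2          # inputs per logic block (toy value)
N_INPUTS = 2   # primary input pins
N_LUTS = 2     # logic blocks in the toy device
SEL_WIDTH = 2  # bits per routing select; sources 0-1 are pins, 2-3 are LUT outputs


def to_bits(value, width):
    """Encode an integer as a list of bits, most significant first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def from_bits(bits):
    """Decode a list of bits (most significant first) into an integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def generate_bitstream(design):
    """design: one (truth_table, source_selects) pair per LUT -> flat bit list."""
    bits = []
    for truth_table, selects in design:
        bits += truth_table
        for s in selects:
            bits += to_bits(s, SEL_WIDTH)
    return bits


class ToyFpga:
    def __init__(self):
        self.luts = []

    def load(self, bitstream):
        """Configure the device: every logic and routing choice is a stored bit."""
        per_lut = 2 ** K + K * SEL_WIDTH
        if len(bitstream) != N_LUTS * per_lut:
            raise ValueError("bitstream length does not match this device")
        self.luts = []
        for i in range(N_LUTS):
            chunk = bitstream[i * per_lut:(i + 1) * per_lut]
            table, sel_bits = chunk[:2 ** K], chunk[2 ** K:]
            selects = [from_bits(sel_bits[j * SEL_WIDTH:(j + 1) * SEL_WIDTH])
                       for j in range(K)]
            self.luts.append((table, selects))

    def run(self, pins):
        signals = list(pins) + [0] * N_LUTS
        # LUTs are evaluated in index order, so only feed-forward wiring works here.
        for i, (table, selects) in enumerate(self.luts):
            address = from_bits([signals[s] for s in selects])
            signals[N_INPUTS + i] = table[address]
        return signals[N_INPUTS:]


and_nand = [([0, 0, 0, 1], [0, 1]),   # LUT0 = pin0 AND pin1
            ([1, 0, 0, 0], [2, 2])]   # LUT1 = NOT LUT0 (both inputs wired to LUT0)
xor_or = [([0, 1, 1, 0], [0, 1]),     # LUT0 = pin0 XOR pin1
          ([0, 1, 1, 1], [0, 2])]     # LUT1 = pin0 OR LUT0

fpga = ToyFpga()
fpga.load(generate_bitstream(and_nand))
first = fpga.run((1, 1))               # [1, 0]
fpga.load(generate_bitstream(xor_or))  # same device model, new function
second = fpga.run((1, 1))              # [0, 1]
```

Loading a different bitstream changes what the same device model computes, which mirrors the reprogramming described in the text. A real flow would derive the truth tables and routing selects through synthesis, mapping, placement, and routing rather than writing them by hand.
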
`Because of the FPGA's dual nature, combining the flexibility of software with
`the performance of hardware, an FPGA designer must think differently from
`designers who use other devices. Software developers typically write
`sequential programs that exploit a microprocessor's ability to rapidly step through a
`series of instructions. In contrast, a high-quality FPGA design requires
`thinking about spatial parallelism, that is, simultaneously using multiple resources
`spread across a chip to yield a huge amount of computation.
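
The contrast between sequential and spatial thinking can be illustrated with a small Python model of summing a list of values. This is a sketch under simple assumptions: one addition per time step for the sequential version, and a tree of adders in which every adder on a level operates in the same step for the spatial version. The function names are made up for this example, and Python only simulates the hardware; it does not run anything in parallel.

```python
def sequential_sum(values):
    """One accumulator, one addition per step, as a simple processor loop does."""
    total, steps = 0, 0
    for v in values:
        total += v
        steps += 1
    return total, steps


def adder_tree_sum(values):
    """Model a spatial adder tree; all adders on one level work in the same step."""
    level = list(values)
    if not level:
        return 0, 0, 0
    steps = 0
    adders = 0
    while len(level) > 1:
        pairs = len(level) // 2
        next_level = [level[2 * i] + level[2 * i + 1] for i in range(pairs)]
        if len(level) % 2:
            next_level.append(level[-1])  # odd element passes through unchanged
        adders += pairs                   # each pair needs its own physical adder
        level = next_level
        steps += 1
    return level[0], steps, adders


data = list(range(16))
seq_result = sequential_sum(data)    # (120, 16): 16 steps on one adder
tree_result = adder_tree_sum(data)   # (120, 4, 15): 4 steps on 15 adders
```

The tree finishes in 4 steps instead of 16 but spends 15 adders to do so. Trading chip resources for time in this way is the kind of decision the text asks an FPGA designer to make.
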
`Hardware designers have an advantage because they already think in terms
`of hardware implementations; even so, the flexibility of FPGAs gives them new
`opportunities generally not available in ASICs and other fixed devices.
`Field-programmable gate array designs can be rapidly developed and deployed, and
`even reprogrammed in the field with new functionality. Thus, they do not
`demand the huge design teams and validation efforts required for ASICs. Also,
`the ability to change the configuration, even when the device is running, yields
`new opportunities, such as computations that optimize themselves to specific
`demands on a second-by-second basis, or even time multiplexing a very large
`design onto a much smaller FPGA. However, because FPGAs are noticeably
`slower and have lower capacity than ASICs, designers must carefully optimize
`their design to the target device.
`
`FPGAs are a very flexible medium, with unique opportunities and challenges.
`The goal of Reconfigurable Computing: The Theory and Practice of FPGA-Based
`Computation is to introduce all facets of FPGA-based systems, both positive
`and problematic. It is organized into six major parts:
`
`■ Part I introduces the hardware devices, covering both generic FPGAs
`and those specifically optimized for reconfigurable computing (Chapters 1
`through 4).
`■ Part II focuses on programming reconfigurable computing systems,
`considering both their programming languages and programming models
`(Chapters 5 through 12).
`■ Part III focuses on the software mapping flow for FPGAs, including each
`of the basic CAD steps of Figure I.2 (Chapters 13 through 20).
`■ Part IV is devoted to application design, covering ways to make the most
`efficient use of FPGA logic (Chapters 21 through 26). This part can be
`viewed as a finishing school for FPGA designers because it highlights
`ways in which application development on an FPGA is different from
`both software programming and ASIC design.
`■ Part V is a set of case studies that show complete applications of
`reconfigurable logic (Chapters 27 through 35).
`■ Part VI contains more advanced topics, such as theoretical models and
`metrics for reconfigurable computing, as well as defect and fault tolerance
`and the possible synergies between reconfigurable computing and
`nanotechnology (Chapters 36 through 38).
`
`As the 38 chapters that follow will show, the challenges that FPGAs present
`are significant. However, the effort entailed in surmounting them is far
`outweighed by the unique opportunities these devices offer to the field of computing
`technology.
`
`CONTENTS
`
`List of Contributors
`Preface
`Introduction
`
`Part I: Reconfigurable Computing Hardware
`
`1 Device Architecture
`  1.1 Logic: The Computational Fabric
`    1.1.1 Logic Elements
`    1.1.2 Programmability
`  1.2 The Array and Interconnect
`    1.2.1 Interconnect Structures
`    1.2.2 Programmability
`    1.2.3 Summary
`  1.3 Extending Logic
`    1.3.1 Extended Logic Elements
`    1.3.2 Summary
`  1.4 Configuration
`    1.4.1 SRAM
`    1.4.2 Flash Memory
`    1.4.3 Antifuse
`    1.4.4 Summary
`  1.5 Case Studies
`    1.5.1 Altera Stratix
`    1.5.2 Xilinx Virtex-II Pro
`  1.6 Summary
`  References
`
`2 Reconfigurable Computing Architectures
`  2.1 Reconfigurable Processing Fabric Architectures
`    2.1.1 Fine-grained
`    2.1.2 Coarse-grained
`  2.2 RPF Integration into Traditional Computing Systems
`    2.2.1 Independent Reconfigurable Coprocessor Architectures
`    2.2.2 Processor + RPF Architectures
`  2.3 Summary and Future Work
`  References
`
`3 Reconfigurable Computing Systems
`  3.1 Early Systems
`  3.2 PAM, VCC, and Splash
`    3.2.1 PAM
`    3.2.2 Virtual Computer
`    3.2.3 Splash
`  3.3 Small-scale Reconfigurable Systems
`    3.3.1 PRISM
`    3.3.2 CAL and XC6200
`    3.3.3 Cloning
`  3.4 Circuit Emulation
`    3.4.1 AMD/Intel
`    3.4.2 Virtual Wires
`  3.5 Accelerating Technology
`    3.5.1 Teramac
`  3.6 Reconfigurable Supercomputing
`    3.6.1 Cray, SRC, and Silicon Graphics
`    3.6.2 The CMX-2X
`  3.7 Non-FPGA Research
`  3.8 Other System Issues
`  3.9 The Future of Reconfigurable Systems
`  References
`
`4 Reconfiguration Management
`  4.1 Reconfiguration
`  4.2 Configuration Architectures
`    4.2.1 Single-context
`    4.2.2 Multi-context
`    4.2.3 Partially Reconfigurable
`    4.2.4 Relocation and Defragmentation
`    4.2.5 Pipeline Reconfigurable
`    4.2.6 Block Reconfigurable
`    4.2.7 Summary
`  4.3 Managing the Reconfiguration Process
`    4.3.1 Configuration Grouping
`    4.3.2 Configuration Caching
`    4.3.3 Configuration Scheduling
`    4.3.4 Software-based Relocation and Defragmentation
`    4.3.5 Context Switching
`  4.4 Reducing Configuration Transfer Time
`    4.4.1 Architectural Approaches
`    4.4.2 Configuration Compression
`    4.4.3 Configuration Data Reuse
`  4.5 Configuration Security
`  4.6 Summary
`  References
`
`Part II: Programming Reconfigurable Systems
`
`5 Compute Models and System Architectures
`  5.1 Compute Models
`    5.1.1 Challenges
`    5.1.2 Common Primitives
`    5.1.3 Dataflow
`    5.1.4 Sequential Control
`    5.1.5 Data Parallel
`    5.1.6 Data-centric
`    5.1.7 Multi-threaded
`    5.1.8 Other Compute Models
`  5.2 System Architectures
`    5.2.1 Streaming Dataflow
`    5.2.2 Sequential Control
`    5.2.3 Bulk Synchronous Parallelism
`    5.2.4 Data Parallel
`    5.2.5 Cellular Automata
`    5.2.6 Multi-threaded
`    5.2.7 Hierarchical Composition
`Refe