opposite is occurring—the template is hard-wired into the FPGA while the image pixels are clocked past it.
Another important opportunity for increased efficiency lies in the potential to combine multiple templates on a single FPGA. The simplest way to do this is to spatially partition the FPGA into several smaller blocks, each of which handles the logic for a single template. Alternatively, one can seek to identify templates having some topological commonality, and which can therefore share parts of adder trees. This is illustrated in Fig. 11, which shows two templates that share several pixels in common, and which can be mapped using a set of adder trees that leverage this overlap. The advantage of using FPGAs is that FPGAs can be dynamically optimized at the gate level to exploit template characteristics. A general-purpose correlator would have to provide large general-purpose adder trees to handle the summing of all possible template bits. The FPGA, however, exploits the sparse nature of the templates, and only constructs the small adder trees required. FPGAs can exploit other factors such as collapsing adder trees with common elements, and storing pixels that are not needed by the adder trees using RAM-based shift registers.

IEEE SIGNAL PROCESSING MAGAZINE, SEPTEMBER 1998

▲ 10. Example binary template with five "on" pixels (top) and corresponding adder tree (bottom).

▲ 11. Template commonalities are exploited to reduce hardware requirements for computing multiple correlations.
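The sparse, shared adder-tree idea can be mimicked in software. The sketch below is hypothetical code, not the hardware mapping itself: it sums only the "on" pixels of each binary template, and computes the contribution of pixels shared by two templates only once, the software analogue of a shared adder tree.

```python
# Sketch: sparse binary-template correlation with shared partial sums.
# Only "on" pixels contribute, and pixels common to both templates are
# summed once -- mirroring collapsed/shared adder trees in the FPGA.

def on_pixels(template):
    """Return the set of (row, col) offsets where the binary template is 1."""
    return {(r, c) for r, row in enumerate(template)
            for c, v in enumerate(row) if v}

def correlate_pair(image, t1, t2, y, x):
    """Evaluate two sparse templates at image offset (y, x), computing the
    contribution of their shared pixels only once."""
    p1, p2 = on_pixels(t1), on_pixels(t2)
    shared = p1 & p2
    shared_sum = sum(image[y + r][x + c] for r, c in shared)
    s1 = shared_sum + sum(image[y + r][x + c] for r, c in p1 - shared)
    s2 = shared_sum + sum(image[y + r][x + c] for r, c in p2 - shared)
    return s1, s2

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
t1 = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]   # on-pixels: (0,0), (0,1), (1,1)
t2 = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]   # shares (0,1), (1,1) with t1
print(correlate_pair(image, t1, t2, 0, 0))
```

In hardware the shared partial sum corresponds to an adder subtree used by both template correlators; in software it is simply a common subexpression.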
Table 2 illustrates the FPGA resource trade-offs involved in template mapping. The table gives the FPGA utilization for the Xilinx 4062 when four through seven template pairs are simultaneously mapped into the FPGA using the approach described above. Each template pair consists of two 32 x 32 binary images and is represented in the hardware using two template-specific adder trees. The number of templates per second that can be evaluated using this approach is a function of many factors including the clock rate, the FPGA configuration time, the number of templates per configuration, the candidate image and target sizes, the number of clock cycles needed to evaluate the templates at each relative image/template offset, and I/O considerations. The performance can be upper bounded by assuming that the I/O is fully efficient; i.e., that the FPGA is always either computing correlations or being reconfigured. Assuming efficient I/O is fairly reasonable: in the prototype systems we have constructed, we have been able to avoid letting the FPGA sit idle by using scaled-down versions of the templates. When all of these factors are considered together, we find that configuration can consume more time than computation; i.e., there is a significant performance penalty due to reconfiguration. This overhead will diminish to 10% or less when partially reconfigurable FPGAs become more widely available. However, for parts that are not partially reconfigurable, the benefits of increased computation power offered by larger FPGAs are to some extent mitigated by the larger configuration bitstreams and longer reconfiguration times that these parts require.
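Under the fully-efficient-I/O assumption, the throughput upper bound is simple arithmetic. The sketch below uses illustrative parameter values (the clock rate, configuration time, and template count are assumptions for the example, not measurements from Table 2):

```python
# Back-of-the-envelope bound on templates/second under the assumption that
# the FPGA is always either computing correlations or being reconfigured.
# All parameter values below are illustrative, not measured.

def templates_per_second(clock_hz, config_time_s, templates_per_config,
                         image_w, image_h, tmpl_w, tmpl_h, cycles_per_offset):
    # Number of relative image/template offsets evaluated per configuration.
    offsets = (image_w - tmpl_w + 1) * (image_h - tmpl_h + 1)
    compute_time = offsets * cycles_per_offset / clock_hz
    # Each configuration evaluates its templates at every offset, then the
    # FPGA must be reconfigured before the next batch of templates.
    total_time = compute_time + config_time_s
    return templates_per_config / total_time, compute_time, config_time_s

rate, t_comp, t_cfg = templates_per_second(
    clock_hz=25e6, config_time_s=50e-3, templates_per_config=12,
    image_w=128, image_h=128, tmpl_w=32, tmpl_h=32, cycles_per_offset=1)
print(f"{rate:.0f} templates/s (compute {t_comp*1e3:.2f} ms, config {t_cfg*1e3:.0f} ms)")
```

With these illustrative numbers the configuration time dwarfs the compute time, which is exactly the reconfiguration penalty described in the text.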
`
Figure 12 shows a configurable computing board that was constructed at UCLA as a prototype for the template-matching problem. The board contains a "dynamic" FPGA that is used for template correlations and is run-time reconfigured, a "static" FPGA for control, SRAM for storage of pixels and results, EPROM for configuration bitstream storage, and an interface to an i960 embedded processor for more advanced configuration control.
`
Ongoing Research

Configurable computing has grown from a field with a handful of researchers in 1989 [45] to one that now receives the attention of hundreds of researchers and engineers in academia, industry, defense, and a rapidly increasing number of start-up companies. In this section we identify some of the open issues in this field and describe selected recent and ongoing projects that aim to address them.
`
One of the most interesting questions in configurable computing concerns the extent to which current FPGA device and machine architectures should be altered to better support computing as opposed to the prototyping that drove much of the early evolution of FPGAs. Academic researchers pursuing this question face the obvious challenge of being unable to fully exploit the existing infrastructure of commercial FPGAs and design tools, and typically design custom FPGAs to validate their architecture proposals. Various projects are underway, each attacking one or more of the well-known weaknesses of commercial FPGAs. For example, some researchers are
`
`
`
`
`
`
`
`
investigating architectures that are based on relatively wide 16-bit datapaths (as opposed to the 1-bit datapaths found in today's FPGAs). While less flexible than FPGAs, these computing devices are much more efficient in silicon terms and achieve higher arithmetic performance on 16-bit integer data. Other researchers are investigating novel configuration approaches that either reduce configuration time through context switching or that distribute configuration data with the data to be processed. Still other researchers are merging general-purpose processors and FPGA resources on the same die in an attempt to combine the best features of both technologies.
Peter Athanas' group at Virginia Tech is exploring 16-bit computing devices based on the "wormhole" technique: a computing approach that distributes configuration data with the data to be processed [33]. Consisting of a single multiplier and a 4 x 4 array of 16-bit arithmetic logic units (ALUs) interconnected by a crossbar, their COLT device combines configuration data and data into a single packet. Resembling dataflow computing in many aspects, configuration data in one packet are used to route data through the array and to configure ALUs for subsequent processing. COLT has been fabricated and is currently being tested.
Carl Ebeling's group at the University of Washington is working on RAPID, another device based on 16-bit datapaths [14]. A RAPID array consists of a mostly linear array of RAPID cells, each cell consisting of an integer multiplier, three integer ALUs, six registers, and three small memories. RAPID is primarily statically configured but uses limited dynamic control to provide run-time flexibility.
Matrix, developed by Andre DeHon and others at MIT, is based on a cell that can serve as an instruction store, a memory element, or a computational element. All datapaths are 8-bit, and these cells are interconnected with multilevel interconnect that can be used both for data and instruction distribution. Matrix is currently undergoing commercial development by a new startup company, Silicon Spice. DeHon has also conducted an in-depth study that sets FPGA-based computing in a general-purpose computing context and has suggested several machine architectures [12].
FPGA vendors are also pursuing their own research and development projects as well as more aggressive fabrication processes. For example, over the next two years, devices using supply voltages of 2.5 Volts and below will become common. In addition, the technology lag of FPGAs with respect to ASICs in terms of feature size and number of metal layers is rapidly shrinking, with 3-5-layer FPGAs fabricated using sub-0.3-micron technology expected to become common. The vendors are also likely to both introduce and adopt architectural innovations that have shown promise in academic research. Since future FPGAs will track ASIC technology more closely and will benefit from a richer set of architectural features, they are likely to compare more favorably with ASICs for many applications than do those of today.
The BRASS project at U.C. Berkeley under John Wawrzynek [22] is developing a single chip (Garp) that incorporates a MIPS-II processor and an FPGA core whose elements roughly correspond to those found in the Xilinx 4000 series. The BRASS researchers have modified the MIPS-II processor, replacing the floating-point unit with an FPGA core of their own design, and have augmented the instruction set to include operations that manage the FPGA resources. Their goal is to execute data-intensive operations on the FPGA core and leave general-purpose operations on the processor. A related effort at National Semiconductor Corporation is building a device that will combine a programmable processor and a fine-grained FPGA on the same chip [13].
Other researchers are investigating solutions for uniting configurable computing elements with more traditional processors. For example, Jan Rabaey of U.C. Berkeley has examined the allocation of tasks in typical digital signal processing and has proposed a multigranularity architecture that allows computations to be directed to the hardware that best supports them [34]. Rabaey is also investigating strategies for low-power FPGAs. Though some power reduction will occur automatically due to technology changes, there is substantial opportunity to
`
`
`
`
redesign the logical units in FPGAs with power as a principal constraint.

Several groups are looking at FPGAs that have multiple configurations, or contexts, stored on-chip simultaneously. At any given time one context is active and the others are stored in inactive planes. Contexts can be swapped extremely quickly—requiring from one to several hundred clock cycles to complete—potentially eliminating much of the overhead involved in loading configuration bitstreams from off-chip. Of course, context switching involves other overheads such as the resources needed to hold multiple contexts on-chip, and the hardware and tools to manage context switching. The earliest work on context-switched FPGAs was done at Xilinx beginning in 1991, though it remained proprietary until very recently [42]. In the academic community context switching was studied by Tom Knight, DeHon, and their colleagues at MIT [11, 41].

Work to develop new configurable computing devices also benefits from an understanding of how algorithms map into the range of architectures represented by today's FPGAs and FPGA systems. Some of the most extensive algorithm mapping work has been performed by the BYU group led by Brad Hutchings, which has experimented with most commercially available (and noncommercially available) FPGAs as well as prototype systems such as the HP Teramac [2] and Splash-2 [4]. BYU has demonstrated applications in the following areas: neural networks [15], morphology [48], ATR [35], and genetic algorithms [19]. BYU has also developed a variety of design and implementation strategies [25] and provides tutorials for many different FPGA platforms via their web site: http://splish.ee.byu.edu. A large bibliography of related papers is also available at this site.

BYU's early research agenda was twofold: one, determine what characteristics make an application a good candidate for implementation on an FPGA-based computing platform, and two, research and understand the strengths and weaknesses of current devices, system organizations, and tools. Following up on this basic research, BYU is now in the process of developing new system organizations and application-development strategies that are based upon high-performance circuit libraries, domain-specific compilation, and RTR. BYU also continues to experiment with applications in an effort to find additional applications that can exploit this technology.

John Villasenor and his colleagues at UCLA have demonstrated a video communications system in which a single 5000-gate FPGA was reconfigured four times per image frame to allow compression and transmission of an image [26]. The Mojave project at UCLA, led by John Villasenor and Bill Mangione-Smith, has resulted in several generations of boards and domain-specific design libraries for the ATR application described previously.
These boards included an interface to an embedded processor that performed on-the-fly analysis of results and modified the FPGA configuration sequence accordingly [43, 44].

Researchers including Mohammad Shajaan and John Sorensen of the Technical University of Denmark [38] have examined architectures for performing digital filtering using FPGAs. Because today's FPGAs perform multiplications poorly, much of the attention in filtering using FPGAs has focused on multiply-free implementations. In the future, it is also likely that adaptive filtering algorithms will find application in FPGAs that are partially reconfigured as the filter coefficients evolve.
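The multiply-free idea is easy to illustrate: if each filter coefficient can be written as a small sum of signed powers of two, every tap reduces to shifts and adds, which map well onto FPGA logic. The coefficient decompositions below are invented for illustration, not taken from the cited architectures.

```python
# Sketch of a multiplier-free FIR tap: each coefficient is a list of
# (sign, shift) terms, so x * coeff becomes a few shifts and adds.

def make_tap(terms):
    """terms: list of (sign, shift) pairs; tap(x) = x * sum(sign * 2**shift)."""
    def tap(x):
        return sum(sign * (x << shift) for sign, shift in terms)
    return tap

# Coefficient 6 = 2**3 - 2**1; coefficient 3 = 2**1 + 2**0 (illustrative).
taps = [make_tap([(1, 3), (-1, 1)]), make_tap([(1, 1), (1, 0)])]

def fir(samples):
    """Direct-form FIR over integer samples using the shift-add taps."""
    out = []
    history = [0] * len(taps)
    for s in samples:
        history = [s] + history[:-1]          # shift-register delay line
        out.append(sum(tap(x) for tap, x in zip(taps, history)))
    return out

print(fir([1, 0, 0]))  # impulse response: [6, 3, 0]
```

Adaptive filtering with evolving coefficients would correspond to regenerating the `terms` lists at run time, which is where partial reconfiguration comes in.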
Another area of research focus is in compilers and tools for configurable computing platforms. Ian Page of Oxford University has developed Handel, a programming language that allows programmers to simultaneously develop the FPGA circuit descriptions and processor software with a single description language based on OCCAM [31]. Reiner Hartenstein of the University of
`
`
`
`
▲ 12. A configurable computing board for ATR built at UCLA. The board includes a "dynamic" FPGA that implements template correlations and is rapidly reconfigured during execution, a "static" FPGA for control, SRAM for image data storage, and an EPROM for configuration bitstream storage. The board resides in a host PC and receives images across a PCI bus.
`
`
`
`
Kaiserslautern has developed a machine-level abstraction called the Xputer [21] that also derives the target machine description and its program from the same description. Wayne Luk at Imperial College is investigating formal approaches to FPGA design based on the language RUBY [29]. Transmogrifier-C [17], developed by researchers at the University of Toronto, is a programming approach targeted at Toronto's TM-2 custom platform, which is currently under development [27]. Anant Agarwal and his colleagues at MIT are working on automated programming approaches for very large configurable-computing platforms [6]. In addition, HP developed a very easy-to-use compiler for their Teramac system that automatically partitioned, placed, and routed a netlist of 1-million gates into the nearly 1000 custom FPGAs that formed Teramac [2].
Configurable computing is represented by a growing presence in the commercial world. In addition to FPGA vendors including Xilinx and Altera, there is a rapidly growing list of start-up companies with products that are based on configurable computing. These include Annapolis Microsystems of Annapolis, Maryland, which commercialized the SPLASH-2 architecture; Virtual Computer Corporation of Reseda, California; Morphologic of Nashua, New Hampshire; and Giga Operations of Berkeley.
`
The lack of a sufficiently general high-level software programming model is of course a well-known problem among researchers performing work in configurable computing, and there are many ongoing efforts in which creation of a design tool infrastructure is a goal. Even if such languages can be developed, tested, and adopted, there remains the problem of the "compiler," which in the domain of FPGAs means the tool chain that translates a functional or structural description of the task into a configuration bitstream that fully describes the circuit in the FPGA. FPGA place and route tools have always benefited from place and route techniques used in ASIC design, which involves many of the same challenges and tradeoffs in terms of clock speed, design complexity, etc. However, the several hours needed by current-generation commercial FPGA tools to synthesize, place, and route a design on an FPGA, while fast when viewed in the context of ASIC design, are unacceptably slow when compared to software compilers. To make configurable computing practical will require that FPGA place and route tools be made faster by several orders of magnitude, most likely at the cost of highly suboptimal mappings of tasks into hardware. One exciting, but as yet unproven, approach that has been advocated by William Mangione-Smith of UCLA is dynamic compilation, in which small units of precompiled FPGA configuration bitstreams can be combined extremely quickly at run time to constitute a full FPGA configuration bitstream. There are many challenges in dynamic compilation, not the least of which is the proprietary nature of configuration bitstreams for most commercial FPGAs.
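A purely illustrative software model of dynamic compilation is sketched below. Real configuration bitstream formats are proprietary and device-specific, so the "regions" and byte contents here are invented; the point is only that stitching precompiled fragments is a copy operation, orders of magnitude faster than re-running synthesis, place, and route.

```python
# Toy model of dynamic compilation: precompiled configuration fragments,
# each occupying a disjoint region of the device, are stitched into a full
# configuration image at run time by copying bytes. All names, regions,
# and byte values are hypothetical.

PRECOMPILED = {
    "adder_tree_A": {"region": 0, "bits": b"\x12\x34"},
    "adder_tree_B": {"region": 1, "bits": b"\x56\x78"},
    "shift_reg":    {"region": 2, "bits": b"\x9a\xbc"},
}

def compose(unit_names, num_regions=4, region_size=2):
    """Combine precompiled units into a full configuration image."""
    image = bytearray(num_regions * region_size)   # empty (all-zero) config
    for name in unit_names:
        unit = PRECOMPILED[name]
        start = unit["region"] * region_size
        image[start:start + len(unit["bits"])] = unit["bits"]
    return bytes(image)

print(compose(["adder_tree_A", "shift_reg"]).hex())
```

The cost of `compose` is linear in the bitstream size, with no placement or routing, which is what would make run-time combination "extremely quick."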
As configurable computing advances it is also important to distinguish techniques that are truly new, such as large-scale run-time hardware reconfiguration, from techniques that have existed in computing for many years. Many of the "new" approaches in configurable computing are in fact existing computing concepts that are being implemented in a new domain. For example, the ATR algorithm described previously gains its efficiency from RTR, which can legitimately be claimed as an innovation due to configurable computing, and from mapping target templates into template-specific adder trees, which is an example of the years-old technique of partial evaluation.
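Partial evaluation is easy to illustrate in software: a general correlator specialized for one fixed template yields a residual function that performs only that template's sparse work, just as a template-specific adder tree does in hardware. A minimal sketch (hypothetical code, not the ATR implementation itself):

```python
# Sketch of partial evaluation: specialize a general correlator for a
# fixed binary template, so the residual function touches only that
# template's "on" pixels.

def specialize(template):
    """Partially evaluate the correlator for a fixed binary template,
    returning a residual function of the image window alone."""
    offsets = [(r, c) for r, row in enumerate(template)
               for c, v in enumerate(row) if v]        # fixed at specialization time
    def correlate(window):
        return sum(window[r][c] for r, c in offsets)   # only the sparse work remains
    return correlate

score = specialize([[1, 0], [0, 1]])   # "compile time": template is fixed
print(score([[5, 9], [2, 7]]))         # "run time": 5 + 7 = 12
```

The FPGA analogue is that the loop over `offsets` is flattened into a fixed adder tree when the configuration bitstream is generated.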
`
Conclusions and Future Directions

It is now clear that for applications requiring deeply pipelined, highly parallel, bit-level operations including cryptography, target recognition, and some types of image processing, configurable computing machines offer compelling speed and cost advantages over alternative implementations. For these types of applications, configurable computing machines are likely to become solutions of choice. What is less clear is the extent to which configurable computing techniques will become useful in more general computing environments, in particular for applications that involve high arithmetic complexity. Given the dominance and ever-increasing capabilities of microprocessors for general-purpose computing, it seems highly unlikely that any other computing model, including that offered by configurable computing, will make significant inroads against microprocessors in the foreseeable future. Widespread adoption of configurable computing is also hampered by the lack of exactly what microprocessors possess in abundance: a set of relatively easy to use, widely known software programming languages and associated compilers or interpreters that allow a user with little or no knowledge of the underlying hardware to instruct a computing platform to perform a desired task.
`
`
`
`
In addition to the obvious trend toward larger devices, configurable computing is likely to benefit from architectural innovations both in FPGAs and in the hardware to interface to them. Configurable computing is a young field with enormous potential to grow as FPGAs, their derivatives, and the tools to use them advance. The FPGAs that will be emerging in the next few years will be in excess of half a million equivalent gates, which is large enough to support a very diverse range of applications. In addition, the state of the art in architectures for configurable computing devices will be significantly enriched by the many ongoing research efforts studying architecture issues. Existing and perhaps new vendors of configurable computing devices, who are now well aware of the potential of configurable computing, can be expected to produce devices and the associated tools that will make the FPGAs of today look primitive.
`
John Villasenor is a Professor in UCLA's Electrical Engineering Department in Los Angeles, California. Brad Hutchings is an Associate Professor in Brigham Young University's Electrical and Computer Engineering Department in Provo, Utah.
`
References

1. A.L. Abbott, P.M. Athanas, L. Chen, and R.L. Elliott, "Finding lines and building pyramids with Splash 2." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 155-163, Napa, CA, April 1994.

2. R. Amerson, R. Carter, B. Culbertson, P. Kuekes, and G. Snider, "Teramac - configurable custom computing." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 32-38, Napa, CA, April 1995.

3. J.M. Arnold, "The Splash 2 software environment." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 88-93, Napa, CA, April 1993.

4. J.M. Arnold, D.A. Buell, and E.G. Davis, "Splash 2." In Proceedings of the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 316-324, June 1992.

5. P.M. Athanas and A.L. Abbott, "Real-time image processing on a custom computing platform." IEEE Computer, 28(2):16-24, February 1995.

6. J. Babb, M. Frank, E. Waingold, and R. Barua, "The RAW benchmark suite: Computation structures for general purpose computing." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

7. P. Bertin, D. Roncin, and J. Vuillemin, "Introduction to programmable active memories." In J. McCanny, J. McWhirter, and E. Swartzlander Jr., editors, Systolic Array Processors, pp. 300-309, Prentice Hall, 1989.

8. G. Brebner, "The swappable logic unit: a paradigm for virtual hardware." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

9. J. Burns, A. Donlin, J. Hogg, S. Singh, and M. de Wit, "A dynamic reconfiguration run-time system." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

10. D.A. Clark and B.L. Hutchings, "Supporting FPGA microprocessors through retargetable software tools." In J. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 195-205, Napa, CA, April 1996.

11. A. DeHon, "DPGA-coupled microprocessors: Commodity ICs for the early 21st century." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 31-39, Napa, CA, April 1994.

12. A. DeHon, Reconfigurable Architectures for General-Purpose Computing. PhD thesis, Massachusetts Institute of Technology, September 1996.

13. T. Draper, W. King, J. Trout, and R. Conners, "MORRPH: A modular and reprogrammable real-time processing hardware." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 11-19, Napa, CA, April 1995.

14. C. Ebeling, D.C. Cronquist, and P. Franklin, "RaPiD - reconfigurable pipelined datapath." In Proceedings of the International Workshop on Field-Programmable Logic, FPL'96, pp. 126-135, Darmstadt, Germany, September 1996.

15. J.G. Eldredge and B.L. Hutchings, "Run-time reconfiguration: A method for enhancing the functional density of SRAM-based FPGAs." Journal of VLSI Signal Processing, vol. 12, pp. 67-86, 1996.

16. C.W. Fraser and D.R. Hanson, A Retargetable C Compiler. Benjamin/Cummings, 1995. ISBN 0-8053-1670-1.

17. D. Galloway, "The Transmogrifier C hardware description language and compiler for FPGAs." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 136-144, Napa, CA, April 1995.

18. M. Gokhale and E. Gomersall, "High-level compilation for fine grained FPGAs." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

19. P. Graham and B. Nelson, "A hardware genetic algorithm for the traveling salesman problem on SPLASH 2." In W. Moore and W. Luk, editors, Field-Programmable Logic and Applications, pp. 352-361, Springer, Oxford, England, August 1995.

20. P. Graham and B. Nelson, "Genetic algorithms in software and in hardware: A performance analysis of workstation and custom computing machine implementations." In J. Arnold and K. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 216-225, Napa, CA, April 1996.

21. R.W. Hartenstein, A.G. Hirschbiel, M. Riedmuller, K. Schmidt, and M. Weber, "A novel ASIC design approach based on a new machine paradigm." IEEE Journal of Solid-State Circuits, vol. 26, no. 7, pp. 975-989, July 1991.

22. J.R. Hauser and J. Wawrzynek, "Garp: A processor with a reconfigurable coprocessor." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

23. B. Von Herzen, "Signal processing at 250 MHz using high-performance FPGAs." In ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 62-68, Monterey, CA, February 1997.

24. D.T. Hoang, "Searching genetic databases on Splash 2." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 185-191, Napa, CA, April 1993.

25. B.L. Hutchings and M.J. Wirthlin, "Implementation approaches for reconfigurable logic applications." In W. Moore and W. Luk, editors, Field-Programmable Logic and Applications, pp. 419-428, Springer, Oxford, England, August 1995.

26. B. Schoner, J. Villasenor, and C. Jones, "Video communications using rapidly reconfigurable hardware." IEEE Trans. on Circuits and Systems for Video Technology, pp. 565-567, December 1995.

27. D.M. Lewis, D.R. Galloway, M. van Ierssel, J. Rose, and P. Chow, "The Transmogrifier-2: A 1 million gate rapid prototyping system." In ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 53-61, Monterey, CA, February 1997.

28. D.P. Lopresti, "Rapid implementation of a genetic sequence comparator using field-programmable gate arrays." In C. Sequin, editor, Advanced Research in VLSI: Proceedings of the 1991 University of California/Santa Cruz Conference, pp. 138-152, Santa Cruz, CA, March 1991.

29. W. Luk, "A declarative approach to incremental custom computing." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 164-172, Napa, CA, April 1995.

30. P. Lysaght and J. Stockwood, "A simulation tool for dynamically reconfigurable field programmable gate arrays." IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 4, no. 3, pp. 381-390, September 1996.

31. I. Page and W. Luk, "Compiling occam into FPGAs." In International Workshop on Field Programmable Logic and Applications, pp. 271-283, Oxford, UK, September 1991.

32. G.M. Quenot, I.C. Kraljic, J. Serot, and B. Zavidovique, "A reconfigurable compute engine for real-time vision automata prototyping." In D.A. Buell and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 91-100, Napa, CA, April 1994.

33. R. Bittner and P.M. Athanas, "Computing kernels implemented with a wormhole RTR CCM." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

34. J.M. Rabaey, "Reconfigurable processing: The solution to low-power programmable DSP." In Proceedings of ICASSP'97, Munich, Germany, April 1997, to be published.

35. M. Rencher and B.L. Hutchings, "Automated target recognition on SPLASH-2." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

36. D. Ross, O. Vellacott, and M. Turner, "An FPGA-based hardware accelerator for image processing." In W. Moore and W. Luk, editors, More FPGAs: Proceedings of the 1993 International Workshop on Field-Programmable Logic and Applications, pp. 299-306, Oxford, England, September 1993.

37. H. Schmit, "Incremental reconfiguration for pipelined applications." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to appear.

38. M. Shajaan, K. Hickman, and J.A. Sorensen, "Time-area efficient multiplier-free filter architectures for FPGA implementation." In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 3251-3254, 1995.

39. M. Shand, "Flexible image acquisition using reconfigurable hardware." In P.M. Athanas and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 125-134, Napa, CA, April 1995.

40. N. Shirazi, W. Luk, and P. Cheung, "Compilation tools for run-time reconfigurable designs." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997, to be published.

41. E. Tau, D. Chen, I. Eslick, J. Brown, and A. DeHon, "A first generation DPGA implementation." In FPD'95, the Third Canadian Workshop on Field-Programmable Devices, pp. 138-143, May 1995.

42. S. Trimberger, D. Carberry, A. Johnson, and J. Wong, "A time-multiplexed FPGA." In J.M. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, April 1997.

43. J. Villasenor and W. Mangione-Smith, "Configurable computing." Scientific American, pp. 66-71, June 1997.

44. J. Villasenor, B. Schoner, K.N. Chia, C. Zapata, H.J. Kim, C. Jones, S. Lansing, and B. Mangione-Smith, "Configurable computing solutions for automatic target recognition." In J. Arnold and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 70-79, Napa, CA, April 1996.

45. J. Vuillemin, P. Bertin, D. Roncin, M. Shand, H. Touati, and P. Boucard, "Programmable active memories: Reconfigurable systems come of age." IEEE Trans. on VLSI Systems, vol. 4, no. 1, pp. 56-69, 1996.

46. M.J. Wirthlin and B.L. Hutchings, "DISC: The dynamic instruction set computer." In J. Schewel, editor, Proceedings of the International Society for Optical Engineering (SPIE), Field-Programmable Gate Arrays (FPGAs) for Fast Board Development and Reconfigurable Computing, vol. 2607, pp. 92-103, Philadelphia, PA, October 1995.

47. M.J. Wirthlin and B.L. Hutchings, "A dynamic instruction set computer." In P. Athanas and K.L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pp. 99-107, Napa, CA, April 1995.

48. M.J. Wirthlin and B.L. Hutchings, "Sequencing run-time reconfigured hardware with software." In ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 122-128, Monterey, CA, February 1996.

49. M.J. Wirthlin and B.L. Hutchings, "Improving functional density through run-time constant propagation." In ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 86-92, Monterey, CA, February 1997.
`
`
`
`
`
`
Attachment 3A
`
`
Proceedings of the IEEE
Published monthly by the Institute of Electrical and Electronics Engineers, Inc.
September 1987

SPECIAL ISSUE ON HARDWARE AND SOFTWARE FOR DIGITAL SIGNAL PROCESSING

1139 Scanning the Issue, Sanjit K. Mitra and Kalyan Mondal

PAPERS

1143 The TMS320 Family of Digital Signal Processors, K.-S. Lin, G.A. Frantz, and R. Simar, Jr.
1160 VLSI Processor for Image Processing, M. Sugai, A. Kanuma, K. Suzuki, and M. Noon
1161 Digital Signal Processor for Test and Measurement Environment, A. Kareem, C.L. Ease, F. Etheridge, and D. McKinney
1172 The Graph Search Machine (GSM): A VLSI Architecture for Connected Speech Recognition and Other Applications, S.C. Glinski, T.M. Lalumia, D.R. Cassiday, T. Koh, C. Gerveshi, G.A. Wilson, and J. Kumar
1185 DSP56200: An Algorithm-Specific Digital Signal Processor Peripheral, C.D. Hillman
1192 Parallel Bit