`(10) Patent No.:
`(12) Unlted States Patent
`Huppenthal et a].
`(45) Date of Patent:
`*Nov. 17, 2009
`
`
`USOO7620800B2
`
`(54) MULTI-ADAPTIVE PROCESSING SYSTEMS
`AND TECHNIQUES FOR ENHANCING
`PARALLELISM AND PERFORMANCE OF
`COMPUTATIONAL FUNCTIONS
`
`(75)
`
`Inventors: Jon M. Huppenthal, Colorado Springs,
`CO (US); David E. Caliga, Colorado
`Springs, CO (US)
`
`(73) Assignee: SRC Computers, Inc., Colorado
`Springs, CO (US)
`
`( * ) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 95 days.
`
`This patent is subject to a terminal dis-
`claimer.
`
`(21) Appl.No.: 11/733,064
`.
`Flledi
`
`APE 9: 2007
`
`(22)
`
`(65)
`
`Prior Publication Data
`US 2007/0204131A1
`Aug. 30, 2007
`
`Related U.S.Application Data
`(63) Continuation of application No. 10/285,318, filed on
`Oct. 31, 2002, now Pat. No. 7,225,324.
`
`(51)
`
`Int. Cl.
`G06F 15/82
`712/226
`(52) US. Cl.
`(58) Field of ClasslficatlonSearch""""""""""""" 712/226
`712/15, 19, 215
`See application file for complete search history.
`
`(2006.01)
`
`(56)
`
`References Cited
`
`Us PATENT DOCUMENTS
`4 727 503 A
`2/1988 McWhirter
`4,763,294 A
`8/1988 Fong
`4,872,133 A
`10/1989 Leeland
`4,962,381 A
`10/1990 Helbig, Sr.
`
`5,020,059 A
`5,072,371 A
`5,230,057 A
`5,274,832 A
`5,471,627 A
`5,477,221 A
`
`5/1991 Gorin etal.
`12/1991 Benner et al.
`7/1993 Shido et 31.
`12/1993 Khan
`11/1995 Means et al.
`12/1995 Chang et a1.
`
`(Continued)
`FOREIGN PATENT DOCUMENTS
`
`JP
`
`59'206972
`
`11/1984
`
`.
`(Continued)
`OTHER PUBLICATIONS
`
`a1
`ital Si
`Gaudiot, Jean-Luc, Data-Driven Multicom uters in Di
`gn
`g
`P
`Processing,l987, IEEE, Proceedings 0fthe IEEE, vol. 75,N0. 9, pp.
`1220-1234.*
`
`d
`C t'
`( Onmue)
`Primary ExamineriEric Coleman
`(74) Attorney, Agent, or FirmiMichael C. Martensen;
`William J‘ Kubida; Hogan & Hartson LLP
`(57)
`ABSTRACT
`
`.
`.
`.
`.
`for
`Multi-adaptive process1ng systems and techmques
`enhancing parallelism and performance of computational
`functions are disclosed which can be employed in a myriad of
`applications including multi-dimensional pipeline computa-
`tions for seismic applications, search algorithms, information
`security, chemical and biological applications, filtering and
`the like as well as for systolic wavefront computations for
`fluid flow and structures analysis, bioinformatics etc. Some
`applications may also employ both the multi-dimensional
`pipeline and systolic wavefront methodologies disclosed.
`
`52 Claims, 20 Drawing Sheets
`
`93 O
`
`
`
`9‘?
`0..
`
`0:00'
`
`
`0:.
`C99’9
` o
`
`0:.0:.‘9.0
`.9
`
`0:0.:.‘30:0
`9 9 9 9
`:0 __0:.0:00:0e
`
`
`Awmuu.MINEmacaw:
`
`
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 1
`
`ADAPTIVE PROCESSOR CHIP
`
`
`
`INYEIOONNEG'
`
`
`
`Petitioner Microsoft Corporation - EX. 1005, p. 1
`
`
`
`US 7,620,800 B2
`
`Page 2
`
`U.S. PATENT DOCUMENTS
`
`5,509,134 A
`5,570,040 A
`5,640,586 A
`5,715,453 A
`5,737,766 A
`5,784,108 A
`5,802,290 A
`5,892,962 A
`5,903,771 A
`5,915,123 A
`5,953,502 A
`5,956,518 A
`5,966,534 A
`6,023,755 A
`6,052,773 A
`6,061,706 A
`6,076,152 A
`6,128,663 A
`6,192,439 B1
`6,215,898 B1
`6,226,776 B1
`6,289,440 B1
`6,385,757 B1
`6,704,816 B1
`6,721,884 B1
`
`4/1996 Fandrich etal.
`10/1996 Lytle etal.
`6/1997 Pechanek et al.
`2/1998 Stewart
`4/1998 Tan
`7/1998 Skaletzkyetal.
`9/1998 Casselman
`4/1999 Cloutier
`5/1999 Sgro et a1.
`6/1999 Mirskyetal.
`9/1999 Helbig, Sr.
`9/1999 DeHonet a1.
`10/1999 Cooke et a1.
`2/2000 Casselman
`4/2000 DeHon et a1.
`5/2000 Gaiet a1.
`6/2000 Huppenthalet al.
`10/2000 Thomas
`2/2001 Grunewald et al.
`4/2001 Woodfillet a1.
`5/2001 Panchulet a1.
`9/2001 Casselman
`5/2002 Gupta et a1.
`3/2004 Burke
`4/2004 De Oliveira Kastrup Pereira et al.
`
`FOREIGN PATENT DOCUMENTS
`
`JP
`
`63-086079
`
`4/1988
`
`OTHER PUBLICATIONS
`
`Dennis,J. B., Data Flow Supercomputers, Nov. 1980, IEEE, Com-
`puter, pp. 48-56.*
`Qunnn M.J., et a1., Data-Parallel Programming on Multicomputers,
`Sep. 1990, IEEE, pp. 69-76.*
`Trevleaven, P.C., et a1., Data-Driven and Demand-Driven Computer
`Architecture, 1982, ACM, Computiing Surveys vol. 14, No. 1, pp.
`93-143.*
`Webster. M., Webster’ s Ninth New Collegiate Dictionary, 1985, Mer-
`riam-Webster pub., p. 627.*
`Miyamori, Takashi, “REMARC: Reconfigurable Multimedia Array
`Coprocessor”, IEICE Transactions on Information and Systems,
`Information & Systems Society, Tokyo, JP, vol. E82-D, No. 2, Feb.
`1999, pp. 389-397, XP000821922.
`Gross Thomas, et a1., “Compilation for a High-performance Systolic
`Array”, Sigplan Notices USA, vol. 21, No. 7, Jul. 1986, pp. 27-38,
`XP002418625.
`Rauchwerger, Lawrence, et al., “The LRPD Test: Speculative Run-
`Time Parallelization of Loops with Privatization and Reduction
`Parallelization”, IEEE Transactions on Parallel and Distributed Sys-
`tems, IEEE Service Center, Los Alamitos, CA, vol. 10, No. 2, Feb.
`1999, pp. 160-180, XP000908318.
`Arnold Jeffrey M. et al., “The Splash 2 Processor and Applications”,
`Computer Design: VLSI in Computers and Processors, 1993, ICCD
`’93 Proceedings, 1993 IEEE International Conference on Cam-
`bridge, MA, Oct. 3-6, 1993, Los Alamitos, CA, IEEE Comput. Soc.,
`Oct. 3, 1993, pp. 482-485, XP010134571.
`Hwang, Kai, “Computer Architecture and Parallel Processing”, Data
`Flow Computers and VLSI Computations, 1985, McGraw Hill,
`Chapter 10, pp. 732-807, XP-002418655.
`Hartenstein, Reiner W., et al. “A Synthesis System for Bus-based
`Wavefront Array Architectures”, Proceedings, International Confer-
`ence on Application-Specific Systems, Architectures and Processors,
`1996, pp. 274-283, XP002132819.
`Alexander, Thomas, et al. “A Reconfigurable Approach To A Systolic
`Sorting Architecture”, ISCAS 89, May 8, 1989, pp. 1178-1182,
`XP010084477.
`Wu, Youfeng, et al. “Better Exploration ofRegion-Level Value Local-
`ity with Integrated Computation Reuse and Value Prediction”, Pro-
`ceedings of the 28th International Symposium on Computer Archi-
`tecture,
`ISCA 2001, Goteborg, Sweden, Jun. 30-Jul. 4, 2001,
`
`International Symposium on Computer Architecture, (ISCA), Los
`Alamitos, CA, IEEE Comp. Soc, US, Jun. 30, 2001, pp. 93-103,
`XP010552866.
`Babb, Jonathan, et a1., “Parallelizing applications into silicon”, ©
`1999 IEEE.
`Deshpande, Deepali, et a1., “Hybrid data/configuration caching for
`striped FPGAs” © 1999 IEEE.
`Purna, Karthikeya, et a1., “Temporal partitioning and sched uling data
`flow graphs for reconfigurable computers”, © 1999 IEEE, Publ. No,
`0018-9340/99, pp. 579-590.
`Gibbs, W. Wayt, “Blitzing bits”, © 1999 Scientific American Pre-
`sents, pp. 57-61.
`Gonzalez, Ricardo, “Configurable and extensible processors change
`system design”, Aug. 15-17, 1999, Hot Chips 11 Tutorials, pp. 135-
`146.
`Graham, Paul, et a1., “FPGA-based sonar processing”, © 1998 ACM
`0-89791-978-5/98, pp. 201-208.
`Hasebe, A., et a1., “Architecture of SIPS, a real time image processing
`system,” © 1988 IEEE, Publ. No. CH2603-9/88/0000/0621, pp. 621-
`630.
`Hammond, Lance, et al., “The Stanford Hydra CMP”, Aug. 15-17,
`1999 H ot Chips 11 Tutorials, pp. 23-31.
`Jean, Jack, et a1., “Dynamic reconfiguration to support concurrent
`applications”, © 1999 IEEE, Publ. No. 0018-9340/99, pp. 591-602.
`Kastrup, Bernardo, et a1., “Concise: a compiler-driven CPLD-based
`instruction set accelerator”, © 1999 IEEE.
`Motomura, Masato, et al., “An embedded DRAM-FPGA chip with
`instantaneous logic reconfiguration”, © 1998 IEEE, Publ. No.
`0-8186-8900-5/98, pp. 264-266.
`McConnell, Ray, “Massively parallel computing on the Fuzion chip”,
`Aug. 15-17, 1999, Hot Chips 11 Tutorials, pp. 83-94.
`McShane, Erik, et a1., “Functionally integrated systems on a chip:
`technologies, architectures, CAD tools, and applications”, © 1998
`IEEE, Pub1.No. 8-8186-8424-0/98, pp. 67-75.
`Rupp, Charley, et al., “The NAPA adaptive processing architecture”,
`© 1998 the Authors, pp. 1-10.
`Saito, Osamu, et al., “A 1M synapse self learnin g digital neural
`network chip”, ©0 1998 IEEE, Publ. No. 0-7803-4344/1/98, pp.
`94-95.
`
`Schott, Brian, et a1., “Architectures for system-level applications of
`adaptive computing”, © 1999 IEEE.
`Mencer, Oskar, et a1., “PAM-Blox: High Performance FPGA Design
`for Adaptive Computing”, © 1998 IEEE, Conference Paper, INSPEC
`Abstract Nos. B9611-1265B-044, C9811-5210-009.
`Miyamori, Takashi, et al., “A quantitative analysis of reconfigurable
`coprocessors for multimedia applications”, © 1998 IEEE, Confer-
`ence Paper, INSPECAbstractNos. B9811-1265F-011, C 9811-5310-
`010.
`
`Agarwal, A., et al., “The Raw Compiler Project”, pp. 1-12, http://cag-
`www.lcs.mit.edu/raw, Proceedings of the Second SUIF Compiler
`Workshop, Aug. 21-23, 1997.
`Albaharna, Osama, et a1., “On the viability of FPGA-based integrated
`coprocessors”, © 1996 IEEE, Publ. No. 0-8186-7548-9/96, pp. 206-
`215.
`
`Amerson, Rick, et a1., “TeramaciConfigurable Custom Comput-
`ing”, © 1995 IEEE, Publ. No. 0-8186-7086-X/95, pp. 32-38.
`Barthel, Dominique Aug. 25-26, 1997, “PVP a Parallel Video
`coProcessor”, Hot Chips IX, pp. 203-210.
`Bertin, Patrice, et a1., “Programmable active memories: a perfor-
`mance assessment”, © 1993 Massachusetts Institute of Technology,
`pp. 88-102.
`Bittner, Ray, et a1., “Computing kernels implemented with a worm-
`hole RTR CCM”, © 1997 IEEE, Publ. No. 0-8186-8159-4/97, pp.
`98-105.
`
`Buell, D., et al. “Splash 2: FPGAs in a Custom Computing
`MachineiChapter liCustom Computing Machines: An Introduc-
`tion”, pp. 1-11, http://www.computer.org/espress/catalog/bp07413/
`spls-ch1.html (originally believed published in J. of Supercomput-
`ing, vol. IX, 1995, pp. 219-230.
`Casselman, Steven, “Virtual Computing and The Virtual Com-
`puter”,© 1993 IEEE, Publ. No. 0-8186-3890-7193, pp. 43-48.
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 2
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 2
`
`
`
`US 7,620,800 B2
`
`Page 3
`
`Chan, Pak, et a1., “Architectural tradeoffs in field-programmable-
`device-based computing systems”, © 1993 IEEE, Publ, No. 0-8186-
`3890-7/93, pp. 152-161.
`Clark, David, et a1., “Supporting FPGA microprocessors through
`retargetable software tools”, © 1996 IEEE, Publ. No. 0-8186-7548-
`9/96, pp. 195-103.
`Cuccaro, Steven, et al., “The CM-2X: a hybrid CM-2/Xilink proto-
`type”, © 1993 IEEE, Publ. No. 0-8186-3890-7/93, pp. 121-130.
`Culbertson, W. Bruce, et a1., “Exploring architectures for volume
`visualization on the Teramac custom computer”, © 1996 IEEE, Publ.
`No. 0-8186-7548-9/96, pp. 80-88.
`Culbertson, W. Bruce, et a1., “Defect tolerance on the Teramac cus-
`tom computer”, © 1997 IEEE, Publ. No. 0-8186-8159-4/97, pp.
`1 16-123.
`Dehon, Andre, DPGA-Coupled microprocessors: commodity IC for
`the early 21“ century’,© 1994 IEEE, Publ. No. 0-8186-5490-2/94,
`pp. 31-39.
`Dehon, A., et a1., “Matrix a Reconfigurable Computing Device with
`Configurable Instruction Distribution”, Hot Chips IX, Aug. 25-26,
`1997, Stanford, California, MIT Artificial Intelligence Laboratory.
`Dhaussy, Philippe, et a1., “Global control synthesis for an MIMD/
`FPGA machine”, © 1994 IEEE, Publ. No. 0-8186-5490-2/94, pp.
`72-81.
`Elliott, Duncan, et a1., “Computational Ram: a memory-SIMD
`hybrid and its application to DSP”,© 1992 IEEE, Publ. No. 0-7803-
`0246-X/92, pp. 30.6.1-30.6.4.
`Fortes, Jose, et a1., “Systolic arrays, a survey of seven projects”, ©
`1987 IEEE, Pub1.No. 0018-9162/87/0700-0091, pp. 91-103.
`Gokhale, M., et a1., “Processing in Memory: The Terasys Massively
`Parallel PIM Array” © Apr. 1995, IEEE, pp. 23-31.
`Gunther, Bernard, et a1., “Assessing Document Relevance with Run-
`Time Reconfigurable Machines”, © 1996 IEEE, Publ. No. 0-8186-
`7548-9/96, pp. 10-17.
`Hagiwara, Hiroshi, et al., “A dynamically microprogrammable com-
`puter with low-level parallelism”, © 1980 IEEE, Publ. No. 0018-
`9340/80/07000-0577, pp. 577-594.
`Hartenstein, R. W., et al. “A General Approach in System Design
`Integrating Reconfigurable Accelerators,” http://xputers.informatik.
`uni-kl.de/papers/paper026-1.htrnl, IEEE 1996 Conference, Austin,
`TX, Oct. 9-11, 1996.
`Hartenstein, Reiner, et al., “A reconfigurable data-driven ALU for
`Xputers”, © 1994 IEEE, Pub1.No. 0-8186-5490-2/94, pp. 139-146.
`Hauser, John, et a1.: “GARP: a MIPS processor with a reconfigurable
`co-processor”, © 1997 IEEE, Publ. No. 0-08186-8159-4/97, pp.
`12-21.
`“A microprocessor-based hypercube,
`al.,
`et
`John,
`Hayes,
`supercomputer”, © 1986 IEEE, Pub1.No. 0272-1732/86/1000-0006,
`pp. 6-17.
`Herpel, H. -J., et al., “A Reconfigurable Computer for Embedded
`Control Applications”, © 1993 IEEE, Publ. No. 0-8186-3890-7/93,
`pp. 1 11-120.
`Hogl, H., et a1., “Enab1e++: A second generation FPGA processor”,
`© 19951EEE,Pub1.No.0-8186-7086-X/95,pp.45-53.
`King, William, et al., “Using MORRPH in an industrial machine
`vision system”. © 1996 IEEE, Publ. No. 08186-7548-9/96, pp.
`18-26.
`Manohar, Swaminathan, et al., “A pragmatic approach to systolic
`design”, © 1988 IEEE, Publ. No. CH2603-9/88/0000/0463, pp. 463-
`472.
`Mauduit, Nicolas, et a1., “Lneuro 1.0: apiece of hardware LEGO for
`building neural network systems,” © 1992 IEEE, Publ. No. 1045-
`9227/92, pp. 414-422.
`Mirsky, Ethan A., “Coarse-Grain Reconfigurable Computing”, Mas-
`sachusetts Institute of Technology, Jun. 1996.
`Mirsky, Ethan, et a1., “Matrix: A Reconfigurable Computing Archi-
`tecture with Configurable Instruction Distribution and Deployable
`Resources”, © 1996 IEEE, Publ. No. 0-8186-7548-9/96, pp. 157-
`166.
`Morley, Robert E., Jr., et al., “A Massively Parallel Systolic Array
`Processor System”,© 1988 IEEE, Publ. No. CH2603-9/88/0000/
`0217, pp. 217-225.
`Patterson, David, et al., “A case for Intelligent DRAM: IRAM”, Hot
`Chips VIII, Aug. 19-20, 1996, pp. 75-94.
`
`Peterson, Janes, et a1., “Scheduling and partitloning ANSI-C pro-
`grams onto multi-FPGA CCM architectures”, © 1996 IEEE, Publ.
`No. 0-8186-7548-9/96, pp. 178-187.
`Schmit, Herman, “Incremental reconfiguration for pipelined appli-
`cations,” © 1997 IEEE, Publ. No. 0-8186-8159-4/97, pp. 47-55.
`Sitkoff, Nathan, et al., “Implementing a Genetic Algorithm on a
`Parallel Custom Computing Machine”, Publ. No. 0-8186-7086-X/
`95, pp. 180-187.
`Stone, Harold, “A logic-in-memory computer”, © 1970 IEEE, IEEE
`Transactions on Computers, pp. 73-78, Jan. 1990.
`Tangen, Uwe, et al., “A parallel hardware evolvable computer
`POLYP extended abstract”, © 1997 IEEE, Publ. No. 0-8186-8159/
`4/97, pp. 238-239.
`Thornburg, Mike, et a1., “Transformable Computers”, © 1994 IEEE,
`Pub1.No. 0-8186-5602-6/94, pp. 674-679.
`Tomita, Shinji, et al., “A computer low-level parallelism QA-2”, ©
`1986 IEEE, Publ. No. 0-0384-7495/86/0000/0280, pp. 280-289.
`Trimberger, Steve, et al., “A time-multiplexed FPGA”, © 1997 IEEE,
`Publ. No. 0-8186-8159-4/97, pp. 22-28.
`Ueda, Hirotada, et al., “A multiprocessor system utilizing enhanced
`DSP’s for Image processing”, © 1988 IEEE, Publ. No. CH2603-9/
`88/0000/0611, pp. 611-620.
`Villasenor, John, et a1., “Configurable computing”, © 1997 Scientific
`American, Jun. 1997.
`Wang, Quiang, et a1., “Automated field-programmable compute
`accelerator design using partial evaluation”, © 1997 IEEE, Publ. No.
`0-8186-8159-4/97, pp. 145-154.
`W.H. Manglone-Smith and BL. Hutchings. Configurable comput-
`ing: The Road Ahead. In Proceedings of the Reconfigurable Archi-
`tectures Workshop (RAW’97), pp. 81-96, 1997.
`Wirth1in, Michael, et al., “The Nano processor: a low resource
`reconfigurable processor”, © 1994 IEEE, Publ. No. 0-8186-5490-2/
`94, pp. 23-30.
`Wirth1in, Michael, et al., “A dynamic instruction set computer”, ©
`1995 IEEE, Publ. No. 0-8186-7086-X/95, pp. 99-107.
`Wittig, Ralph, et
`a1., “One Chip: An FPGA processor with
`reconfigurable logic”, © 1996 IEEE, Publ. No. 0-8186-7548-9/96,
`pp. 126-135.
`Yamauchi, Tsukasa, et a1., “SOP: Areconfigurable massively parallel
`system and its control-data flow based compiling method”, © 1996
`IEEE, Pub1.No. 0-8186-7548-9/96, pp. 148-156.
`“Information Brief”, PCI Bus Technology, © IBM Personal Com-
`puter Company, 1997, pp. 1-3.
`Yun, Hyun-Kyu and Silverman, H. F.; “A distributed memory MIMD
`multi-computer with reconfigurable custom computing capabilities”,
`Brown University, Dec. 10-13, 1997, pp. 7-13.
`Hoover, Chris and Hart, David; “San Diego Supercomputer Center,
`Timelogic and Sun Validate Ultra-Fast Hidden Markov Model Analy-
`sis-One DeCypher-accelerated Sun Fire 6800 beats 2,600 CPUs run-
`ning Linux-”, San Diego Supercomputer Center, http://www.sdsc.
`edu/Press/02/050802imarkovmode1.htm1, May 8, 2002, pp. 1-3.
`Caliga, David and Barker, David Peter, “Delivering Acceleration:
`The Potential for Increased HPC Application Performance Using
`Reconfigurable Logic”, SRC Computers, Inc., Nov. 2001, pp. 20.
`Hammes, J.P., Rinker, R. E., McClure, D. M., Bohm, A. P. W., Najjar,
`W. A, “The SA-C Compiler Dataflow Description”, Colorado State
`University, Jun. 21, 2001, pp. 1-25.
`Callahan, Timothy J. and Wawrzynek, John, “Adapting Software
`Pipelining for Reconfigurable Computing”, University of California
`at Berkeley, Nov. 17-19, 2000, pp. 8.
`Ratha, Nalini K., Jain, Anil K. and Rover, Diane T., “An FPGA-based
`Point Pattern Matching Processor with Application to Fingerprint
`Matching”, Michigan State University, Department ofComputer Sci-
`ence, pp. 8.
`Dehon, André, “Comparing Computing Machines”, University of
`California at Berkeley, Proceedings of SPIE vol. 3526, Nov. 2-3,
`1998, pp. 11.
`Vemuri, Ranga R. and Harr, Randolph E., “Configurable Computing:
`Technology and Applications”, University of Cincinnati and
`Synopsys Inc., IEEE, Apr. 2000, pp. 39-40.
`Demon, Andre, “The Density Advantage of Configurable Comput-
`ing”, California Institute of Technology, IEEE, Apr. 2000. pp. 41-49.
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 3
`
`Petitioner Microsoft Corporation - Ex. 1005 , p. 3
`
`
`
`US 7,620,800 B2
`Page 4
`
`Haynes, Simon D., Stone, John, Cheung, PeterY.K. and Luk, Wayne,
`“Video Image Processing with the Sonic Architecture”, Sony Broad-
`cast & Professional Europe, Imperial College, University of London,
`IEEE, Apr. 2000, pp. 50-57.
`Platzner , Marco, “Reconfigurable Accelerators for Combinatorial
`Problems”, Swiss Federal Institute of Technology (ETH) Zurich,
`IEEE, Apr. 2000, pp. 58-60.
`Callahan, Timothy J., Hauser, John R. and Wawrzynek, John, “The
`Garp Architecture and C Compiler”, University of California, Ber-
`keley, IEEE, Apr. 2000. pp. 62-69.
`Goldstein, Seth Copen, Schmit, Herman, Budiu , Mihai, Cadambi,
`Srihari, Moe, Matt
`and Taylor, R. Reed,
`“PipeRench: A
`Reconfigurable Architecture and Compiler”, Carnegie Mellon Uni-
`versity, IEEE, Apr. 2000, pp. 70-76.
`Muchnick, Steven S., “Advanced Compiler Design and Implementa-
`tion”, Morgan Kaufmann Publishers, pp. 217.
`SA-C to
`Hammes,
`Jeffrey
`P., Dissertation
`“Compiling
`Reconfigurable Computing Systems”, Colorado State University,
`Department of Computer Science, Summer 2000, pp. 179.
`
`Automatic Target Recognition, Colorado State University & USAF,
`http://www.cs.colostate.edu/cameron/applications.html, pp. 1-3.
`Chodowiec, Pawel, Khuon, Po, Gaj, Kris, Fast Implementations of
`Secret-Key Block Ciphers Using Mixed Inner- and Outer-Round
`Pipelining, George Mason University, Feb. 11-13, 2001, pp. 9.
`Hastie, Neil, et al., “The Implementation of Hardware Subroutines on
`Field Programmable Gate Arrays”, XPO 1 000 5485, Plessey Semicon-
`ductors, Tamerton Rd., Plymouth, Devon, England, IEEE, May 13,
`1990, Custom Integrated Circuits Conference, pp. 314. 1-4. *the
`whole document*.
`Harbaum, Till, et al., “Design of a Flexible Coprocessor Unit”, Insti-
`tute
`of Operating
`Systems
`and Computer Networks,
`XP000879556TU Braunschweig, Germany, Proceedings of the
`Euromicro Conference, Sep. 1999, pp. 335-342. *whole document*.
`Mathias P C; Patnaik L M: “Systolic Evaluation of Polynomial
`Expressions,” IEEE Transactions on Computers, vol. 39, No. 5, May
`1, 1990, pp. 653-665, XP000116659.
`
`* cited by examiner
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 4
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 4
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 1 of 20
`
`US 7,620,800 B2
`
`tq3.2mFat1a.
`
`m:
`
`m
`
`mlIll.all2v?~22:2%02::0::2::as:8v?
`
`
`
`mam3%onEzwormammewpzoEcoo?
`
`
`
`>m0§w§>IOEME
`
`mm._._0m._.zoomwjomkzoo
`
`o:oz<E0555.0:oz<E055.
`
`
`
`
`
`2m?mooEmIEoéomormogmszoé
`
`2ozo_,vrrwooimoumkz.mowmooamémfiz.owowo
`
`
`
`man—Em.MODEm
`
`
`
`$530$530
`
`mDI
`
`feta/m52$»sz
`
`Oz_mm._.m3._0
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 5
`
`Petitioner Microsoft Corporation - EX. 1005, p. 5
`
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 2 of 20
`
`US 7,620,800 B2
`
`
`
`N5...“..a
`
`FUWZZOUKE—s.
`
`4420:.an
`
`dawn“,
`
`max—LU
`
`9N
`
`wow
`
`Now00.00.00.0.
`e906mom000.00.0Q
`
`CC0.0.0.0.9333”
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 6
`
`Petitioner Microsoft Corporation - EX. 1005, p. 6
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 3 of 20
`
`US 7,620,800 B2
`
`n32
`
`ZO_F<0_.EQ<
`
`ifiasxxom
`
`w02<§m0umwa
`
`ommfimoz.,
`
`
`
`..Esmfigm
`
`howumwa
`
`\\
`
`
`
`\C:.m340w
`
`ZO_H<U_..QQ<\
`
`lNBWBAOHdWI BONVWHOAHEId
`
`mm.9...
`
`
`
`mmmemUOma.oZ
`
`.53QOEQ5at
`
`.l‘
`
`1‘
`
`Esmfigw
`
`zo:<ozmn.<\x
`
`Powumwm
`
`t_._.m<4<ow
`
`iNEWBAOHdWI BONVWHOJBBd
`
`mewwaOmmdz
`Petitioner Microsoft Corporation - Ex. 1005, p. 7
`
`Petitioner Microsoft Corporation - EX. 1005, p. 7
`
`
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 4 of 20
`
`US 7,620,800 B2
`
`EmuxEEanas>3502052925
`
`20mxm0>><@004-
`
`20wxm0>>mQOOJ.wZO_mzmS=D
`
`_.wm<Ia
`
`02052925
`
`20mme>>¢QOOJ-Nwmdfn.
`
`20mxmogm£00..-N20.mzw§_o
`
`ca—uc-oq -----Iucané-gocuo-uuonoooooouu---—QQ..U--—-——-.--.-—-¢-u
`
`IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
`
`<Q00!—
`
`mmoot.
`
`3%
`
`llllllllllllllllllllllllllllllllllllll
`
`t<3.3m5at
`
`OOV
`
`..w>_.r0<z_..<£00...Nww<Im
`
`.m>.._.0<=m100'.-
`
`IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
`maoooWX;m
`Petitioner Microsoft Corporation - Ex. 1005, p. 8
`
`Petitioner Microsoft Corporation - EX. 1005, p. 8
`
`..w>_._.0<2__.mn50.“.
`
`:m>..ro<u4n00.—-Fww<Ia
`<aooomm:mIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
`
`
`
`US. Patent
`
`Nov. 17, 2009
`
`Sheet 5 0f 20
`
`US 7,620,800 B2
`
`IMAGE O,—
`
`I!)
`
`502
`
`
`
`MAP
`
`STEPBd
`
`ROUTINE
`
`
`
`506
`
`508
`
`RECEIVER
`
`VELOCITY
`
`SOURCE
`
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 9
`
`Petitioner Microsoft Corporation - EX. 1005, p. 9
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 6 of 20
`
`US 7,620,800 B2
`
`
`
`+u..._mEs255%;..
`
`mm.9“.
`
`/
`
`
`
`wmm<raZO.H<PD&EOU
`
`mNmNwm
`
`I
`
`613532>1EE<§
`
`3&um
`
`mzfiDOm
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 10
`
`Petitioner Microsoft Corporation - EX. 1005, p. 10
`
`
`
`U.S. Patent
`
`9
`
`US 7,620,800 B2
`
`AENV>D<WK
`
`
`
`maz2,Aa”:ummmmM9mW>t004w>n>
`
`
`
`Dam—u.mmeuwmum
`
`
`
`ad:558%
`
`mmw
`
`
`
`ADNEAONW9mm
`
`
`
`mwofimIhawomw>On50;
`
`Em
`
`n—E
`
`«mm
`
`whOIwmw>OQOOJ
`
`x1522
`
`owm
`
`own
`
`mN:
`
`393.Mm7II:
`Iw>EESz+uEkfiz%I«3«mmxESE
`
`Nm
`
`v8m
`
`+u._aE<2
`
`-uEEsz
`
`v3
`
`omw
`
`C1]
`‘0
`
`.93
`LL
`
`3a
`
`«6.9”.
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 11
`
`Petitioner Microsoft Corporation - EX. 1005, p. 11
`
`
`
`
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 8 of 20
`
`US 7,620,800 B2
`
`0 t
`
`o
`
`.23
`LL
`
` 618
`
`IMAGEz1,,
`IMAGEZ
`
` MAPTRId~
`READwzm)
`
`
` LOOPOVERDEPTHSLICES
`
`
`
`
`
`5(Zm).R(Zm)
`
`
`
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 12
`
`514
`
`522
`
`622
`
`52,
`
`'1'"
`
`I...
`
`O I C
`
`D
`
`}-—~
`[:23
`<1:
`t3
`
`2::
`Lori
`BEE
`35f
`mt!)
`mo
`>3:
`v03:
`%%
`
`9
`
`
`Petitioner Microsoft Corporation - EX. 1005, p. 12
`
`
`
`MAPSTEP1
`
`
`
`2B
`
`0.m8mi
`
`U.S. Patent
`
`0N
`
`7
`
`0
`
`7SU
`
`m,
`
`
`
` m5mmmmmN...N:.N:mANV>9mmANE.EmsSmo5
`
`
`
`HmmAouiiw9mm
`
`2,N8
`
`m30%Emma$5030..
`
`AmeN51wEfiw5N1meas).
`
`m
`
`WHOIwmw>0@001.
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 13
`
`Petitioner Microsoft Corporation - EX. 1005 , p. 13
`
`
`
`
`
`
`US. Patent
`
`Nov. 17, 2009
`
`Sheet 10 0f 20
`
`US 7,620,800 B2
`
`Fig.6E
`
` 82
`
` 614
`
`
` CONTINUE81AND82THROUGHCOMPUTE
`
`
`READV(Z,,,)
` MAPTRI_d-
`
`
`
`
`LOOPOVERDEPTHSLICES
`
`3(Zm).R(an)
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 14
`
`622
`
`
`
`LOOPOVERSHOTS
`
`624
`
`
`
`Petitioner Microsoft Corporation - EX. 1005, p. 14
`
`
`
`US. Patent
`
`Nov. 17, 2009
`
`Sheet 11 0120
`
`US 7,620,800 B2
`
`650
`
`FfiLGF
`
` 614
`READV(Z,,,) STEP4
` 8(Zm).R(Z,,,)
` LOOPOVERDEPTHSLICES LOOPOVER
`
`
`SHOTS
`
`
`MAPTRIx
`
`622
`
`624
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 15
`
`Petitioner Microsoft Corporation - EX. 1005, p. 15
`
`
`
`U.S. Patent
`
`US 7,620,800 B2
`
`mNSown5.
`
`
`U,mmwmwwo:mIhmm—D5:.“.04.2.mw>0NWDZ<Pm
`
`
`m0ZO_P<UOQONEom<>>2>>00NIPmnzfizou
`
`«599mwwoswEmma55moo;
`
`mnmfimso
`
`WFOImmm>OQOOJ
`
`5:;
`
`‘0
`
`93
`LL
`
`+u|_mE<s_
`
`flu
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 16
`
`Petitioner Microsoft Corporation - EX. 1005, p. 16
`
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 13 of 20
`
`US 7,620,800 B2
`
`0:.
`
`v: NC.
`
`
`
`ME2.Owwamw34<>
`
`
`
`EOmEkaaEOU
`
`ijo02.5%:ng
`
`x:z.
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 17
`
`Petitioner Microsoft Corporation - EX. 1005, p. 17
`
`
`
`
`US. Patent
`
`Nov. 17, 2009
`
`Sheet 14 0f 20
`
`US 7,620,800 B2
`
`0 '
`
`fi
`.9
`U.
`
`TIMESET1
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 18
`
`Petitioner Microsoft Corporation - EX. 1005, p. 18
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 15 of 20
`
`US 7,620,800 B2
`
`Nhwwmgk
`
`P#wwmeat.
`
`
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 19
`
`Petitioner Microsoft Corporation - EX. 1005, p. 19
`
`
`
`
`US. Patent
`
`Nov. 17, 2009
`
`Sheet 16 0f 20
`
`US 7,620,800 B2
`
`
`
`
`
`
`
`
`
`sflflflHHEEEEEEQ
`
`Iflflflflflflflflflflfl
`50900952
`
`aw
`
`=-INFINITY
`
`gain...‘u-ggg_§.=
`
`
`
`Eflflflflflflli-fiufllfifi
`annuallaannnllaa
`
`§
`9;,
`”‘
`
`
`E HE
`
`,
`{3/ EIEHHEIII HE
`I °°°
`Petitioner Microsoft Corporation - Ex. 1005, p. 20
`
`w E
`
`-#-
`
`
`Petitioner Microsoft Corporation - EX. 1005, p. 20
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 17 of 20
`
`US 7,620,800 B2
`
`whaafioomi»2.waDww34<>
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 21
`
`Petitioner Microsoft Corporation - EX. 1005, p. 21
`
`
`
`
`U.S. Patent
`
`Nov. 17, 2009
`
`Sheet 18 of 20
`
`US 7,620,800 B2
`
`.mek
`
`D-_.co-
`
`65k
`
`All!
`
`g
`
`6er
`
`agag25...:
`
`ms...—
`
`anon.9...aj«33u:taxm.5m:25:8E
`uramxm.#3»smoumwa
`
`\.03
`
`~33
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 22
`
`Petitioner Microsoft Corporation - EX. 1005, p. 22
`
`
`
`
`
`
`
`
`US. Patent
`
`Nov. 17, 2009
`
`Sheet 19 0f 20
`
`US 7,620,800 B2
`
`5:
`
`.93
`u.
`
`§
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 23
`
`Petitioner Microsoft Corporation - EX. 1005, p. 23
`
`
`
`US. Patent
`
`Nov. 17 2009
`
`Sheet 20 onO
`
`US 7,620,800 B2
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 24
`
`
`
`US 7,620,800 B2
`
`1
`MULTI-ADAPTIVE PROCESSING SYSTEMS
`AND TECHNIQUES FOR ENHANCING
`PARALLELISM AND PERFORMANCE OF
`COMPUTATIONAL FUNCTIONS
`
`CROSS REFERENCE TO RELATED PATENT
`APPLICATIONS
`
`The present application is a Continuation of US. patent
`application Ser. No. 10/285,318 filed Oct. 31, 2002 which is
`related to the subject matter of US. patent application Ser.
`No. 09/755,744 filed Jan. 5, 2001 for: “Multiprocessor Com-
`puterArchitecture Incorporating a Plurality ofMemory Algo-
`rithm Processors in the Memory Subsystem” and is further
`related to the subject matter of US. Pat. No. 6,434,687 for:
`“System and Method for Accelerating Web Site Access and
`Processing Utilizing a Computer System Incorporating
`Reconfigurable Processors Operating Under a Single Oper-
`ating System Image”, all of which are assigned to SRC Com-
`puters, Inc., Colorado Springs, Colo. and the disclosures of
`which are herein specifically incorporated in their entirety by
`this reference.
`
`COPYRIGHT NOTICE/PERMISSION
`
`A portion of the disclosure of this patent document may
`contain material which is subject to copyright protection. The
`copyright owner has no objection to the facsimile reproduc-
`tion by anyone ofthe patent document or the patent disclosure
`as it appears in the United States Patent and Trademark Office
`patent file or records, but otherwise, reserves all copyright
`rights whatsoever. The following notice applies to the soft-
`ware and data and described below, inclusive of the drawing
`figures where applicable: Copyright © 2000, SRC Comput-
`ers, Inc.
`
`BACKGROUND OF THE INVENTION
`
`The present invention relates, in general, to the field of
`computing systems and techniques. More particularly, the
`present invention relates to multi-adaptive processing sys-
`tems and techniques for enhancing parallelism and perfor-
`mance of computational functions.
`Currently, most large software applications achieve high
`performance operation through the use ofparallel processing.
`This technique allows multiple processors to work simulta-
`neously on the same problem to achieve a solution in a frac-
`tion of the time required for a single processor to accomplish
`the same result. The processors in use may be performing
`many copies of the same operation, or may be performing
`totally different operations, but in either case all processors
`are working simultaneously.
`The use of such parallel processing has led to the prolif-
`eration of both multi-processor boards and large scale clus-
`tered systems. However, as more and more performance is
`required, so is more parallelism, resulting in ever larger sys-
`tems. Clusters exist today that have tens of thousands of
`processors and can occupy football fields of space. Systems
`of such a large physical size present many obvious down-
`sides, including, among other factors, facility requirements,
`power, heat generation and reliability.
`
`SUMMARY OF THE INVENTION
`
`However, if a processor technology could be employed that
`offers orders of magnitude more parallelism per processor,
`these systems could be reduced in size by a comparable
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`
`factor. Such a processor or processing element is possible
`through the use of a reconfigurable processor. Reconfigurable
`processors instantiate only the functional units needed to
`solve a particular application, and as a result, have available
`space to instantiate as many functional units as may be
`required to solve the problem up to the total capacity of the
`integrated circuit chips they employ.
`At present, reconfigurable processors, such as multi-adap-
`tive processor elements (MAPTM, a trademark of SRC Com-
`puters, Inc.) can achieve two to three orders of magnitude
`more parallelism and performance than state-of-the-art
`microprocessors. Through the advantageous application of
`adaptive processing techniques as disclosed herein, this type
`ofreconfigurable processing parallelism may be employed in
`a variety of applications resulting in significantly higher per-
`formance than that which can now be achieved while using
`significantly smaller and less expensive computer systems.
`However, in addition to these benefits, there is an addi-
`tional much less obvious one that can have even greater
`impact on certain applications and has only become available
`with the advent of multi-million gate reconfigurable chips.
`Performance gains are also realized by reconfigurable pro-
`cessors due to the much tighter coupling of the parallel func-
`tional units within each chip than can be accomplished in a
`microprocessor based computing system.
`In a multi-processor, microprocessor-based system, each
`processor is allocated but a relatively small portion ofthe total
`problem called a cell. However, to solve the total problem,
`results of one processor are often required by many adjacent
`cells because their cells interact at the boundary and upwards
`of six or more cells, all having to interact to compute results,
`would not be uncommon. Consequently, intermediate results
`must be passed around the system in order to complete the
`computation of the total problem. This, of necessity, involves
`numerous other chips and busses that run at much slower
`speeds than the microprocessor thus resulting in system per-
`formance often many orders ofmagnitude lower than the raw
`computation time.
`On the other hand, in the use of an adaptive processor-
`based system, since ten to one thousand times more compu-
`tations can be performed within a single chip, any boundary
`data that is shared between these functional units need never
`
`leave a single integrated circuit chip. Therefore, data moving
`around the system, and its impact on reducing overall system
`performance, can also be reduced by two or three orders of
`magnitude. This will allow both significant improvements in
`performance in certain applications as well as enabling cer-
`tain applications to be performed in a practical timeframe that
`could not previously be accomplished.
`Particularly disclosed herein is a method for data process-
`ing in a reconfigurable computing system comprising a plu-
`rality of functional units. The method comprises: defining a
`calculation for the reconfigurable computing system; instan-
`tiating at least two of the functional units to perform the
`calculation; utilizing a first of the functional units to operate
`upon a subsequent data dimension of the calculation and
`substantially concurrently utilizing a second ofthe functional
`units to operate upon a previous data dimension of the calcu-
`lation.
`
`Further disclosed herein is a method for data processing in
`a reconfigurable computing system comprising a plurality of
`functional units. The method comprises: defining a first sys-
`tolic wall comprising rows of cells forming a subset of the
`plurality of functional units; computing a value at each of the
`cells in at least a first row of the first systolic wall; commu-
`nicating the values between cells in the first row of the cells to
`produce updated values; communicating the updated values
`
`Petitioner Microsoft Corporation - Ex. 1005, p. 25
`
`Petitioner Microsoft Corporation