US007225324B2

Huppenthal et al.

(10) Patent No.: US 7,225,324 B2
(45) Date of Patent: May 29, 2007
`
(54) MULTI-ADAPTIVE PROCESSING SYSTEMS
AND TECHNIQUES FOR ENHANCING
PARALLELISM AND PERFORMANCE OF
COMPUTATIONAL FUNCTIONS
`
`(75)
`
`Inventors: Jon M. Huppenthal. Colorado Springs.
`.
`.
`C}O_(USJ:PaVId1 15- Callga, €010me
`firmness (-0 (U5)
`
`(73) Assignee? SRC COMPuK‘l‘Ss lno. Colorado
`Springs. CO (US)
`
( * ) Notice: Subject to any disclaimer, the term of this
      patent is extended or adjusted under 35
      U.S.C. 154(b) by 550 days.
`
(21) Appl. No.: 10/285,318
`
(22) Filed: Oct. 31, 2002
`
`(65)
`
`Prior Publication Data
`
`U S 20(l4f()088527 Al
`
`May 6. 2004
`
`(51)
`
`Int. Cl.
`(2006.01)
`GQ6F 17/00
`............
`(52) U.S. Cl.
`(58)
`Field of Classlfieation search
`
`7121226
`712115,
`712119. 226. 215
`See application lilc for complete search history.
`
`
`
`(56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`
4,727,503 A     2/1988  McWhirter .................. 708/509
4,872,133 A *  10/1989  Leeland
4,962,381 A *  10/1990  Helbig, Sr.
5,020,059 A *   5/1991  Gorin et al.
5,072,371 A    12/1991  Benner et al.
5,230,057 A     7/1993  Shido et al.
5,274,832 A *  12/1993  Khan ........................ 708/524
5,471,627 A    11/1995  Means et al.
5,477,221 A    12/1995  Chang et al.
5,570,040 A    10/1996  Lytle et al.
5,640,586 A *   6/1997  Pechanek et al.
5,737,766 A     4/1998  Tan
5,784,108 A *   7/1998  Skaletzky et al.
5,892,962 A     4/1999  Cloutier
5,956,518 A     9/1999  DeHon et al.
6,023,755 A     2/2000  Casselman
6,052,773 A     4/2000  DeHon et al.
6,061,706 A *   5/2000  Gai et al.
6,076,152 A     6/2000  Huppenthal et al.
6,192,439 B1    2/2001  Grunewald et al.
6,215,898 B1 *  4/2001  Woodfill et al.
6,226,776 B1    5/2001  Panchul et al.
6,289,440 B1 *  9/2001  Casselman .................. 712/227
6,385,757 B1 *  5/2002  Gupta et al.
`
OTHER PUBLICATIONS

Rosenberg, J. M., Dictionary of Computers, Information Processing
& Telecommunications, 1984, John Wiley & Sons, 2d ed., p. 496.

(Continued)
`
Primary Examiner: Eric Coleman
(74) Attorney, Agent, or Firm: William J. Kubida; Michael
C. Martensen; Hogan & Hartson LLP
`
`(57)
`
`ABMRM"
`.
`.
`.
`_
`_
`tor
`Multl—avdaptlve processmg systems and technlques-
`enhancing parallellsm and performance of computatlonal
`functions are disclosed which can be employed in a myriad
`of applications including mulli-dimensional pipeline com-
`putations for seismic applications. search algorithms. infor—
`mation security. chemical and biological applications.
`fil—
`tering and the like as well as
`for systolic wavefront
`computations for fluid flow and structures analysis. hioin-
`lormalics etc. Some applications may also employ both the
`multi—dimensional pipeline and systolic wavefront method—
`010 ies disclosed
`g
`.
`
`52 (flaims, 20 Drawing Sheets
`
`
`
`
`
`
`&3:3:.
`
`
`
`
`taoeeeo
`
`
`
`
`
`
`
`assess
`
`
`
`eases
`
`
`
`
`
`
`
`eases
`
`
`
`
`
`um
`0:09:00‘0
` e
`
`
`
`
`
`
`
`Petitioner Microsoft Corporation - Ex. 1001, p. 1
`
`
`
`
OTHER PUBLICATIONS (Continued)
`
Agarwal, A., et al., "The Raw Compiler Project", pp. 1-12, http://
cag-www.lcs.mit.edu/raw, Proceedings of the Second SUIF Com-
piler Workshop, Aug. 21-23, 1997.
Albaharna, Osama, et al., "On the viability of FPGA-based inte-
grated coprocessors", © 1996 IEEE, Publ. No. 0-8186-7548-9/96,
pp. 206-215.
Amerson, Rick, et al., "Teramac-Configurable Custom Comput-
ing", © 1995 IEEE, Publ. No. 0-8186-7086-X/95, pp. 32-38.
Barthel, Dominique, Aug. 25-26, 1997, "PVP a Parallel Video
coProcessor", Hot Chips IX, pp. 203-210.
Bertin, Patrice, et al., "Programmable active memories: a perfor-
mance assessment", © 1993 Massachusetts Institute of Technology,
pp. 88-102.
Bittner, Ray, et al., "Computing kernels implemented with a worm-
hole RTR CCM", © 1997 IEEE, Publ. No. 0-8186-8159-4/97, pp.
98-105.
Buell, D., et al., "Splash 2: FPGAs in a Custom Computing
Machine-Chapter 1-Custom Computing Machines: An Introduc-
tion", pp. 1-11, http://www.computer.org/espress/catalog/bp07413/
spls-ch1.html (originally believed published in J. of Supercomput-
ing, vol. IX, 1995, pp. 219-230).
Casselman, Steven, "Virtual Computing and The Virtual Com-
puter", © 1993 IEEE, Publ. No. 0-8186-3890-7/93, pp. 43-48.
Chan, Pak, et al., "Architectural tradeoffs in field-programmable-
device-based computing systems", © 1993 IEEE, Publ. No. 0-8186-
3890-7/93, pp. 152-161.
Clark, David, et al., "Supporting FPGA microprocessors through
retargetable software tools", © 1996 IEEE, Publ. No. 0-8186-7548-
9/96, pp. 195-203.
Cuccaro, Steven, et al., "The CM-2X: a hybrid CM-2/Xilinx pro-
totype", © 1993 IEEE, Publ. No. 0-8186-3890-7/93, pp. 121-130.
Culbertson, W. Bruce, et al., "Exploring architectures for volume
visualization on the Teramac custom computer", © 1996 IEEE, Publ.
No. 0-8186-7548-9/96, pp. 80-88.
Culbertson, W. Bruce, et al., "Defect tolerance on the Teramac
custom computer", © 1997 IEEE, Publ. No. 0-8186-8159-4/97, pp.
116-123.
`
Dehon, Andre, "DPGA-Coupled microprocessors: commodity IC
for the early 21st century", © 1994 IEEE, Publ. No. 0-8186-5490-
2/94, pp. 31-39.
Dehon, A., et al., "MATRIX A Reconfigurable Computing Device
with Configurable Instruction Distribution", Hot Chips IX, Aug.
25-26, 1997, Stanford, California, MIT Artificial Intelligence Labo-
ratory.
Dhaussy, Philippe, et al., "Global control synthesis for an MIMD/
FPGA machine", © 1994 IEEE, Publ. No. 0-8186-5490-2/94, pp.
72-81.
Elliott, Duncan, et al., "Computational Ram: a memory-SIMD
hybrid and its application to DSP", © 1992 IEEE, Publ. No.
0-7803-0246-X/92, pp. 30.6.1-30.6.4.
Fortes, Jose, et al., "Systolic arrays, a survey of seven projects", ©
1987 IEEE, Publ. No. 0018-9162/87/0700-0091, pp. 91-103.
Gokhale, M., et al., "Processing in Memory: The Terasys Massively
Parallel PIM Array", © Apr. 1995, IEEE, pp. 23-31.
Gunther, Bernard, et al., "Assessing Document Relevance with
Run-Time Reconfigurable Machines", © 1996 IEEE, Publ. No.
0-8186-7548-9/96, pp. 10-17.
Hagiwara, Hiroshi, et al., "A dynamically microprogrammable
computer with low-level parallelism", © 1980 IEEE, Publ. No.
0018-9340/80/0700-0577, pp. 577-594.
Hartenstein, R. W., et al., "A General Approach in System Design
Integrating Reconfigurable Accelerators", http://xputers.informatik.
uni-kl.de/papers/paper026-1.html, IEEE 1996 Conference, Austin,
TX, Oct. 9-11, 1996.
Hartenstein, Reiner, et al., "A reconfigurable data-driven ALU for
Xputers", © 1994 IEEE, Publ. No. 0-8186-5490-2/94, pp. 139-146.
Hauser, John, et al., "GARP: a MIPS processor with a
reconfigurable co-processor", © 1997 IEEE, Publ. No. 0-8186-
8159-4/97, pp. 12-21.
`
Hayes, John, et al., "A microprocessor-based hypercube
supercomputer", © 1986 IEEE, Publ. No. 0272-1732/86/1100-
0006, pp. 6-17.
Herpel, H.-J., et al., "A Reconfigurable Computer for Embedded
Control Applications", © 1993 IEEE, Publ. No. 0-8186-3890-7/93,
pp. 111-120.
Hogl, H., et al., "Enable++: A second generation FPGA processor",
© 1995 IEEE, Publ. No. 0-8186-7086-X/95, pp. 45-53.
King, William, et al., "Using MORRPH in an industrial machine
vision system", © 1996 IEEE, Publ. No. 0-8186-7548-9/96, pp.
18-26.
`
Manohar, Swaminathan, et al., "A pragmatic approach to systolic
design", © 1988 IEEE, Publ. No. CH2603-9/88/0000/0463, pp.
463-472.
Mauduit, Nicolas, et al., "Lneuro 1.0: a piece of hardware LEGO for
building neural network systems", © 1992 IEEE, Publ. No. 1045-
9227/92, pp. 414-422.
Mirsky, Ethan A., "Coarse-Grain Reconfigurable Computing", Mas-
sachusetts Institute of Technology, Jun. 1996.
Mirsky, Ethan, et al., "MATRIX: A Reconfigurable Computing
Architecture with Configurable Instruction Distribution and
Deployable Resources", © 1996 IEEE, Publ. No. 0-8186-7548-9/
96, pp. 157-166.
Morley, Robert E., Jr., et al., "A Massively Parallel Systolic Array
Processor System", © 1988 IEEE, Publ. No. CH2603-9/88/0000/
0217, pp. 217-225.
Patterson, David, et al., "A case for intelligent DRAM: IRAM", Hot
Chips VIII, Aug. 19-20, 1996, pp. 75-94.
Peterson, James, et al., "Scheduling and partitioning ANSI-C pro-
grams onto multi-FPGA CCM architectures", © 1996 IEEE, Publ.
No. 0-8186-7548-9/96, pp. 178-187.
Schmit, Herman, "Incremental reconfiguration for pipelined appli-
cations", © 1997 IEEE, Publ. No. 0-8186-8159-4/97, pp. 47-55.
Sitkoff, Nathan, et al., "Implementing a Genetic Algorithm on a
Parallel Custom Computing Machine", Publ. No. 0-8186-7086-X/
95, pp. 180-187.
Stone, Harold, "A logic-in-memory computer", © 1970 IEEE, IEEE
Transactions on Computers, pp. 73-78, Jan. 1970.
Tangen, Uwe, et al., "A parallel hardware evolvable computer
POLYP extended abstract", © 1997 IEEE, Publ. No. 0-8186-8159-
4/97, pp. 238-239.
Thornburg, Mike, et al., "Transformable Computers", © 1994 IEEE,
Publ. No. 0-8186-5602-6/94, pp. 674-679.
Tomita, Shinji, et al., "A computer with low-level parallelism QA-2",
© 1986 IEEE, Publ. No. 0-8184-7495/86/0000-0280, pp. 280-289.
Trimberger, Steve, et al., "A time-multiplexed FPGA", © 1997
IEEE, Publ. No. 0-8186-8159-4/97, pp. 22-28.
Ueda, Hirotada, et al., "A multiprocessor system utilizing enhanced
DSP's for image processing", © 1988 IEEE, Publ. No. CH2603-9/
88/0000/0611, pp. 611-620.
Villasenor, John, et al., "Configurable computing", © 1997 Scien-
tific American, Jun. 1997.
Wang, Quiang, et al., "Automated field-programmable compute
accelerator design using partial evaluation", © 1997 IEEE, Publ.
No. 0-8186-8159-4/97, pp. 145-154.
W.H. Mangione-Smith and B.L. Hutchings, "Configurable comput-
ing: The Road Ahead", In Proceedings of the Reconfigurable Archi-
tectures Workshop (RAW'97), pp. 81-96, 1997.
Wirthlin, Michael, et al., "The Nano processor: a low resource
reconfigurable processor", © 1994 IEEE, Publ. No. 0-8186-5490-
2/94, pp. 23-30.
Wirthlin, Michael, et al., "A dynamic instruction set computer", ©
1995 IEEE, Publ. No. 0-8186-7086-X/95, pp. 99-107.
Wittig, Ralph, et al., "OneChip: An FPGA processor with
reconfigurable logic", © 1996 IEEE, Publ. No. 0-8186-7548-9/96,
pp. 126-135.
Yamauchi, Tsukasa, et al., "SOP: A reconfigurable massively par-
allel system and its control-data flow based compiling method", ©
1996 IEEE, Publ. No. 0-8186-7548-9/96, pp. 148-156.
"Information Brief", PCI Bus Technology, © IBM Personal Com-
puter Company, 1997, pp. 1-3.
`
`
`
`
`
`[-1. 1'1: “A distributed memory
`Yun. IIyun-Kyu and Silverman.
`MIMI) multi—compuler with reconfigurable custom computing
`capabilities”. Brown University. Dec. 10-13. 199?. pp. "rt-l3.
`IIoover. Chris and Hart. David; "San Diego Supercomputer Center.
`Tilnelogic and Sttn Validale Ultra—Fast Hidden Markovr Model
`Analysis—One DeCypher—accelerated Sun Fire 6800 heals 2,600
`CPUs running Linux-". San Diego Supercomputer Center. http:.-".r'
`wwwsdsc.odu.-"Pressr’02«"050802 markovmodelhtml. May 8. 2002.
`pp. 1—3.
`Caliga. David and Barker. David Peter. “Delivering Acceleration:
`The Potential for Increased lIPC Application Performance Using
`Reconfigurable Logic”. SRC Computers. Inc.. Nov. 200]. pp. 20.
`Hammes. J.P.. Rinker. R. F..; McClure. D. M.. 361nm. A. P. W..
`Najjar. W. A.._ “The SA-C Compiler Dataflow Description”. Colo-
`rado State University. Jun. 21. 2001. pp. 1—25.
`Callahan. Timothy .I. and Wawrzynek. John. “Adapting Software
`Pipelining for Reconfigrrable Computing”. University of California
`at Berkeley. Nov. I7-l9. 2000. pp. 8.
`Ralha, Nalini K.. Jain. Anil K. and Raver. Diane T.. “An FPGA—
`based Point Pattem Matching Processor with Application to Fin—
`gerprint Matching”. Michigan State University. Department of
`Computer Science. pp. 8.
`Dehon. Andre. “Comparing Computing Machines”. University of
`California at Berkeley. Proceedings of SPIE vol. 3526. Nov. 2-3.
`1998. pp. 11.
`Vemuri. Ranga R. and Harr. Randolph F... “Configurable Comput—
`ing: Technology and Applications“. University of Cincinnati and
`Synopsys Inc.. IEL-‘E. Apr. 2000. pp. 39-40.
`Dehon. Andre. “The Densin Advantage of Configurable Comput—
`ing”. Calilbrnia Inst ilule ol'Tecl'mology. IEEE. Apr. 2000. pp. 4 [-49.
`Ilaynes, Simon 0.. Stone. John. Cheung. Peter YK. and Luk.
`Wayne. “Video Image Processing with the Sonic rmhitecnlre“.
`Sony Broadcast & Professional Europe. Imperial College. Univer—
`sity of London. IEEE. Apr. 2000. pp. 50-52.
`Platzner . Marco. “Reconfigurable Accelerators for Combinatorial
`Problems”. Swiss Federal Institute 01' Technology (ETH) Zurich.
`IEEE. Apr. 2000. pp. 58-60.
`Callahan. Timothy J.. Ilauser. John R. And Wawrzynek. John. “I‘he
`Garp Architecture and C Compiler”. University of California.
`Berkeley. IEEE. April 2000. pp. 62—69.
`Goldstein. Seth C open. Schmit. Ilerman. Budiu . Mihai. Cadambi.
`Srihari. Moe. Matt
`and Taylor. R. Reed.
`“PipeRench: A
`Reconfigurable Architecture and Compiler". Carnegie Mellon Uni—
`versity. ”EFF1 Apr. 2000. pp. 70—76.
`Muchnick. Steven 8.. “Advanced Compiler Desig and Implemen-
`tation”. Morgan Kaufmann Publishers1 pp. 217.
`
`SA-C "l‘o
`“Compiling
`P.. Dissertation
`Jeffrey
`IIammes.
`Reconfigurable Cornpuling Systems”. Colorado State University.
`Department of Computer Science. Summer 2000. pp. [29.
`Automatic Target Recognition. Colorado State University & USAl'.
`http:.-'fwww.cs.coloslale.edu-"carneron."applicationshtml. pp.
`l—3.
`C hodowiec. Pawel. Khuon. Po. Gaj. Kris. Fast Implementations of
`&cret—Key Block C iphers Using Mixed Inner— and Outer—Round
`Pi pelining. George Mason University. Feb. 11-13. 2001. pp. 9.
`Miyamori. 'I'akashi. “REMARC: Reconfigurable Multimedia Array
`Coprocessor". IEICF. Transactions on Information and Systems.
`Information at Systems Society. Tokyo. JP. vol. ESE-D. No. 2. Feb.
`1999. pp. 389-397. XPULKJ821922.
`Gross Thomas. et a] .. “Compilation for a High-performance Systolic
`Array”. Sigplan Notices USA. vol. 21. No. 7'. Jul. 1986. pp. 27—38.
`XP002418525.
`
`Rauchwerger. Lawrence. et at, “The LRPD Test: Speculative Run-
`Time Parallelization of Loops with Privatization and Reduction
`Parallelimlion". IEEE Transaction on Parallel and Dislribltted Sys—
`tems. IEEE Service Center. Los Alamitos. CA. vol. 10. No. 2. Feb.
`1999. pp. 160-180. XP000908318.
`Arnold JetIrey M. et a].. “The Splash 2 Processor and Applications”.
`Computer Design: VLSI in Computers and Processors. [993. ICC D
`‘93 Proceedings. 1993 IEEE International Conference on Cam-
`bridge. MA. Oct. 3-6 1993. Los Alamitos. CA. 11:2le Comput. Soc.
`Oct. 3. 1993. pp. 482—485. XP01013457L
`Hwang. Kai. “Computer Architecture and Parallel Processing”.
`Data Flow Computers and VLSI Computations.
`I985. McGraw
`Hill. Chapter 1'0. pp. 732-807. XP—002418055.
`Hanenstein. Reiner W.. et a]. “A Synthesis System for Bus—based
`Wavefront Array Architectures”. Proceedings. International Conl'er—
`ence on Application-Specific Systems. Architecture and Processors.
`1996. pp. 274-283. XP002132819.
`Alexander. Thomas. el al. “A Reconfigurable Approach To A
`Systolic Sorting Architecture”. ImAS 89. May 8. 1989. pp. 1178-
`1182. XP010084472.
`
`Wu. Youfeng. et at. ”Better Exploration of Region-Level Value
`Locality with Integrated Computalion Reuse and Value Prediction”.
`Proceedings 01' the 28th Inlemalional Symposium on Computer
`Architecture. ISCA 2001. Goteberg. Sweden. Jun. 30-Jul. 4. 2001.
`International Symposium on Computer Architecture, (ISCA). Los
`Alamitos. CA. IEEF. Comp. Soc. US. Jun. 30. 2001. pp. 93—103.
`XP010552866.
`
`* cited by examiner
`
`
`
`
[Drawing Sheets 1-20, U.S. Patent, May 29, 2007, US 7,225,324 B2. The
figure drawings are not reproducible as text; legible labels in the OCR
include FIG. 5B, COMPUTATION PHASES 520, MAP ROUTINE STEPS, and TIME
SET 1 / TIME SET 2.]
`
`
MULTI-ADAPTIVE PROCESSING SYSTEMS
AND TECHNIQUES FOR ENHANCING
PARALLELISM AND PERFORMANCE OF
COMPUTATIONAL FUNCTIONS
`
CROSS REFERENCE TO RELATED PATENT
APPLICATIONS
`
The present invention is related to the subject matter of
U.S. patent application Ser. No. 09/755,344 filed Jan. 5,
2001 for: "Multiprocessor Computer Architecture Incorpo-
rating a Plurality of Memory Algorithm Processors in the
Memory Subsystem" and is further related to the subject
matter of U.S. Pat. No. 6,434,687 for: "System and Method
for Accelerating Web Site Access and Processing Utilizing a
Computer System Incorporating Reconfigurable Processors
Operating Under a Single Operating System Image", all of
which are assigned to SRC Computers, Inc., Colorado
Springs, Colo. and the disclosures of which are herein
specifically incorporated in their entirety by this reference.
`
COPYRIGHT NOTICE/PERMISSION
`
A portion of the disclosure of this patent document may
contain material which is subject to copyright protection.
The copyright owner has no objection to the facsimile
reproduction by anyone of the patent document or the patent
disclosure as it appears in the United States Patent and
Trademark Office patent file or records, but otherwise
reserves all copyright rights whatsoever. The following
notice applies to the software and data described below,
inclusive of the drawing figures where applicable: Copy-
right © 2000, SRC Computers, Inc.
`
`BACKGROUND OF THE INVENTION
`
The present invention relates, in general, to the field of
computing systems and techniques. More particularly, the
present invention relates to multi-adaptive processing sys-
tems and techniques for enhancing parallelism and perfor-
mance of computational functions.
Currently, most large software applications achieve high
performance operation through the use of parallel process-
ing. This technique allows multiple processors to work
simultaneously on the same problem to achieve a solution in
a fraction of the time required for a single processor to
accomplish the same result. The processors in use may be
performing many copies of the same operation, or may be
performing totally different operations, but in either case all
processors are working simultaneously.
The use of such parallel processing has led to the prolif-
eration of both multi-processor boards and large scale clus-
tered systems. However, as more and more performance is
required, so is more parallelism, resulting in ever larger
systems. Clusters exist today that have tens of thousands of
processors and can occupy football fields of space. Systems
of such a large physical size present many obvious down-
sides including, among other factors, facility requirements,
power, heat generation and reliability.
`
SUMMARY OF THE INVENTION
`
However, if a processor technology could be employed
that offers orders of magnitude more parallelism per pro-
cessor, these systems could be reduced in size by a compa-
rable factor. Such a processor or processing element is
possible through the use of a reconfigurable processor.
Reconfigurable processors instantiate only the functional
units needed to solve a particular application, and as a result,
have available space to instantiate as many functional units
as may be required to solve the problem up to the total
capacity of the integrated circuit chips they employ.
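The capacity argument above can be put in rough numbers with a short sketch. The gate counts here are invented round figures for illustration only, not data from this patent:

```python
# How many copies of a single functional unit fit on a reconfigurable chip
# that dedicates its entire gate budget to that one unit (toy model).

def instantiable_units(chip_gate_capacity: int, gates_per_unit: int) -> int:
    """Number of functional-unit copies that fit in the gate budget."""
    return chip_gate_capacity // gates_per_unit

# A multi-million-gate part and a hypothetical 5,000-gate multiplier unit:
print(instantiable_units(3_000_000, 5_000))  # 600 parallel units
```

A fixed microprocessor, by contrast, exposes only the handful of functional units it was fabricated with, regardless of what the application needs.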
At present, reconfigurable processors, such as multi-
adaptive processor elements (MAP™, a trademark of SRC
Computers, Inc.) can achieve two to three orders of magni-
tude more parallelism and performance than state-of-the-art
microprocessors. Through the advantageous application of
adaptive processing techniques as disclosed herein, this type
of reconfigurable processing parallelism may be employed
in a variety of applications resulting in significantly higher
performance than that which can now be achieved while
using significantly smaller and less expensive computer
systems.
However, in addition to these benefits, there is an addi-
tional, much less obvious one that can have even greater
impact on certain applications and has only become avail-
able with the advent of multi-million gate reconfigurable
chips. Performance gains are also realized by reconfigurable
processors due to the much tighter coupling of the parallel
functional units within each chip than can be accomplished
in a microprocessor based computing system.
In a multi-processor, microprocessor-based system, each
processor is allocated but a relatively small portion of the
total problem, called a cell. However, to solve the total
problem, results of one processor are often required by many
adjacent cells because their cells interact at the boundary,
and upwards of six or more cells, all having to interact to
compute results, would not be uncommon. Consequently,
intermediate results must be passed around the system in
order to complete the computation of the total problem. This,
of necessity, involves numerous other chips and busses that
run at much slower speeds than the microprocessor, thus
resulting in system performance often many orders of mag-
nitude lower than the raw computation time.
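The boundary traffic just described can be sketched with a toy model. Everything below is an illustrative assumption (a 1-D row of cells, two boundary values crossing each internal chip boundary per step), not the patent's system; it only shows why off-chip exchanges grow with the number of chips:

```python
# Toy halo-exchange model: a 1-D row of cells is split across chips, and
# each update step sends one boundary value in each direction across every
# internal chip boundary.

def off_chip_transfers(num_cells: int, cells_per_chip: int, steps: int) -> int:
    """Boundary values forced off-chip over the whole computation."""
    num_chips = -(-num_cells // cells_per_chip)   # ceiling division
    internal_boundaries = num_chips - 1
    return 2 * internal_boundaries * steps

# Microprocessor cluster: one cell per processor chip.
print(off_chip_transfers(num_cells=1000, cells_per_chip=1, steps=100))
# -> 199800 off-chip transfers

# Adaptive processor: all 1000 cells held as functional units on one chip.
print(off_chip_transfers(num_cells=1000, cells_per_chip=1000, steps=100))
# -> 0 off-chip transfers
```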
On the other hand, in the use of an adaptive processor-
based system, since ten to one thousand times more com-
putations can be performed within a single chip, any bound-
ary data that is shared between these functional units need
never leave a single integrated circuit chip. Therefore, data
moving around the system, and its impact on reducing
overall system performance, can also be reduced by two or
three orders of magnitude. This will allow both significant
improvements in performance in certain applications as well
as enabling certain applications to be performed in a prac-
tical timeframe that could not previously be accomplished.
Particularly disclosed herein is a method for data process-
ing in a reconfigurable computing system comprising a
plurality of functional units. The method comprises: defin-
ing a calculation for the reconfigurable computing system;
instantiating at least two of the functional units to perform
the calculation; utilizing a first of the functional units to
operate upon a subsequent data dimension of the calculation
and substantially concurrently utilizing a second of the
functional units to operate upon a previous data dimension
of the calculation.
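A sequential sketch of the two-unit overlap recited above: while the second functional unit finishes data dimension k, the first already operates on dimension k+1. In hardware both units run concurrently on every clock; this Python version only models the dataflow ordering, and the two stage functions are invented placeholders:

```python
def stage_a(x):
    """First functional unit (operates on the subsequent data dimension)."""
    return x * 2

def stage_b(x):
    """Second functional unit (operates on the previous data dimension)."""
    return x + 1

def pipelined(dimensions):
    """Overlap stage A on dimension i+1 with stage B on dimension i."""
    results, in_flight = [], None
    for d in dimensions:
        a_out = stage_a(d)                       # unit A: current dimension
        if in_flight is not None:
            results.append(stage_b(in_flight))   # unit B: previous dimension
        in_flight = a_out
    if in_flight is not None:
        results.append(stage_b(in_flight))       # drain the pipeline
    return results

print(pipelined([1, 2, 3]))  # [3, 5, 7]
```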
Further disclosed herein is a method for data processing
in a reconfigurable computing system comprising a plurality
of functional units. The method comprises: defining a first
systolic wall comprising rows of cells forming a subset of
the plurality of functional units; computing a value at each
of the cells in at least a first row of the first systolic wall;
communicating the values between cells in the first row of
the cells to produce updated values; communicating the
updated values to a second row of the first systolic wall; and
`
`30
`
`.
`
`40
`
`.
`
`50
`
`60
`
`Petitioner Microsoft Corporation - Ex. 1001, p. 24
`Petitioner Microsoft Corporation - EX. 1001, p. 24
`
`
`
`3
`
`4
`
`US 7,225,324 B2
`
substantially concurrently providing the updated values to a
first row of a second systolic wall of rows of cells in the
subset of the plurality of functional units.
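The systolic-wall recitation above can be sketched as rows of cells updating values and handing them downward, with the last row of one wall feeding the first row of the next. The neighbor-averaging update rule is an invented stand-in for whatever computation the cells actually perform:

```python
def update_row(row):
    """Each cell combines with its left/right neighbors (toy update rule)."""
    n = len(row)
    return [
        (row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]

def systolic_walls(first_row, rows_per_wall, num_walls):
    """Propagate a row of values through successive walls of systolic cells."""
    row = list(first_row)
    for _ in range(num_walls):
        for _ in range(rows_per_wall):
            row = update_row(row)  # hand updated values to the next row
        # the final row's values feed the first row of the next wall
    return row

print(systolic_walls([0.0, 3.0, 0.0], rows_per_wall=1, num_walls=1))
# [1.0, 1.0, 1.0]
```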
Also disclosed herein is a method for data processing in
a reconfigurable processing system which includes setting
up a systolic processing form employing a speculative
processing strategy.
`
BRIEF DESCRIPTION OF THE DRAWINGS
`
The aforementioned and other features and objects of the
present invention and the manner of attaining them will
become more apparent and the invention itself will be best
understood by reference to the following description of a
preferred embodiment taken in conjunction with the accom-
panying drawings, wherein:
FIG. 1 is a simplified functional block diagram of a typical
clustered inter-processor communications path in a conven-
tional multi-processor computing system;
FIG. 2 is a functional block diagram of an adaptive
processor communications path illustrating the many func-
tional units ("FU") interconnected by reconfigurable routing
resources within the adaptive processor chip;
FIG. 3A is a graph of the actual performance improve-
ment versus the number of processors utilized and illustrat-
ing the deviation from perfect scalability of a particular
application utilizing a conventional multi-processor com-
puting system such as that illustrated in FIG. 1;
FIG. 3B is a corresponding graph of the actual perfor-
mance improvement versus the number of processors uti-
lized and illustrating the performance improvement over a
conventional multi-processor computing system utilizing an
adaptive processor-based computing system such as that
illustrated in FIG. 2;
FIG. 4A is a simplified logic flowchart illustrating a
conventional sequential processing operation in which
nested Loops A and B are alternately active on different
phases of the process;
FIG. 4B is a comparative, simplified logic flowchart
illustrating multi-dimensional processing in accordance with
the technique of the present invention wherein multiple
dimensions of data are processed by both Loops A and B
such that the computing system logic is operative on every
clock cycle;
FIG. 5A is illustrative of a general process for performing
a representative multi-dimensional pipeline operation in the
form of a seismic migration imaging function utilizing the
parallelism available in the utilization of the adaptive pro-
cessing techniques of the present invention;
FIG. 5B is a follow-on illustration of the computation
phases employed in implementing the exemplary seismic
migration imaging function of the preceding figure;
FIG. 6A is a simplified logic flowchart for a particular
seismic migration imaging application illustrative of the
parallelism provided in the use of an adaptive processor-
based computing system;
FIG. 6B illustrates the computational process which may
be employed by a microprocessor in the execution of the
seismic imaging application of the preceding figure;
FIG. 6C illustrates the first step in the computational
process which may be employed by an adaptive processor in
the execution of the seismic imaging application of FIG. 6A
in which a first shot (S1) is started;
FIG. 6D illustrates the second step in the same compu-
tational process for the execution of the seismic imaging
application of FIG. 6A in which a second shot (S2) is started;
`
FIG. 6E illustrates the third step in the same computa-
tional process for the execution of the seismic imaging
application of FIG. 6A in which the operation on the first and
second shots is continued through compute;
FIG. 6F illustrates the fourth step in the same computa-
tional process showing the subsequent