throbber
UTILITY
`
`PATENT APPLICATION
`TRANSMITTAL
`{Only {or new nenpmvisional applications under
`3'." CFR 1 .53ib11
`
`First Inventor
`
`Title
`
`Jon M. Huppenthal et at.
`
`"'I’iitlJLTl-ADAPTNE PROCESSING
`srsrems AND TECHNIQUES FOR
`ENHANClNG PARALLELiSM AND
`
`PERFORMANCE OF COMPUTATIONafiL.
`FUNCTIONS
`
`_
`
`llllllllllllllllllllllllllll2.},
`
`APPLICATION ELEMENTS
`
`Fee Transmittal Form
`[submit an original and a duplicate for tee processing}
`Applicant claims small entity status.
`See 3? CFR 1.27
`total pages _ ‘34__
`Specification [
`[preferred Arrangemenl set Ionh below]
`— Descriptive title ot the Invention
`— Cross References to Related Applications
`- Statement Regarding Fed sponsored Raid]
`- Reference to sequence listing. a table, or a
`computer program listing appendix
`. Background of the invention
`- Brief Summary of the invention
`- Brief Description of the Drawings
`— Detailed Description
`- Claimts)
`- Abstract of the Disciosure
`
`4.
`5.
`
`IE Drawing(s)[ total Sheets 20 ]
`Oath or Declaration [
`total pages _,3__
`lZl UNEXECUTED [original or copy)
`a.
`13. 1:] COpy from prior appl. (3? C.F.R. §1.63{d)}
`{for continuationfdlvisionnl with Box 13 completed)
`i. DDELETION OF INVENTOELSJ
`Signed statement attached deleting
`inventorts] named in prior application,
`see a? c.r-.R. §§ t 53mg; andt.33(bj.
`
`Assistant Commissioner for Patents
`Box Patent Application
`Washington. DC 20231
`
`G. I: Application Data Sheet. {See 3? CFR 1,?6]
`
`i'. E] CID—ROM or CD~R in duplicate, large table
`or Computer Program (Appendix)
`8. Nucleotide andior Amino Acid Sequence Submission
`(if applicable, all necessary)
`Computer Readable Forrn
`
`Spcciiication Sequence Listing on:
`i.
`[:l (ID—ROM or CD-R (2 copies): or
`ii. [:I paper
`Statements verifying identity of above copies
`ACCOMPANYING APPLICATION PARTS
`9. E] Assignment Papers (coversheefldocumenttsll
`10_ [:1
`3? CFR. 3.73m) Statement
`E]
`iner of
`{when there is an assignee)
`Attorney
`English Translation Docu merit
`
`iDS & Form PTOJSBEOSA
`
`CI Copies 01 IDS
`Citations
`
`11D
`12.13
`
` 10/31/02
`
`Preliminary Amendment
`13. El
`14.
`[2] Return Receipt Postcard (MPEP 503)
`15. 1:] Certified Copy ol Priority Documentls)
`16.
`[:1 Nonpublication Request Under 35 USC
`122(h}(2)(B)(i).Applicant must attach form
`PTOESBI'BS
`Other: Certificate of Mailin-
`
`1?.
`
`If a CONTINUING APPLICATION. check appropriate box. and supply the requisite information below and in a
`18.
`preliminary amendment, or in an Application Data Sheet under 37 CFR 1.76:
`
`{:1 Continuation El Divisional El Continuation-in-parHClP)
`Prior application information:
`Examiner:
`
`of prior apptication No.: _ i
`Group/Arr Um};
`59LCON—lawn”““79" QB_DIVlSlDNfl..APP§ only. The enllre disclosure of the [MIDI applicalinn_ lrorn which an mm or ueuarairon Is supplied under Box 5h. ‘5 CDI’ISJdQKEt'J a
`part or Inn disclosure at the occonmanying :mt-nuatiun rir dwisional application and is hereby inc
`Gilt-“Hated by relercnce The incorporat-on gt)" only he relied upon when a
`- w nor: has been inadvertent! onwtted Form the suornilleu a treatise one.
`
`Customer Number or Bar Code Label 25235
`
`or [I Correspondence addeS below
`
`19. CORRESPONDENCE ADDRESS
`
`
`
`
`
`\'\\(_,’5 - SD-It'l-INIDIS 56592 vl
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 1
`
`Petitioner Microsoft Corporation - EX. 1002, p. 1
`
`

`

` .
`
`PATENT
`EXPRESS MAIL NO. EV035493558US
`Attorney Docket No. SR0015
`Clienthatter No. 804040018
`IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
`
`In re Application of:
`
`Jon M. Huppcnthal and David E. Caliga
`
`Serial No. NEW
`
`Filed: Herewith
`
`For: MULTI—ADAPTIVE PROCESSING
`
`SYSTEMS AND TECHNIQUES FOR
`ENHANCING PARALLELISM AND
`PERFORMANCE OF
`COMPUTATIONAL FUNCTIONS
`
`
`
`Box PATENT APPLICATION
`Assistant Commissioner for Patents
`
`Washington, DC. 20231
`
`Sir:
`
`msesww
`
`The undersigned hereby certifies that the following documents:
`Utility Patent Application Transmittal;
`Utility Patent Application;
`Unexecuted Declaration for Utility Patent Application;
`20 sheets of drawings;
`Return postcard; and
`Certificate of Mailing By Express Mail
`.
`relating to the above application, were deposited as "Express Mail", Mailing Label
`No. EVO35493558US, with the United States Postal Service, addressed to Box PATENT
`AgPL5CATION, The Assistant Commissioner for Patents, Washington, D.C., 20231,
`i
`gflhg 7032- .
`
`
`
`bl O$fiw ¢mL
`Date
`
`Date
`
` _
`434’ @zo/icw 2,001;
`
`F 2..
`
`. _‘_‘. 4'
`
`ubida, Reg. No. 29,664
`William J.
`HOGAN & HARTSON LL?
`One Tabor Center
`
`1200 17th Street, Suite 1500
`Denver. Colorado 80202
`(719) 448-5909 Tel
`(303) 899—7333 Fax
`
`\\\C5 - 8040410018 . 55552 v]
`
`
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 2
`
`Petitioner Microsoft Corporation - EX. 1002, p. 2
`
`

`

`a? 5'” -,;..=
`"
`.iwlzfii iii-I... 4...:.
`"
`
`
`Attorney Docket No. SRC015
`Client NO. 80404.0018
`Express Mail No. EVO3549355805
`
`PATENT
`
`MULTIwADAPTIVE PROCESSING SYSTEMS AND TECHNIQUES FOR
`ENHANCING PARALLELISM AND PERFORMANCE OF COMPUTATIONAL
`FUNCTIONS
`
`CROSS REFERENCE TO RELATED PATENT APPLICATIONS
`
`The present
`
`invention is related to the subject
`
`matter of United States Patent Application Ser. No.
`
`09/755,744 filed January 5, 2001 for: “Multiprocessor
`
`Computer Architecture Incorporating a Plurality of
`
`Memory Algorithm Processors in the Memory Subsystem”
`
`and is further related to the subject matter of United
`
`States Patent No. 6,454,687 for: “System and Method
`
`for Accelerating Web Site Access and Processing
`
`Utilizing a Computer System Incorporating
`
`Reconfigurable Processors Operating Under a Single
`
`Operating System Image", all of which are assigned to
`
`SRC Computers,
`
`Inc., Colorado Springs, Colorado and
`
`the disclosures of which are herein specifically
`
`incorporated in their entirety by this reference-
`
`COPYRIGHT NOTICE/PERMISSION
`
`A portion of the disclosure of this patent
`
`document may contain material which is subject to
`
`copyright protection.
`
`The copyright owner has no
`
`objection to the facsimile reproduction by anyone of
`
`the patent document or the patent disclosure as it
`
`appears in the United States Patent and Trademark
`
`Office patent file or records, but otherwise, reserves
`
`all copyright rights whatsoever.
`
`The following notice
`
`applies to the software and data and described below,
`
`inclusive of the drawing figures where applicable:
`
`Copyright © 2000, SRC Computers,
`
`Inc.
`
`10
`
`15
`
`20
`
`25
`
`\\\C5 - 8040610018 - Self-E v1.
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 3
`
`Petitioner Microsoft Corporation - EX. 1002, p. 3
`
`

`

`m aa
`
`BACKGROUND OF THE INVENTION
`
`The present invention relates,
`
`in general,
`
`to the
`
`field of computing systems and techniques. More
`
`particularly,
`
`the present invention relates to multi—
`
`adaptive processing systems and techniques for
`
`enhancing parallelism and performance of computational
`functions.
`
`Currently, most
`
`large software applications
`
`achieve high performance operation through the use of
`
`parallel processing. This technique allows multiple
`
`processors to work simultaneously on the same problem
`to achieve a solution in a fraction of the time
`
`required for a single processor to accomplish the same
`
`result.
`
`The processors in use may be performing many
`
`copies of the same operation, or may be performing
`
`totally different operations, but
`
`in either case all
`
`prooessors are working simultaneously.
`
`The use of such parallel processing has led to
`
`the proliferation of both multi—processor boards and
`
`large scale clustered systems. However, as more and
`
`more performance is required,
`
`so is more paralleliSm,
`
`resulting in ever larger systems. Clusters exist
`
`today that have tens of thousands of processors and
`
`can occupy football fields of space.
`
`Systems of such
`
`a large physical size present many obvious downsides,
`
`including,
`
`among other factors, facility requirements,
`
`power, heat generation and reliability.
`SUMMARY OF THE INVENTION
`
`However,
`
`if a processor technology could be
`
`employed that offers orders of magnitude more
`
`parallelism per processor,
`
`these systems could be
`
`reduced in size by a comparable factor-
`
`Such a
`
`processor or processing element is possible through
`
`10
`
`15
`
`20
`
`25
`
`30
`
`\\\Cfi
`
`* BOiO‘fUtIIH - 56156 V1
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 4
`
`Petitioner Microsoft Corporation - EX. 1002, p. 4
`
`

`

`:3;
`'Er‘v- '13-: £3.11; aft: 12? W7.
`=4?
`mam-ml no... min 40“ Wu -.u.--4...:~
`
`_
`
`a»
`
`'_ -
`
`
`the use of a reconfigurable processor. Reconfigurable
`
`processors instantiate only the functional units
`
`needed to solve a particular application, and as a
`
`result, have available space to instantiate as many
`
`functional units as may be required to solve the
`
`problem up to the total capacity of the integrated
`
`circuit chips they employ.
`
`At present,
`
`reconfigurable processors, such as
`
`multi—adaptive processor elements {MAPm, a trademark
`
`of SRC Computers,
`
`Inc.) can achieve two to three
`
`orders of magnitude more parallelism and performance
`
`than state—of—the—art microprocessors.
`
`Through the
`
`advantageous application of adaptive processing
`
`techniques as disclosed herein,
`
`this type of
`
`reconfigurable processing parallelism may be employed
`
`in a variety of applications resulting in
`
`significantly higher performance than that which can
`
`now be achieved while using significantly smaller and
`
`less expensive computer systems.
`
`However,
`
`in addition to these benefits,
`
`there is
`
`an additional much less obvious one that can have even
`
`greater impact on certain applications and has only
`
`become available with the advent of multi—million gate
`
`reconfigurable chips. Performance gains are also
`
`realized by reconfigurable processors due to the much
`
`tighter coupling of the parallel functional units
`
`within each chip than can be accomplished in a
`
`microprocessor based computing system.
`
`In a multi—processor, microprocessor—based
`
`system, each processor is allocated but a relatively
`
`small portion of the total problem called a cell.
`
`However,
`
`to solve the total problem, results of one
`
`processor are often required by many adjacent cells
`
`because their cells interact at the boundary and
`
`10
`
`15
`
`20
`
`25
`
`30
`
`\\\C5 ‘- EOQDGIDOIB — 56166 \rl
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 5
`
`Petitioner Microsoft Corporation - EX. 1002, p. 5
`
`

`

`
`
`upwards of six or more cells, all having to interact
`
`to compute results, would not be uncommon.
`
`Consequently,
`
`intermediate results must be passed
`
`around the system in order to complete the computation
`
`of the total problem. This, of necessity,
`
`involves
`
`numerous other chips and busses that run at much
`
`slower speeds than the microprocessor thus resulting
`
`in system performance often many orders of magnitude
`
`lower than the raw computation time.
`
`On the other hand,
`
`in the use of an adaptive
`
`processor—based system, since ten to one thousand
`
`times more computations can be performed within a
`
`single chip, any boundary data that is shared between
`
`these functional units need never leave a single
`
`integrated circuit chip. Therefore, data moving
`
`around the system, and its impact on reducing overall
`
`system performance, can also be reduced by two or
`
`three orders of magnitude. This will allow both
`
`significant improvements in performance in certain
`
`applications as well as enabling certain applications
`
`to be performed in a practical timeframe that could
`
`not previously be accomplished.
`
`Particularly disclosed herein is a method for
`
`data processing in a reconfigurable computing system
`
`comprising a plurality of functional units.
`
`The
`
`method comprises: defining a calculation for the
`
`reconfigurable computing system;
`
`instantiating at
`
`least two of the functional units to perform the
`
`calculation; utilizing a first of the functional units
`
`to operate upon a subsequent data dimension of the
`
`calculation and substantially Concurrently utilizing a
`second of the functional units to operate upon a
`-
`previous data dimension of the calculation.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`\\\CS - BOdO-fllDOlB - 56166 It!
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 6
`
`Petitioner Microsoft Corporation - EX. 1002, p. 6
`
`

`

`
`
`Further disclosed herein is a method for data
`
`processing in a reconfigurable computing system
`
`comprising a plurality of functional units. The method
`
`comprises: defining a first systolic wall comprising
`
`rows of cells forming a subset of the plurality of
`
`functional units; computing a value at each of the
`
`cells in at
`
`least a first row of the first systolic
`
`wall; communicating the values between cells in the
`
`first row of the cells to produce updated values;
`
`10
`
`communicating the updated values to a second row of
`
`the first systolic wall; and substantially
`
`concurrently providing the updated values to a first
`
`row of a second systolic wall of
`
`rows of cells in the
`
`subset of the plurality of functional units.
`Also disclosed herein is a method for data
`
`15
`
`processing in a reconfigurable processing system which
`
`includes setting up a systolic processing form
`
`employing a speculative processing strategy.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`2O
`
`25
`
`The aforementioned and other features and objects
`
`of the present invention and the manner of attaining
`
`them will become more apparent and the invention
`
`itself will be best understood by reference to the
`
`following de5cription of a preferred embodiment
`
`taken
`
`in conjunction with the accompanying drawings,
`wherein:
`
`Fig.
`
`l is a simplified functional block diagram
`
`of typical clustered intereprocessor communications
`
`path in a conventional multieprocessor computing
`
`30
`
`system;
`
`.Fig.
`
`2 is a functional block diagram of an
`
`adaptive processor communications path illustrating
`
`the many functional units {“FU”)
`
`interconnected by
`
`\\\(.’5 - 5040430015 - 56166 v'l
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 7
`
`Petitioner Microsoft Corporation - EX. 1002, p. 7
`
`

`

`:
`
`EFF: f"!-
`
`reconfigurable routing resources within the adaptive
`processor chip;
`
`Fig.
`
`3A is a graph of the actual performance
`
`improvement versus the number of processors utilized
`
`and illustrating the deviation from perfect
`
`scalability of a particular application utilizing a
`
`conventional multi—processor computing system such as
`
`that illustrated in Fig. 1;
`
`Fig.
`
`3B is a corresponding graph of the actual
`
`performance improvement versus the number of
`
`processors utilized and illustrating the performance
`
`improvement over a conventional multi—processor
`
`computing system utilizing an adaptive processor—based
`
`computing system such as that illustrated in Fig. 2:
`
`Fig.
`
`4A is a simplified logic flowchart
`
`illustrating a conventional sequential processing
`
`operation in which nested Loops A and B are
`
`alternately active on different phases of the process;
`
`Fig. 48 is a comparative, simplified logic
`
`flowchart illustrating multi—dimensional processing in
`
`accordance with the technique of the present
`
`invention
`
`wherein multiple dimensions of data are processed by
`
`both Loops A and B such that the computing system
`
`logic is operative on every clock cycle;
`
`Fig.
`
`5A is illustrative of a general process for
`
`performing a representative multi—dimensional pipeline
`
`operation in the form of a seismic migration imaging
`
`function utilizing the parallelism available in the
`
`utilization of the adaptive processing techniques of
`
`the present
`
`invention;
`
`Fig. SE is a follow—on illustration of the
`
`computation phases emplOyed in implementing the
`
`exemplary seismic migration imaging function of the
`
`10
`
`15
`
`20
`
`25
`
`30
`
`preceding figure;
`
`\\\E5 . BuflndanIE - 56166 v]
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 8
`
`Petitioner Microsoft Corporation - EX. 1002, p. 8
`
`

`

`
`
`Fig.
`6A is a simplified logic flowchart for a
`particular seismic migration imaging application
`
`illustrative of the parallelism provided in the use of
`
`an adaptive processor—based computing system;
`
`Fig 63 illustrates the computational process
`
`which may be employed by a microprocessor in the
`
`execution of the seismic imaging application of the
`
`preceding figure;
`
`10
`
`15
`
`20
`
`Fig.
`
`6C illustrates the first step in the
`
`computational process which may be employed by an
`
`adaptive processor in the execution of the seismic
`
`imaging application of Fig.
`
`6A in which a first shot
`
`(81)
`
`is started;
`
`Fig.
`
`6D illustrates the second step in the same
`
`Computational process for the execution of the seismic
`
`imaging application of Fig.
`
`6A in which a second shot
`
`(S2)
`
`is started;
`
`Fig.
`
`6E illustrates the third step in the same
`
`computational process for the execution of the seismic
`
`imaging application of Fig.
`
`6A in which the operation
`
`on the first and second shots is continued through
`
`compute;
`
`Fig.
`
`6F illustrates the fourth step in the same
`
`computational process showing the subsequent operation
`
`25
`
`on shots 81 and 52;
`
`Fig.
`
`6G illustrates the fifth step in the same
`
`computational process followed by the continued
`
`downward prepagation of shots Si and 32 over all of
`
`the depth slices;
`
`30
`
`Fig.
`
`7A illustrates a process for performing a
`
`representative systolic wavefront operation in the
`
`form of a reservoir simulation function also utilizing
`
`the parallelism available in the utilization of the
`
`\\\C5 - 30110420018 - 56166 \fl
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 9
`
`Petitioner Microsoft Corporation - EX. 1002, p. 9
`
`

`

`adaptive processing techniques of the present
`invention;
`
`'Fig. 73 illustrates the general computation of
`
`fluid flow properties in the reservoir simulation of
`
`the preceding figure which are communicated to
`
`neighboring cells:
`
`Fig.
`
`7C illustrates the creation of a systolic
`
`wall of computation at Time Set
`
`1 which has been
`
`started for a vertical wall of cells and in which
`
`communication of values between adjacent rows in the
`
`vertical wall can OCCur without storing values to
`
`memory:
`
`Fig.
`
`7D is a follow on illustration of the
`
`creation of a systolic wall of computation at Time Set
`
`1 and Time Set 2 showing how a second vertical wall of
`
`cells is started after the computation for cells in
`
`the corresponding row of the first wall has been
`
`completed:
`
`Fig.
`
`8A illustrates yet another process for
`
`performing a representative systolic wavefront
`
`operation in the form of the systolic processing of
`
`bioinformatics also utilizing the parallelism
`
`available in the utilization of the adaptive
`
`processing techniques of the present
`
`invention;
`
`Fig. 88 illustrates a systolic wavefront
`
`processing operation which further incorporates a
`
`speculative processing strategy based upon an
`
`evaluation of the rate of Change of X3;
`
`Fig.
`
`8C is a further illustration of the systolic
`
`wavefront processing operation of the preceding figure
`
`incorporating speculative processing:
`
`Fig.
`
`9A illustrates still another process for
`
`performing a representative systolic wavefront
`
`operation in the form of structure codes calculating
`
`10
`
`15
`
`20
`
`25
`
`30
`
`\\\CS - aoqouoon - 56165 v1
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 10
`
`Petitioner Microsoft Corporation - EX. 1002, p. 10
`
`

`

`
`
`-'ME. (“1'Pusan—aw" as“ ”in ,..1|...-.‘..§\w «ma-2mg. .9- =9.1_ ..
`
`
`
`
`
`
`
`
`
`:7-9.“w «3' 17343“:.4 .1 '44- MJ“ ‘W‘m
`
`
`
`polynomials at grid intersections, again utilizing the
`
`parallelism available in the utilization of the
`
`adaptive processing techniques of the present
`
`invention;
`
`Fig. QB illustrates the computation start for a
`
`vertical wall of grid points at Time Set 1 for a
`
`polynomial evaluation performed on grid intersections
`wherein calculations between rows are done in a
`
`stochastic fashion using values from a previous row;
`and
`
`10
`
`9C is a further illustration of the
`Fig.
`polynomial evaluation performed on grid intersections
`
`of the preceding figure wherein a second wall is
`
`started after the cells in the corresponding row of
`
`the first wall have been completed.
`
`DESCRIPTION OF A REPRESENTATIVE EMBODIMENT
`
`This application incorporates by reference the
`
`entire disclosure of Caliga, D. et a1- “Delivering
`Acceleration: “The Potential for Increased HPC
`
`Application Performance Using Reconfigurable Logic”,
`8C2001, November 2001Ir ACM 1—58113—293—X/01/0011.
`
`With reference now to Fig.
`
`l,
`
`a simplified
`
`functional block diagram of typical clustered inter—
`
`processor communications path in a conventional multi-
`
`processor computing system 100 is shown.
`
`The computer
`
`system comprises a number of memory and input/output
`
`(“I/O” controller integrated circuits {“ICs”) 1020
`
`15
`
`20
`
`25
`
`through 102”,
`
`(e.g. “North Bridge”) 102 such as the
`
`P4X333/P4X400 devices available from VIA Technologies,
`
`30
`
`Inc.;
`
`the M1647 device available from Acer Labs,
`
`Inc.
`
`and the 824430X device available from Intel
`
`Corporation.
`
`The North Bridge IC 102 is coupled by
`
`means of a Front Side Bus
`
`(“F8B”)
`
`to one or more
`
`i\\CS — nomamma — $6166 v1
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 11
`
`Petitioner Microsoft Corporation - EX.1002, p. 11
`
`

`

`
`
`10
`
`microprocessors 10400 though 10403 and lOdNO through
`
`104“; such as one of the Pentium® series of processors
`
`also available frOm Intel Corporation.
`
`The North Bridge ICs 1020 through 102" are coupled
`
`to respective blocks of memory 1060 through 106“ as
`
`well as to a corresponding I/O bridge element 1080
`
`through 108“.
`
`A network interface card (“NIC”)
`
`llOQ
`
`through 210“ couples the I/O bus of the respective I/O
`
`bridge 1080 through lOBN to a cluster bus coupled to a
`
`common clustering hub (or Ethernet Switch) 112.
`
`Since typically a maximum of four microprocessors
`
`104, each with two or four functional units, can
`
`reside on a Single Front Side Bus, any communication
`
`to more than four must pass over the Front Side Bus,
`
`inter-bridge bus,
`
`input/output
`
`(“I/O”) bus, cluster
`
`interconnect {e.g. an Ethernet clustering hub 112} and
`
`then back again to the receiving processor 104.
`
`The
`
`1/0 bus is typically an order of magnitude lower in
`
`bandwidth than the Front Side Bus, which means that
`
`any processing involving more than the four processors
`
`104 will be significantly throttled by the loose
`
`coupling caused by the interconnect. All of this is
`
`eliminated with a reconfigurable processor having
`
`hundreds or thousands of functional units per
`
`10
`
`15
`
`20
`
`25
`
`processor-
`
`With reference additionally now to Fig. 2, a
`
`functional block diagram of an adaptive processor 200
`
`communications path for implementing the technique of
`
`the present
`
`invention is shown.
`
`The adaptive
`
`30
`
`processor 200 includes an adaptive processor chip 202
`
`incorporates a
`
`large number of functional units (“F0")
`
`204 interconnected by reconfigurable routing
`
`resources.
`
`The adaptive processor chip 202 is coupled
`
`to a memory element 206 as well as an interconnect 208
`
`\\\CS - 80404.!0018 - 56166 v1
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 12
`
`Petitioner Microsoft Corporation - EX. 1002, p. 12
`
`

`

`11
`
`and a number of additional adaptive processor chips
`210.
`
`As shown, each adaptive processor chip 202 can
`contain thousands of functional units 204 dedicated to
`
`the particular problem at hand.
`
`Interconnect between
`
`these functional units is created by reconfigurable
`
`routing resources inside each chip 202- As a result,
`
`the functional units 204 can share or exchange data at
`
`much higher data rates and lower latencies than a
`
`standard microprocessor 104
`
`(Fig. 1).
`
`In addition,
`
`the adaptive processor chips 202 can connect directly
`
`to the inter—processor interconnect 208 and do not
`
`require the data to be passed through multiple chips
`
`in a chipset
`
`in order to communicate. This is because
`
`the adaptive processor can implement whatever kind of
`
`interface is needed to accomplish this connection.
`
`with reference additionally now to Fig. 3A,
`
`a
`
`graph of the actual performance improvement versus the
`
`number of processors utilized in a conventional multi—
`
`processor computing system 100 (Fig.
`
`l}
`
`is shown.
`
`In
`
`this figure,
`
`the deviation from perfect scalability of
`
`a particular application is illustrated for such a
`
`system.
`
`With reference additionally now to Fig.
`
`3B,
`
`a
`
`corresponding graph of
`
`the actual performance
`
`imprOVement versus the number of processors utilized
`
`in an adaptive processor—based computing system 200
`
`(Fig. 2)
`
`is shown.
`
`In this figure,
`
`the performance
`
`improvement provided with an adaptive processor-based
`
`computing system 200 over that of a conventional
`
`multi—processor computing system 100 is illustrated.
`
`With reference additionally now to Fig. 4A,
`
`a
`
`simplified logic flowchart is provided illustrating a
`
`conventional sequential processing operation 400 in
`
`10
`
`15
`
`20
`
`25
`
`30
`
`\\\CS — 80404.!0018 - 56166 v1
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 13
`
`Petitioner Microsoft Corporation - EX. 1002, p. 13
`
`

`

`
`
`12
`
`which nested Loops A (first loop 402) and B
`
`(second
`
`loop 404) are alternately active on different phases
`
`of the process.
`
`As shown,
`
`the standard implementation of
`
`applications that have a set of nested loops 402,404
`
`is to complete the processing of the first_loop 402
`
`before proceeding to the second loop 404.
`
`The problem
`
`inherent
`
`in this approach, particularly when utilized
`
`in conjunction with field programmable gate arrays
`
`(“FPGAs”)
`
`is that all of the logic that has been
`
`instantiated is not being completely utilized.
`
`With reference additionally now to Fig. 4B,
`
`a
`
`comparative, simplified logic flowchart is shown
`
`illustrating a multi—dimensional process 410 in
`
`accordance with the technique of the present
`
`invention.
`
`The multiedimensional process 410 is
`
`effectuated such that multiple dimensions of data are
`
`processed by both Loops A (first loop 412} and B
`
`(second loop 414}
`
`such that the computing system logic
`
`is operative on every clock cycle.
`
`In contrast to the sequential processing
`
`operation 400 (Fig.
`
`4A)
`
`the solution to the problem of
`
`most effectively utiliZing available resources is to
`
`have an application evaluate a problem in a data flow
`
`sense. That is, it will “pass” a subsequent dimension
`
`of a given problem through the first loop 412 of logic
`
`concurrently with the previous dimension of data being
`
`processed through the second loop 414.
`
`In practice,
`
`a
`
`“dimension” of data can be: multiple vectors of a
`
`problem, multiple planes of a problem, multiple time
`
`steps in a problem and so forth-
`
`with reference additionally now to Fig.
`
`5A,
`
`a
`
`general process for performing a representative multi—
`
`dimensional pipeline operation is shown in the form of
`
`10
`
`15
`
`2O
`
`25
`
`3O
`
`\\\CS - BOHDAKDDKH -- 56166 v1
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 14
`
`Petitioner Microsoft Corporation - EX. 1002, p. 14
`
`

`

`13
`
`a seismic migration imaging function 500.
`
`The process
`
`500 can be adapted to utilize the parallelism
`
`available in the utilization of the adaptive
`
`processing techniques of the present
`
`invention in the
`
`form of a multi—adaptive processor
`
`(MAPW,
`
`a trademark
`
`of SRC Computers,
`
`Inc.r assignee of the present
`
`invention) STEPBd routine 502.
`
`The MAP STEP3d routine
`
`502 is operation to utilize velocity data 504, source
`
`data 506 and receiver data 508 to produce a resultant
`
`image 510 as will be more fully described hereinafter.
`
`With reference additionally now to Fig. SB,
`
`the
`
`MAP STEPBd routine 502 of the preceding figure is
`
`shown in the various computational phases of: MAPTRI_x
`
`520, MAPTRI_y 522, MAPTRI_d+ 524 and MAPTRl#d— 526.
`
`With reference additionally now to Fig.
`
`6A,
`
`a
`
`simplified logic flowchart for a particular seismic
`
`migration imaging application 600 is shown.
`
`The
`
`seismic migration imaging application 600 is
`
`illustrative of the parallelism provided in the use of
`
`an adaptive processor—based computing system 200 such
`
`as that shown in Fig. 2.
`
`The representative
`
`application 600 demonstrates a nested loop parallelism
`
`in the tri—diagonal solver and the same logic can be
`
`implemented for the multiple tri-diagonal solvers in
`
`the x, y, d+ and d— directions.
`
`The computational
`
`phases of: MAPTRI_x 520, MAPTRI_y 522, MAPTRI#d+ 524
`
`and MAPTRI_d— 526 are again illustrated.
`With reference additionally now to Fig. SE,
`
`a
`
`computational process 610 is shown which may be
`
`employed by a microprocessor {“mP")
`
`in the execution
`
`of the seismic imaging application 600 of the
`
`The process 610 includes the step
`preceding figure.
`612 of reading the source field [SIZOJ] and receiver
`
`field [RtZOJ] as well as the velocity field [VtZo)] at
`
`10
`
`15
`
`20
`
`25
`
`30
`
`\\\CS — 80404;?0018 - 56166 It!
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 15
`
`Petitioner Microsoft Corporation - EX. 1002, p. 15
`
`

`

`-.J-L 15441 L2;— \L'.Ji; mil .fhl n»!— 145.1!
`
`II nil-y ‘lln—l‘ ”LLB-“AL. lib-I“ M
`
`14
`
`step 614. At step 616 values are computed for
`
`S(ZM},R(ZM) which step is followed by the phases
`
`MAPTRI_X 520 and MAPTRI_y 522. At step 618,
`
`the image
`
`of Z1” is computed. This is followed by the phases
`
`MAPTRI_d+ 524 and MAPTRI_d— 526 to produce the
`
`resultant image Z at step 620.
`
`The process 610 loops
`
`over the depth slices as indicated by reference number
`
`622 and loops over the shots as indicated by reference
`number 624.
`
`With reference additionally now to Fig. 6C,
`
`the
`
`first step in a computational process 650 in
`
`accordance with the technique of the present
`
`invention
`
`is shown in which a first shot
`
`(Si)
`
`is started.
`
`The
`
`process 650 may be employed by an adaptive processor
`
`(e.g.
`
`a MA?“ adaptive processor} as disclosed herein
`
`in the execution of the seismic imaging application
`
`600 of Fig. 6A.
`
`As
`
`indicated by the shaded block,
`
`the
`
`phase MAPTRI_x 520 is active.
`
`With reference additionally now to Fig.
`
`6D,
`
`the
`
`second step in the computational process 650 is shown
`
`at a point at which a second shot
`
`(82)
`
`is started.
`
`Again, as indicated by the shaded blocks,
`
`the phase
`
`the phase MAPTRIHy 522
`MAPTRI_x 520 is active for 82,
`is active for 81 and image 31m has been produced at
`
`step 618. As shown, adaptive processors in accordance
`
`with the disclosure of the present
`
`invention support
`
`computation pipelining in multiple dimensions and the
`
`parallelism in Z and shots is shown at step 612.
`
`With reference additionally now to Fig. 6E,
`
`the
`
`third step in the computational process 650 is shown
`
`in which the operation on the first and second shots
`
`is continued through compute. As
`
`indicated by the
`
`shaded blocks,
`
`the phase MAPTRI_d+ 524 is active for
`
`10
`
`15
`
`20
`
`25
`
`30
`
`\\\CS - 3010410018 - 56166 v]
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 16
`
`Petitioner Microsoft Corporation - EX. 1002, p. 16
`
`

`

`15
`
`51,
`
`the phase MAPTRI_y 522 is active for S2 and image
`
`31m has been produced at step 618.
`
`With reference additionally now to Fig. 6?,
`
`the
`
`fourth step in the computational process 650 is shown
`
`illustrating the subsequent operation on shots 81 and
`
`52.
`
`The phase MAPTRI_d+ 524 is active for S2,
`
`the
`
`phase MAPTRI_d— 526 is active for 81 and image Z has
`
`been produced at step 620.
`
`With reference additionally now to Fig.
`
`6G,
`
`the
`
`fifth step in the computational process 650 is shown
`
`as followed by the continued downward propagation of
`
`shots 81 and 52 over all of the depth slices.
`
`The
`
`phase MAPTRI_x 520 is active for 81,
`
`the phase
`
`MAPTRI_d— 526 is active for 82 and image Z has been
`
`produced at step 620.
`
`with reference additionally now to Fig. 7A,
`
`a
`
`process 700 for performing a representative systolic
`
`wavefront operation in the form of a reservoir
`
`simulation function is shown which utilizes the
`
`parallelism available in the adaptive processing
`
`techniques of the present
`
`inVention.
`
`The process 700
`
`includes a “k” loop 702, “j" loop 704 and “1" loop 706
`as shown.
`
`With reference additionally now to Fig. 7B,
`
`the
`
`general computation of fluid flow properties in the
`
`reservoir simulation process 700 of the preceding
`
`figure are illustrated as values are communicated
`
`between a group of neighboring cells 710.
`
`The group
`
`of neighboring cells 710 comprises,
`
`in the simplified
`
`illustration shown, first,
`
`second and third walls of
`
`cells 712, 714 and 716 respectively.
`
`Each of the
`
`walls of cells includes a corresponding number of
`
`first, second,
`
`third and fourth rows 718, 720, 722 and
`
`10
`
`15
`
`2O
`
`25
`
`3O
`
`724 respectively.
`
`\\\<2:5 — 8040120018 - 56166 n].
`
`Petitioner Microsoft Corporation - Ex. 1002, p. 17
`
`Petitioner Microsoft Corporation - EX. 1002, p. 17
`
`

`

`16
`
`As shown,
`
`the computation of fluid flow
`
`properties are communicated to neighboring cells 710
`
`and,
`
`importantly,
`
`this computation can be scheduled to
`
`eliminate the need for data storage.
`
`In accordance
`
`with the technique of the present invention,
`
`a set of
`
`cells can reside in an adaptive processor and the
`
`pipeline of computation can extend across multiple
`adaptive processors. Communication overhead between
`
`multiple adaptive processors may be advantageously
`
`minimized through the use of MA Fm adaptive processor
`
`chain ports as disclosed in 0.8- Patent No. 6,339,819
`
`issued on January 15, 2002 for: “Multiprocessor With
`
`Each Processor Element Accessing O

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket