Chapter
`AdvancedSystolicDesign
`DominiqueLavenier
`PatriceQuinton
`SanjayRajopadhye
`IRISA,CampusdeBeaulieu
` RennesCedex
`France
`email:flavenier,quinton,rajopadhg@irisa.fr
`AbstractSystolicarraysarelocallyconnectedparallelarchitectures,whosestructure
`iswell-suitedtotheimplementationofmanyalgorithms,inscienti(cid:12)ccomputation,
`signalandimageprocessing,biologicaldataanalysis,etc.Thenatureofsystolic
`algorithmsmakesitpossibletosynthesizearchitecturessupportingthem,using
`correctnesspreservingtransformations,inatheoreticalframeworkthathasadeep
`relationshipwithloopparallelizationtechniques.Thisopensthewaytonewvery
`high-levelarchitecturesynthesistechniques,whichwillbeamajorstepinmastering
`theuseofICtechnologiesinthefuture.Thischapterhastwoparts.First,we
`presentthecurrentstateoftheartinsystolicsynthesistechniques,andthesecond
`partsurveysvariouswaysofimplementingsystolicalgorithmsandarchitectures
`usingprogrammablearchitectures,fpgas,ordedicatedarchitectures.
`. Introduction
`ThetermsystolicarrayswascoinedbyKungandLeisersonin tode-
`scribeapplicationspeci(cid:12)cvlsiarchitecturesthatwereregular,locallyconnected
`andmassivelyparallelwithsimpleprocessingelements(PEs).Theideaofusing
`suchregularcircuitswasevenpresentinvonNeuman'scellularautomatainthe
`(cid:12)fties,Hennie'siterativelogicarraysinthesixties,andalsoinspecializedarithmetic
`circuits(Lyon'sbit-serialmultiplier[ ]isclearlyalinearsystolicarray).However,
`theemergenceofvlsitechnologyinthelateseventiesandearlyeightiesmadethe
`timeripeforintroducingsucharchitecturesinordertohighlightthecharacteristics
`appropriatetothetechnology.
`Systolicarraysimmediatelycaughton,sincetheyinvolvedafascinatingin-
`terplaybetweenalgorithmandarchitecturedesign.Whenresearchersstartedin-
`
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2040, p. 1
`
`
`AdvancedSystolicDesign
`DominiqueLavenier
`PatriceQuinton
`SanjayRajopadhye
`IRISA,CampusdeBeaulieu
` RennesCedex
`France
`email:flavenier,quinton,rajopadhg@irisa.fr
`AbstractSystolicarraysarelocallyconnectedparallelarchitectures,whosestructure
`iswell-suitedtotheimplementationofmanyalgorithms,inscienti(cid:12)ccomputation,
`signalandimageprocessing,biologicaldataanalysis,etc.Thenatureofsystolic
`algorithmsmakesitpossibletosynthesizearchitecturessupportingthem,using
`correctnesspreservingtransformations,inatheoreticalframeworkthathasadeep
`relationshipwithloopparallelizationtechniques.Thisopensthewaytonewvery
`high-levelarchitecturesynthesistechniques,whichwillbeamajorstepinmastering
`theuseofICtechnologiesinthefuture.Thischapterhastwoparts.First,we
`presentthecurrentstateoftheartinsystolicsynthesistechniques,andthesecond
`partsurveysvariouswaysofimplementingsystolicalgorithmsandarchitectures
`usingprogrammablearchitectures,fpgas,ordedicatedarchitectures.
`. Introduction
`ThetermsystolicarrayswascoinedbyKungandLeisersonin tode-
`scribeapplicationspeci(cid:12)cvlsiarchitecturesthatwereregular,locallyconnected
`andmassivelyparallelwithsimpleprocessingelements(PEs).Theideaofusing
`suchregularcircuitswasevenpresentinvonNeuman'scellularautomatainthe
`(cid:12)fties,Hennie'siterativelogicarraysinthesixties,andalsoinspecializedarithmetic
`circuits(Lyon'sbit-serialmultiplier[ ]isclearlyalinearsystolicarray).However,
`theemergenceofvlsitechnologyinthelateseventiesandearlyeightiesmadethe
`timeripeforintroducingsucharchitecturesinordertohighlightthecharacteristics
`appropriatetothetechnology.
`Systolicarraysimmediatelycaughton,sincetheyinvolvedafascinatingin-
`terplaybetweenalgorithmandarchitecturedesign.Whenresearchersstartedin-
`
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2040, p. 1
`
`
`
`Chapter
`
`vestigatingautomaticsynthesis,athirdstreamjoinedthiscon(cid:13)uence,namelythe
`analysis,manipulationandtransformationofprograms.Someoftheearlydesigns
`areclassics.TheGuibasetal.arrayforoptimalstringparenthesization[]isone
`ofouralltimefavorites.
`Theearlyeightieswasaperiodofintenseactivityinthisarea.Alargenumber
`of\paperdesigns"wereproposedforawidevarietyofalgorithmsfromlinearalge-
`bra,graphtheory,searching,sorting,etc.Therewasalsomuchworkonautomatic
`synthesismethods,usingdependencyanalysis,space-timetransformationsofinner
`loops,andalsootherformalismssuchasrecurrenceequations.
`Then,thetechnologicalevolutionsinthelateeightiesseemedtoinvalidatethe
`assumptionsofthesystolicmodel,namelythat(i)locality,regularityandsimplicity
`wereprimordialforvlsiand(ii)anelementarycomputationcouldbeperformed
`inthesametimethatittooktoperformanelementarycommunication.The(cid:12)rst
`onemeantthatwiththeimprovedcadtoolsandincreasinglevelsofintegration,
`circuitdesignerswerenotobligedtoalwaysfollowthatdictatesofthesystolicmodel
`(whichwasitselfapplicabletoonlyapart|albeit,acomputationallysigni(cid:12)cant
`one|ofthecompleteapplication).Thesecondonemeantthatthesystolicspace-
`timemappingmethodswerenotdirectlyapplicableforgeneralpurposeparallel
`machineswherethecommunicationlatencywasanorderofmagnitudehigherthan
`computationspeed.
`Anumberofrecentdevelopmentsnowleadustobelievethatitistimeto
`revivethe(cid:12)eld.Firstofall,technologyhasevolved.Withthegrowinglevelsof
`integration,itisfeasibletoactuallyimplementanumberoftheold\paperde-
`signs",particularlyiftheelementaryoperationsarenot(cid:13)oatingpoint,butcome
`fromasimpleralgebraicstructure(suchassemi-ringswithadditionsandcompar-
`isons,etc.)Second,thecircuitsoftodayaremuchmorecomplex,andweneedto
`againquestiontheargumentsagainsttheregularity,localityandsimplicityofsys-
`tolicarrays.Cananirregularcircuitsachievebetterperformance?Willtheydoso
`reliably?Whatwillbethedesigntime?Athirdandveryimportantreasonisthe
`emergenceoffpgas,programmablelogicandrecon(cid:12)gurablecomputing.Thebasic
`technologyunderlyingfpgasisregularity,localityandsimplicity.Itisthereforenot
`surprisingthatsomeoftheimpressivesuccessesinthisdomainaresystolicdesigns.
`Fourth,thereisthegrowingunderstandingthatthetechniquesofsystolicsynthesis
`haveaclosebearingonautomaticloopparallelizationforgeneralpurposeparallel
`machines.Thesetechniquescanthusbeseenasveryhighlevelsynthesismethods,
`applicableforsoftwareaswellashardware,andthusformafoundationforcode-
`sign,foracertainclassofproblems.Coupledwiththefactthatrecentadvancesin
`integrationpermitoneormoreprogrammableprocessorcorestobeimplemented
`\on-chip",thismakessuchtechniqueswellsuitedforprovablycorrectcodesign.
`Finally,webelievethattheincreasingcomplexityofvlsisystemdesignnaturally
`leadstowardsformaldesignmethodsinvolvingcorrectness-preservingtransforma-
`tionofhigh-levelspeci(cid:12)cations.
`Thischapterpresentstwoaspectsofadvancedsystolicdesign,namelyad-
`vancedmethodsforsystolicarraydesign(Secs.{..),butalsodesigntechniques
`andcasestudiesofadvanced,stateoftheartsystolicarrays(Sec. ).
`Inthe(cid:12)rstpartweuseaformalismcalledsystemsofrecurrenceequations
`(sres)todescribeboththeinitialspeci(cid:12)cationaswellasthe(cid:12)nalarray.Theentire
`processofsystolicdesignisviewedastheapplicationofaseriesoftransformations
`
`Patent Owner Saint Regis Mohawk Tribe
`Ex. 2040, p. 2
`
`
`
`
`EmergingTechnologies
`totheinitialspeci(cid:12)cation,untiloneobtainsadescriptionofthe(cid:12)nalarray.We
`desirethatallthedetails,includingcontrol,I/O,interfaceetc.bespeci(cid:12)edinthe
`sameformalism,andthatasimpletranslationstepshouldyieldadescriptionthat
`issuitableforaconventionalcadtool(sayvhdl).Foreachtransformationweare
`concernedwithtwoaspects:(i)themanipulationofthesretoobtainaprovably
`equivalentsre((cid:18)ala`correctness-preserving'programtransformations),and(ii)the
`choiceofthetransformation,usuallywithaviewtooptimizingcertaincostcriteria.
`InSection.wepresentthebasicnotationsofrecurrenceequationsandtheir
`domains.InSection..wedescribethefoundationsofthe(cid:12)rstaspect,namelythe
`formalmanipulationofrecurrenceequations.Insections.. to..,weexplain
`howthesevarioustransformationscanbeusedtotransformaspeci(cid:12)cationinto
`anabstractarchitecture:section.. detailsschedulingtechniques,section..
`dealswithallocationofcomputationsonprocessors,and(cid:12)nally,section..is
`concernedwithlocalizationandserializationtransformations.
`Inthesecondpart,weillustratefourdi(cid:11)erentwaystoimplementsystolic
`arrays:general-purposeprogrammablearchitectures(Section. . ),application
`orientedprogrammablearchitectures(Section. .),recon(cid:12)gurablearchitectures
`(Section. . ),andspecial-purposearchitectures(Section. .),andweillustrate
`eachonewithonecasestudy.Althoughsomeofthedesignspresentedarenot
`new,aclosestudyoftheirdesignisinstructive.Asthetechnologyofintegrated
`circuitsisstillinfullmotion,webelievethatlessonshavetobelearnedfromthese
`examples,inordertotakethebestadvantageoffuturearchitecturalopportunities.
`.SystolicDesignbyRecurrenceTransformations
`SystemsofRecurrenceEquations(sres)playanimportantpartinthedesign
`process.Theyareusefulas(i)behavioraldescriptionsofthe(cid:12)nalarrays,andalso
`(ii)highlevelspeci(cid:12)cationsoftheinitialalgorithm.Forexample,considerthe
`systemofequationsgivenbelow.
`X(t;p)=<:fp= ;(cid:0)n+ (cid:20)t<