ORIGINATE DATE: 24 September, 2001
EDIT DATE: 4 September, 2015
DOCUMENT-REV. NUM.: GEN-CXXXXX-REVA
PAGE: 1 of 54

Author: Laurent Lefebvre

Issue To:
Copy No:

`R400 Sequencer Specification
`
`SQ
`
`
`
AUTOMATICALLY UPDATED FIELDS:

Version 2.01

Overview: This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the
required capabilities and expected uses of the block. It also describes the block interfaces, internal sub-blocks, and
provides internal state diagrams.

Document Location:
C:\perforce\r400\doc_lib\design\blocks\sq\R400_Sequencer.doc
Current Intranet Search Title:
R400 Sequencer Specification
`
`
APPROVALS

Name/Dept
Signature/Date

Remarks:
`
`
`
`
`
`
`
`
` THIS DOCUMENT CONTAINS CONFIDENTIAL INFORMATION THAT COULD BE
`
`SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES
`INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.
`
`
`
“Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished
work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this
unpublished work. The copyright notice is not an admission that publication has occurred. This work contains
confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or
transmitted in any form or by any means without the prior written permission of ATI Technologies Inc.”
`
Exhibit 2029.doc R400_Sequencer.doc 73711 Bytes *** © ATI Confidential. Reference Copyright Notice on Cover Page © ***

ATI 2029
LG v. ATI
`IPR2015-00325
`
`AMD1044_0257395
`
`ATI Ex. 2108
`IPR2023-00922
`Page 1 of 316
`

`

`
`
Table Of Contents

1. OVERVIEW
  1.1 Top Level Block Diagram
  1.2 Data Flow graph (SP)
  1.3 Control Graph
2. INTERPOLATED DATA BUS
3. INSTRUCTION STORE
4. SEQUENCER INSTRUCTIONS
5. CONSTANT STORES
  5.1 Memory organizations
  5.2 Management of the Control Flow Constants
  5.3 Management of the re-mapping tables
    5.3.1 R400 Constant management
    5.3.2 Proposal for R400LE constant management
    5.3.3 Dirty bits
    5.3.4 Free List Block
    5.3.5 De-allocate Block
    5.3.6 Operation of Incremental model
  5.4 Constant Store Indexing
  5.5 Real Time Commands
  5.6 Constant Waterfalling
6. LOOPING AND BRANCHES
  6.1 The controlling state
  6.2 The Control Flow Program
    6.2.1 Control flow instructions table
  6.3 Implementation
  6.4 Data dependant predicate instructions
  6.5 HW Detection of PV/PS
  6.6 Register file indexing
    Method 1: Debugging registers
    Method 2: Exporting the values in the GPRs
7. PIXEL KILL MASK
8. MULTIPASS VERTEX SHADERS (HOS)
9. REGISTER FILE ALLOCATION
10. FETCH ARBITRATION
11. ALU ARBITRATION
12. HANDLING STALLS
13. CONTENT OF THE RESERVATION STATION FIFOS
14. THE OUTPUT FILE
15. IJ FORMAT
17. THE PARAMETER CACHE
  17.1 Export restrictions
  17.2 Arbitration restrictions
18. EXPORT TYPES
  18.1 Vertex Shading
  18.2 Pixel Shading
19. SPECIAL INTERPOLATION MODES
  19.1 Real time commands
  19.2 Sprites/ XY screen coordinates/ FB information
  19.3 Auto generated counters
    19.3.1 Vertex shaders
    19.3.2 Pixel shaders
20. STATE MANAGEMENT
  20.1 Parameter cache synchronization
21. XY ADDRESS IMPORTS
  21.1 Vertex indexes imports
22. REGISTERS
  22.1 Control
  22.2 Context
23. DEBUG REGISTERS
  23.1 Context
  23.2 Control
24. INTERFACES
  24.1 External Interfaces
  24.2 SC to SP Interfaces
    24.2.1 SC_SP#
    24.2.2 SC_SQ
    24.2.3 SQ to SX: Interpolator bus
    24.2.4 SQ to SP: Staging Register Data
    24.2.5 VGT to SQ: Vertex interface
    24.2.6 SQ to SX: Control bus
    24.2.7 SX to SQ: Output file control
    24.2.8 SQ to TP: Control bus
    24.2.9 TP to SQ: Texture stall
    24.2.10 SQ to SP: Texture stall
    24.2.11 SQ to SP: GPR and auto counter
    24.2.12 SQ to SPx: Instructions
    24.2.13 SP to SQ: Constant address load/ Predicate Set
    24.2.14 SQ to SPx: constant broadcast
`
    24.2.15 SP0 to SQ: Kill vector load
    24.2.16 SQ to CP: RBBM bus
    24.2.17 CP to SQ: RBBM bus
    24.2.18 SQ to CP: State report
`
Revision Changes:

Rev 0.1 (Laurent Lefebvre), Date: May 7, 2001
    First draft.
Rev 0.2 (Laurent Lefebvre), Date: July 9, 2001
    Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section.
Rev 0.3 (Laurent Lefebvre), Date: August 6, 2001
    Reviewed the Sequencer spec after the meeting on August 3, 2001.
Rev 0.4 (Laurent Lefebvre), Date: August 24, 2001
    Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of
    pixels/vertices in the sequencer.
Rev 0.5 (Laurent Lefebvre), Date: September 7, 2001
    Added timing diagrams (Vic).
Rev 0.6 (Laurent Lefebvre), Date: September 24, 2001
    Changed the spec to reflect the new R400 architecture. Added interfaces.
Rev 0.7 (Laurent Lefebvre), Date: October 5, 2001
    Added constant store management, instruction store management, control flow management and data dependant
    predication.
Rev 0.8 (Laurent Lefebvre), Date: October 8, 2001
    Changed the control flow method to be more flexible. Also updated the external interfaces.
Rev 0.9 (Laurent Lefebvre), Date: October 17, 2001
    Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the
    conditional_execute_or_jump. Added debug registers.
Rev 1.0 (Laurent Lefebvre), Date: October 19, 2001
    Refined interfaces to RB. Added state registers.
Rev 1.1 (Laurent Lefebvre), Date: October 26, 2001
    Added SEQ->SP0 interfaces. Changed delta precision. Changed VGT->SP0 interface. Debug Methods added.
Rev 1.2 (Laurent Lefebvre), Date: November 16, 2001
    Interfaces greatly refined. Cleaned up the spec.
Rev 1.3 (Laurent Lefebvre), Date: November 26, 2001
    Added the different interpolation modes.
Rev 1.4 (Laurent Lefebvre), Date: December 6, 2001
    Added the auto incrementing counters. Changed the VGT->SQ interface. Added content on constant management.
    Updated GPRs.
Rev 1.5 (Laurent Lefebvre), Date: December 11, 2001
    Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant
    management. Added PA->SQ synchronization fields and explanation.
Rev 1.6 (Laurent Lefebvre), Date: January 7, 2002
    Added more details on the staging register. Added detail about the parameter caches. Changed the call
    instruction to a Conditionnal_call instruction. Added details on constant management and updated the diagram.
Rev 1.7 (Laurent Lefebvre), Date: February 4, 2002
    Added Real Time parameter control in the SX interface. Updated the control flow section.
Rev 1.8 (Laurent Lefebvre), Date: March 4, 2002
    New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions.
Rev 1.9 (Laurent Lefebvre), Date: March 18, 2002
    Rearrangement of the CF instruction bits in order to ensure byte alignment.
Rev 1.10 (Laurent Lefebvre), Date: March 25, 2002
    Updated the interfaces and added a section on exporting rules.
Rev 1.11 (Laurent Lefebvre), Date: April 19, 2002
    Added CP state report interface. Last version of the spec with the old control flow scheme.
Rev 2.0 (Laurent Lefebvre), Date: April 19, 2002
    New control flow scheme.
Rev 2.01 (Laurent Lefebvre), Date: May 2, 2002
    Changed slightly the control flow instructions to allow force jumps and calls.
`
1. Overview
The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block
before looking for a new clause of the same type. Two ALU threads are executed interleaved to hide the ALU latency.
The arbitrator will give priority to older threads. There are two separate reservation stations, one for pixel vectors and
one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.

To support the shader pipe the sequencer also contains the shader instruction cache, constant store, control flow
constants and texture state. The four shader pipes all execute the same instruction, so there is only one
sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors
of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next
vector until the needed space is available in the GPRs.
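
The GPR gating rule above can be sketched in a few lines of Python. This is only an illustrative model: the class name, method names and the pool size of 128 are assumptions made for the example and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass
class GprPool:
    """Hypothetical model of the GPR space shared by in-flight vectors."""
    total: int        # assumed pool size, chosen for the example only
    in_use: int = 0

    def try_start(self, gprs_needed: int) -> bool:
        """Start a vector only if the GPRs its program requests are free."""
        if self.in_use + gprs_needed > self.total:
            return False  # the sequencer holds the vector back
        self.in_use += gprs_needed
        return True

    def finish(self, gprs_needed: int) -> None:
        """Release the GPRs of a vector that has completed."""
        self.in_use -= gprs_needed


pool = GprPool(total=128)  # 128 is an assumption, not a value from the spec
started = [pool.try_start(48), pool.try_start(48), pool.try_start(48)]
pool.finish(48)            # one vector completes and frees its registers
retry = pool.try_start(48)  # the stalled vector can now start
```

With these assumed numbers the third vector is refused until one of the first two completes.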
`
Figure 1: General Sequencer Overview

`1.1 Top Level Block Diagram
`
Diagram blocks: Input Arbiter, PIX RS, VTX RS, Exec Arbiter, ALU, Texture.

Figure 2: Reservation stations and arbiters
`Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type
`(pixel, vertex) that we call the reservation station (RS).
`
`1.2 Data Flow graph (SP)
`
Diagram blocks: Register File (four instances), pipeline stages, Scalar Unit, scalar input/output pipeline;
signals: instruction, texture request, texture address; output to Primitive Assembly Unit or Render Backend.

Figure 3: The shader Pipe

`The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).
`
`1.3 Control Graph
`
Diagram blocks: SEQ, CST, FETCH, SP0; signals: WrAddr, RdAddr, CMD, Clause # + Rdy, Phase, WrVec, WrScal.

`Figure 4: Sequencer Control interfaces
`
In green is represented the Fetch control interface, in red the ALU control interface, in blue the Interpolated/Vector
control interface and in purple is the output file control interface.
`
`2. Interpolated data bus
The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.
`
Diagram blocks: XY buffer (ping-pong buffer, 24 bits * 16 quads * 2 = 768 bits), IJ buffer (16 quads, double-buffered,
4096 bits, 32 x 128), INTERPOLATORS, FIX->FLOAT + EXPANSION, path to RB.

Figure 5: Interpolation buffers
`
Figure 6: Interpolation timing diagram

Above is an example of a tile the sequencer might receive from the SC. The write side is how the data get stacked
into the XY and IJ buffers, the read side is how the data is passed to the GPRs. The IJ information is packed in the IJ
buffer 4 quads at a time or two clocks. The sequencer allows at any given time as many as four quads to interpolate a
parameter. They all have to come from the same primitive. Then the sequencer controls the write mask to the GPRs
to write the valid data in.
`
3. Instruction Store
There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a 1 port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1
clock to load 2 control flow instructions and 1 clock to write instructions.
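
As a rough illustration of the figures above, the following Python sketch works out the instruction store capacity and models the single port as a repeating four-clock rotation. The slot names and the order of the rotation are assumptions made for the example; the specification only lists the four uses of the port.

```python
INSTRUCTIONS = 4096
BITS_PER_INSTRUCTION = 96

total_bits = INSTRUCTIONS * BITS_PER_INSTRUCTION  # capacity in bits
total_kib = total_bits / 8 / 1024                 # same capacity in KiB

# Assumed ordering of the four port uses; only the set of uses is from the text.
PORT_SLOTS = ("alu_read", "fetch_read", "control_flow_read", "write")


def port_user(clock: int) -> str:
    """Return which client owns the single memory port on a given clock."""
    return PORT_SLOTS[clock % len(PORT_SLOTS)]


schedule = [port_user(c) for c in range(5)]
```

Under this reading each client gets the port once every four clocks, and the control flow slot returns two instructions per access.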
`
The instruction store is loaded by the CP through the register mapped registers.

The VS_BASE and PS_BASE context registers are used to specify for each context where its shader is in the
instruction memory.

For the Real time commands the story is much the same, with some small differences. There are no wrap-around
points for real time so the driver must be careful not to overwrite regular shader data. The shared code (shared
subroutines) uses the same path as real time.
`
4. Sequencer Instructions
`All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs
`during this time (MOV PV,PV, PS,PS) if they have nothing else to do.
`
`5. Constant Stores
`
`5.1 Memory organizations
A likely size for the ALU constant store is 1024x128 bits. The read bandwidth from the ALU constant store is 128 bits/clock
and the write bandwidth is 32 bits/clock (dictated by the CP bus size, not by memory ports).

The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader
pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4
constants or 512 bits. It takes 16 clocks to write the four constants. Real time requires 256 lines in the physical
memory (this is physically register mapped).

The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for
regular mode, 32 states for RT). The memory thus holds 128 texture states (192 bits per state). The logical size
exposes 32 different states total, which are going to be shared between the pixel and the vertex shader. The size of
the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real
memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores
the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory
(this is physically register mapped).
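
The write latencies and the texture memory size quoted above follow directly from the 32 bits/clock CP write bandwidth. A small Python check, with variable and function names chosen for the example only:

```python
CP_BUS_BITS = 32          # write bandwidth per clock, from the text
ALU_CONSTANT_BITS = 128   # one ALU constant
TEXTURE_STATE_BITS = 192  # one texture state


def write_clocks(payload_bits: int, bus_bits: int = CP_BUS_BITS) -> int:
    """Clocks needed to push a payload through the CP bus (rounded up)."""
    return -(-payload_bits // bus_bits)


alu_write_clocks = write_clocks(4 * ALU_CONSTANT_BITS)   # 4 constants = 512 bits
texture_write_clocks = write_clocks(TEXTURE_STATE_BITS)  # 512 sent, top 320 ignored

# 128 regular states plus 32 RT states, 192 bits each, fill a 320x96 memory.
texture_memory_bits = (128 + 32) * TEXTURE_STATE_BITS
```

The last line shows why the texture memory is 320 lines of 96 bits: each 192-bit state occupies two lines.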
`
The control flow constant memory doesn’t sit behind a renaming table. It is register mapped and thus the driver must
reload its content each time there is a change in the control flow constants. Its size is 320*32 because it must hold 8
copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode
and physically register mapped for RT operation.
`
5.2 Management of the Control Flow Constants
The control flow constants are register mapped, thus the CP writes to the corresponding register to set the constant; the
SQ decodes the address and writes to the block pointed to by its current base pointer (CF_WR_BASE). On the read
side, one level of indirection is used. A register (SQ_CONTEXT_MISC.CF_RD_BASE) keeps the current base pointer
to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the
state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the
CP doesn't write to CF, the state is going to use the previous CF constants.
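
One way to read the base pointer scheme above is sketched below in Python. Only CF_WR_BASE and CF_RD_BASE are names from the specification; the class, the per-context list of read pointers, and the `written_since_change` flag are assumptions introduced to make the example self-contained. The sketch also assumes 8 blocks of 32 dwords.

```python
NUM_STATES = 8   # assumed number of control flow constant blocks
DWORDS = 32      # dwords of control flow constants per block


class CfConstants:
    """Hypothetical model of the CF constant blocks and their base pointers."""

    def __init__(self) -> None:
        self.blocks = [[0] * DWORDS for _ in range(NUM_STATES)]
        self.wr_base = 0                  # plays the role of CF_WR_BASE
        self.rd_base = [0] * NUM_STATES   # one CF_RD_BASE copy per context
        self.ctx = 0
        self.written_since_change = True  # block 0 belongs to context 0

    def state_change(self) -> None:
        """The new context inherits the read pointer of the previous one."""
        new_ctx = (self.ctx + 1) % NUM_STATES
        self.rd_base[new_ctx] = self.rd_base[self.ctx]
        self.ctx = new_ctx
        self.written_since_change = False

    def cp_write(self, index: int, value: int) -> None:
        """First CP write after a state change moves on to the next block."""
        if not self.written_since_change:
            self.wr_base = (self.wr_base + 1) % NUM_STATES
            self.rd_base[self.ctx] = self.wr_base
            self.written_since_change = True
        self.blocks[self.wr_base][index] = value

    def read(self, ctx: int, index: int) -> int:
        return self.blocks[self.rd_base[ctx]][index]


cf = CfConstants()
cf.cp_write(0, 11)
cf.state_change()
shared = cf.read(1, 0)   # no CF write yet: context 1 sees context 0's value
cf.cp_write(0, 22)       # first write after the change advances the base
```

Because the driver reloads the whole block when the control flow constants change, the sketch does not copy old values into the new block.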
`
5.3 Management of the re-mapping tables
`
`5.3.1 R400 Constant management
The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture
state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a
new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental; the driver only needs to update the constants that actually changed
between the two state changes.

For this model to work in its simplest form, the requirement is that the physical memory MUST be at least twice as
large as the logical address space + the space allocated for Real Time. In our case, since the logical address space
is 512 and the reserved RT space can be up to 256 entries, the memory must be of size 1280 or above. Similarly,
the size of the texture store must be 32*2+32 = 96 entries or above.
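
The sizing rule above (twice the logical space plus the Real Time reservation) can be checked with a short Python calculation. The function name is an assumption for the example.

```python
def min_physical_entries(logical_entries: int, rt_entries: int) -> int:
    """Smallest physical memory for the simple incremental-update model."""
    return 2 * logical_entries + rt_entries


alu_constant_entries = min_physical_entries(512, 256)  # constant store
texture_state_entries = min_physical_entries(32, 32)   # texture store
```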
`
`5.3.2 Proposal for R400LE constant management
To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the
sequencer would check for SQ_IDLE and PA_IDLE and, if both are idle, will erase the content of the state to replace it with
the new state (this is depicted in Figure 8: De-allocation mechanism). Note that in the case a state is cleared, a value
of 0 is written to the corresponding de-allocation counter location so that when the SQ is going to report a state change,
nothing will be de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to
reuse these physical addresses if needed).
`
Diagram blocks: Global Register Data Bus, Staging Data Buffer, Staging Write Addr, Physical Memory, Free List,
Renaming Table for 1 Context (Context Dirty, Current/Last Logical Address, Physical Address), Renaming Table for
Context 0 => N (8 rows of 16 8-bit physical addresses, 128 entries copied in eight clocks), Seq Constant Request
(Context & Logical Address).

Diagram notes: The free list supplies the next physical address ready for allocation and receives the physical address
to schedule for de-allocation. Copy Last held above to Current Context on receipt of Set Constant for a new context
(hide loading behind Set State load, 16 clocks); all other Set States just write one entry to the current state.

`
`Figure 7: Constant management
`
`ADDR
`
`
`SQ_STATE#
`
`
`
`
`
`
`
`
`
`
`|
`|
`
`
`
`_
`DEALOC
`i—WRITE_ENABLE
`|
`
`Free List CNT VALUE|COUNTERS - 5
`|
`|
`| [| |
`PREVIOUS
`
`i NOT lal
`STATE
`|
`|
`
`|
`NEW
`|
`STATE
`|
`|
`VALUE | |
`|
`|
`|
`——— I=
`| he ~<
`
`|
`
`/|
`oR
`
`|;
`
`:
`
`VALID
`
`|
`
`Cee
`
`
`L
`:
`|
`SQ IDLE
`—— AND }
`PA_IDLE
`CP_NEW_STATE_CNTL—
`SET CTX BITS
`
`
`
`
`
`
`Figure 8: De-allocation mechanism for R400LE.
`
`5.3.3 Dirty bits
`Two sets of dirty bits will be maintained per logical address. The first, the reset dirty bit, is cleared on reset and
`set when the logical address is addressed. The second, the context dirty bit, is cleared whenever a new context is
`written and set for each address written while in that context. If the reset dirty bit is not set, writing to that
`logical address does not require de-allocation of whatever address is stored in the renaming table.
`If the reset dirty bit is set and the context dirty bit is not set, the stored physical address needs to be
`de-allocated and a new physical address is needed to store the incoming data.
`If both are set, the data is written into the physical address held in the renaming table for the current
`logical address. No de-allocation or allocation takes place. This happens when the driver does a set constant
`twice to the same logical address between context changes. NOTE:
`It is important to detect this case and avoid allocating again. Otherwise, repeated writes could allocate all
`physical memory and cause a hang, because no context would fit, so rendering could not start and free up space.
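The three dirty-bit cases above can be summarized in a small software model. This is only an illustrative sketch, not the hardware design: the class name `DirtyBits`, the method names, and the action strings are hypothetical. It also assumes that "addressed" means "written", and that a first write after reset needs a fresh allocation (the text only says that no de-allocation is required in that case).

```python
class DirtyBits:
    """Toy model of the two dirty bits kept per logical address (names are hypothetical)."""

    def __init__(self, num_logical_addresses):
        # Reset dirty bits: cleared on reset, set once the address has been written.
        self.reset_dirty = [False] * num_logical_addresses
        # Context dirty bits: cleared on every new context, set on a write in that context.
        self.context_dirty = [False] * num_logical_addresses

    def new_context(self):
        """A new context is written: clear only the context dirty bits."""
        self.context_dirty = [False] * len(self.context_dirty)

    def write(self, logical_addr):
        """Return the action implied by the dirty bits, then mark the address dirty."""
        if not self.reset_dirty[logical_addr]:
            # Never written since reset: nothing to de-allocate.
            # Assumption: a new physical address is still allocated here.
            action = "allocate"
        elif not self.context_dirty[logical_addr]:
            # Written in an earlier context: free the old block, take a new one.
            action = "deallocate_and_allocate"
        else:
            # Second write in the same context: reuse the current physical address.
            action = "overwrite_in_place"
        self.reset_dirty[logical_addr] = True
        self.context_dirty[logical_addr] = True
        return action
```

In this model, two set-constant writes to the same logical address inside one context yield `allocate` followed by `overwrite_in_place`, which is the case the NOTE above says must be detected so that repeated writes do not consume all physical memory.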
`
`5.3.4 Free List Block
`A free list block would consist of a counter (called the IFC, or Initial Free Counter) that resets to zero and is
`incremented every time a chunk of physical memory is used, until every chunk has been used once. This counter would
`be checked each time a physical block is needed: if the original chunks have not all been used, use a new one;
`otherwise check the free list for an available physical block address. When a chunk is taken from the counter, the
`count value is its physical address.
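The counter-first allocation policy described above can be sketched as follows. This is a simplified model under stated assumptions: the class `FreeListAllocator` and its methods are hypothetical names, and the free list is modeled with a plain `deque` instead of the pointer-managed ring of the real block, so it does not capture when freed addresses actually become available for reuse.

```python
from collections import deque


class FreeListAllocator:
    """Toy model: use the Initial Free Counter (IFC) first, then the free list."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.ifc = 0          # Initial Free Counter, reset to zero
        self.free = deque()   # stand-in for the free-list storage (assumption)

    def allocate(self):
        """Return a physical block address, or None if nothing is available."""
        if self.ifc < self.num_blocks:
            # Original blocks not all used yet: the count itself is the address.
            addr = self.ifc
            self.ifc += 1
            return addr
        if self.free:
            # All original blocks used once: fall back to the free list.
            return self.free.popleft()
        return None

    def release(self, addr):
        """Record a de-allocated physical block address on the free list."""
        self.free.append(addr)
```

With two blocks, the first two calls to `allocate()` return 0 and 1 from the counter, a third returns `None`, and after `release(0)` the next `allocate()` returns 0 from the free list.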
`The block also needs storage for a free list big enough to hold all physical block addresses.
`Three pointers are maintained for the free list, all reset to zero. The first is called write_ptr. It identifies
`the next location at which to write the physical address of a block to be de-allocated. Note: we can never free more
`physical memory locations than we have. Once the address is recorded, the pointer is incremented, walking the free list
`like a ring.
`The second pointer is called stop_ptr. The stop_ptr pointer will be advanced by the number of address chunks
`de-allocated when a context fini
