`R400 Sequencer Specification
`
`SQ
`
`Version 4442.0
`
` AUTOMATICALLY UPDATED FIELDS:
`
Overview: This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the required capabilities and expected uses of the block. It also describes the block interfaces and internal sub-blocks, and provides internal state diagrams.
`
`
`ATT 2028
`
`LGv. ATI
`IPR2015-00325
`
`AMD1044_0017308
`
`ATI Ex. 2011
`IPR2023-00922
`Page 1 of 58
`
Document Location: perforcer400\doc_libwesigniblocksiegiR400Sequencer.dec
Current Intranet Search Title: R400 Sequencer Specification
APPROVALS
Name/Dept    Signature/Date
`
`
`
` Remarks:
`
THIS DOCUMENT CONTAINS INFORMATION THAT COULD BE SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.
`
`
`
"Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any form or by any means without the prior written permission of ATI Technologies Inc."
`
Exhibit 2028.doc R400_Sequencer.doc 79201 Bytes *** © ATI Reference Copyright Notice on Cover Page © ***
`
Author: Laurent Lefebvre
ORIGINATE DATE: 24 September, 2001
EDIT DATE: 4 September, 2015
DOCUMENT-REV. NUM.: GEN-CXXXXX-REVA
PAGE: 1 of 58
Issue To:
Copy No:
`
`
`

`

`
`
Table Of Contents

1. OVERVIEW
1.1 Top Level Block Diagram
1.2 Data Flow graph (SP)
1.3 Control Graph
2. INTERPOLATED DATA BUS
3. INSTRUCTION STORE
4. SEQUENCER INSTRUCTIONS
5. CONSTANT STORES
5.1 Memory organizations
5.2 Management of the Control Flow Constants
5.3 Management of the re-mapping tables
5.3.1 R400 Constant management
5.3.2 Proposal for R400LE constant management
5.3.3 Dirty bits
5.3.4 Free List Block
5.3.5 De-allocate Block
5.3.6 Operation of Incremental model
5.4 Constant Store Indexing
5.5 Real Time Commands
5.6 Constant Waterfalling
6. LOOPING AND BRANCHES
6.1 The controlling state
6.2 The Control Flow Program
6.3 Data dependant predicate instructions
6.4 HW Detection of PV,PS
6.5 Register file indexing
6.6 Predicated Instruction support for Texture clauses
6.7 Debugging the Shaders
6.7.1 Method 1: Debugging registers
6.7.2 Method 2: Exporting the values in the GPRs (12)
7. PIXEL KILL MASK
8. MULTIPASS VERTEX SHADERS (HOS)
9. REGISTER FILE ALLOCATION
10. FETCH ARBITRATION
11. ALU ARBITRATION
12. HANDLING STALLS
13. CONTENT OF THE RESERVATION STATION FIFOS
14. THE OUTPUT FILE
15. IJ FORMAT
15.1 Interpolation of constant attributes
16. STAGING REGISTERS
17. THE PARAMETER CACHE
18. VERTEX POSITION EXPORTING
`
19. EXPORTING ARBITRATION
20. EXPORTING RULES
20.1 Parameter caches Exports
20.2 Memory exports
20.3 Position exports
21. EXPORT TYPES
21.1 Vertex Shading
21.2 Pixel Shading
22. SPECIAL INTERPOLATION MODES
22.1 Real time commands
22.2 Sprites/ XY screen coordinates/ FB information
22.3 Auto generated counters
22.3.1 Vertex shaders
22.3.2 Pixel shaders
23. STATE MANAGEMENT
23.1 Parameter cache synchronization
24. XY ADDRESS IMPORTS
24.1 Vertex indexes imports
25. REGISTERS
25.1 Control
25.2 Context
26. DEBUG REGISTERS
26.1 Context
26.2 Control
27. INTERFACES
27.1 External Interfaces
27.2 SC to SP Interfaces
27.2.1 SC_SP#
27.2.2 SC_SQ
27.2.3 SQ to SX: Interpolator bus
27.2.4 SQ to SP: Staging Register Data
27.2.5 VGT to SQ: Vertex interface
27.2.6 SQ to SX: Control bus
27.2.7 SX to SQ: Output file control
27.2.8 SQ to TP: Control bus
27.2.9 TP to SQ: Texture stall
27.2.10 SQ to SP: Texture stall
27.2.11 SQ to SP: GPR and auto counter
27.2.12 SQ to SPx: Instructions
27.2.13 SP to SQ: Constant address load/ Predicate Set
27.2.14 SQ to SPx: constant broadcast
27.2.15 SP0 to SQ: Kill vector load
27.2.16 SQ to CP: RBBM bus
`
`
`
`
`
27.2.17 CP to SQ: RBBM bus
27.2.18 SQ to CP: State report
28. OPEN ISSUES
`
`
`
`
`
Revision Changes:

Rev 0.1 (Laurent Lefebvre), Date: May 7, 2001
First draft.

Rev 0.2 (Laurent Lefebvre), Date: July 9, 2001
Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section.

Rev 0.3 (Laurent Lefebvre), Date: August 6, 2001
Reviewed the Sequencer spec after the meeting on August 3, 2001.

Rev 0.4 (Laurent Lefebvre), Date: August 24, 2001
Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer.

Rev 0.5 (Laurent Lefebvre), Date: September 7, 2001
Added timing diagrams (Vic).

Rev 0.6 (Laurent Lefebvre), Date: September 24, 2001
Changed the spec to reflect the new R400 architecture. Added interfaces.

Rev 0.7 (Laurent Lefebvre), Date: October 5, 2001
Added constant store management, instruction store management, control flow management and data dependant predication.

Rev 0.8 (Laurent Lefebvre), Date: October 8, 2001
Changed the control flow method to be more flexible. Also updated the external interfaces.

Rev 0.9 (Laurent Lefebvre), Date: October 17, 2001
Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional_execute_or_jump. Added debug registers.

Rev 1.0 (Laurent Lefebvre), Date: October 19, 2001
Refined interfaces to RB. Added state registers.

Rev 1.1 (Laurent Lefebvre), Date: October 26, 2001
Added SEQ→SP0 interfaces. Changed delta precision. Changed VGT→SP0 interface. Debug Methods added.

Rev 1.2 (Laurent Lefebvre), Date: November 16, 2001
Interfaces greatly refined. Cleaned up the spec.

Rev 1.3 (Laurent Lefebvre), Date: November 26, 2001
Added the different interpolation modes.

Rev 1.4 (Laurent Lefebvre), Date: December 6, 2001
Added the auto incrementing counters. Changed the VGT→SQ interface. Added content on constant management. Updated GPRs.

Rev 1.5 (Laurent Lefebvre), Date: December 11, 2001
Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant management. Added PA→SQ synchronization fields and explanation.

Rev 1.6 (Laurent Lefebvre), Date: January 7, 2002
Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a Conditionnal_call instruction. Added details on constant management and updated the diagram.

Rev 1.7 (Laurent Lefebvre), Date: February 4, 2002
Added Real Time parameter control in the SX interface. Updated the control flow section.

Rev 1.8 (Laurent Lefebvre), Date: March 4, 2002
New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions.

Rev 1.9 (Laurent Lefebvre), Date: March 18, 2002
Rearrangement of the CF instruction bits in order to ensure byte alignment.

Rev 1.10 (Laurent Lefebvre), Date: March 25, 2002
Updated the interfaces and added a section on exporting rules.

Rev 1.11 (Laurent Lefebvre), Date: April 19, 2002
Added CP state report interface. Last version of the spec with the old control flow scheme.

Rev 2.0 (Laurent Lefebvre), Date: April 19, 2002
New control flow scheme.
`
`1. Overview
The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block before looking for a new clause of the same type. Two ALU threads are executed interleaved to hide the ALU latency. The arbitrator will give priority to older threads. There are two separate reservation stations, one for pixel vectors and one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.
`
To support the shader pipe the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. The four shader pipes also execute the same instruction, thus there is only one sequencer for the whole chip.
`
`The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors
`of 16 quads (64 pixels) that are generated in the scan converter.
`
The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.
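
The GPR gating rule above can be sketched as a simple admission check. This is only an illustration under stated assumptions: the class name `GprPool`, its methods, and the pool size of 128 registers are hypothetical and are not taken from this specification.

```python
class GprPool:
    """Hypothetical model of the GPR space shared by in-flight vectors."""

    def __init__(self, total_gprs):
        self.total = total_gprs
        self.free = total_gprs

    def try_start_vector(self, gprs_needed):
        """Start a vector only if its declared GPR count fits; otherwise stall."""
        if gprs_needed > self.free:
            return False  # the sequencer holds the vector back
        self.free -= gprs_needed
        return True

    def retire_vector(self, gprs_used):
        """Return a finished vector's GPRs to the pool."""
        self.free = min(self.total, self.free + gprs_used)


pool = GprPool(total_gprs=128)            # pool size is an assumption
started_a = pool.try_start_vector(96)     # fits, starts immediately
started_b = pool.try_start_vector(64)     # does not fit yet, must wait
pool.retire_vector(96)                    # first vector completes
started_b_retry = pool.try_start_vector(64)
```

In this sketch the second vector is admitted only after the first one retires and frees its registers, which is the behaviour the paragraph describes.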
`
Figure 1: General Sequencer overview
`
1.1 Top Level Block Diagram
`
`
`
`
`
[Block diagram; recoverable labels: Input Arbiter, VTX RS, PIX RS, Texture]
`
`
`
[Diagram; recoverable labels: vertex/pixel vector arbitrator; possible delay for available GPRs; texture clause 0 to 7 reservation stations; ALU clause 0 to 7 reservation stations; FIFOs; texture arbitrator]
`
`Figure 2: Reservation stations and arbiters
There are two sets of the above figure, one for vertices and one for pixels.

... screen position if needed) are sent to the interpolator, which will use them to interpolate the parameters and place the ...
`
... buffer. If the output buffer is full or doesn't have enough space, the sequencer will prevent such a vertex group from entering an exporting clause.

... decode bus at one time. Similarly, only one fetch state machine may have access to the register file address bus at one time. Arbitration is performed by three arbiter blocks (two for the ALU state machines and one for the fetch state machine).

Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type (pixel, vertex) that we call the reservation station (RS).
`
`
`1.2 Data Flow graph (SP)
`
`
`
[Diagram; recoverable labels: Register File; instruction; texture address; pipeline stage; texture request; to Primitive Assembly Unit or Render Backend]
`
`
`Figure 3: The shader Pipe
`
`The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).
`
`1.3 Control Graph
`
[Diagram; recoverable labels: SEQ; SP; FETCH; OF; CST; Clause # + Rdy; WrAddr; CMD; Phase]
`
`Figure 4: Sequencer Control interfaces
`
In green is represented the Fetch control interface, in red the ALU control interface, in blue the Interpolated/Vector control interface and in purple the output file control interface.
`
`2. Interpolated data bus
`The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.
`
[Diagram; recoverable labels: CROSSBAR (4x64 bits); INTERPOLATORS; FIX-FLOAT + EXPANSION]
IJs buffer (ping-pong buffer): (28 bits * 2 (IJ) + 8 bits * 6 (delta IJs) + 4 bits * 6) * 16 (quads) * 2 (double-buffered) = 4096 bits, 32 x 128
XYs buffer (ping-pong buffer): 24 bits * 16 quads * 2 = 768 bits, 32 x 24
`
`Figure 5: Interpolation buffers
`
Figure 6: Interpolation timing diagram
Above is an example of a tile the sequencer might receive from the SC. The write side is how the data get stacked into the XY and IJ buffers, the read side is how the data is passed to the GPRs. The IJ information is packed in the IJ buffer 4 quads at a time or two clocks. The sequencer allows at any given time as many as four quads to interpolate a parameter. They all have to come from the same primitive. Then the sequencer controls the write mask to the GPRs to write the valid data in.
`
3. Instruction Store
There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a 1 port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.
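
As a worked check of the figures above, the sketch below computes the instruction store capacity and models a four-slot rotation on a single-port memory. The slot order and all names are assumptions made for illustration; the text only gives one clock per access type.

```python
INSTRUCTION_COUNT = 4096   # from the text
INSTRUCTION_BITS = 96      # from the text

capacity_bits = INSTRUCTION_COUNT * INSTRUCTION_BITS
capacity_bytes = capacity_bits // 8

# One clock per access type on a single-port memory. The ordering of the
# slots is an assumption; the text only lists the four uses.
SLOTS = ("alu_read", "fetch_read", "control_flow_read_x2", "write")


def slot_for_clock(clock):
    """Return which access type owns the memory port on a given clock."""
    return SLOTS[clock % len(SLOTS)]


schedule = [slot_for_clock(c) for c in range(8)]
```

Under these assumptions the store holds 393216 bits (48 KB) and each access type gets the port once every four clocks.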
`
`The instruction store is loaded by the CP thru the register mapped registers.
`
`The VS_BASE and PS_BASE context registers are used to specify for each context where its shader is in the
`instruction memory.
`
For the Real time commands the story is quite the same but for some small differences. There are no wrap-around points for real time so the driver must be careful not to overwrite regular shader data. The shared code (shared subroutines) uses the same path as real time.
`
`4. Sequencer Instructions
All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.
`
`5. Constant Stores
`
`5.1 Memory organizations
`A likely size for the ALU constant store is 1024x128 bits. The read BW from the ALU constant store is 128 bits/clock
`and the write bandwidth is 32 bits/clock (directed by the CP bus size not by memory ports).
`
The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants or 512 bits. It takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).
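
The ALU constant store numbers above are mutually consistent, as the short calculation below shows. The variable names are illustrative only; every value is taken from the text.

```python
CONSTANT_BITS = 128           # one ALU constant (store is 1024 x 128 bits)
WRITE_BITS_PER_CLOCK = 32     # write bandwidth, set by the CP bus size
CONSTANTS_PER_LINE = 4        # write granularity and re-mapping line size
REMAP_LINES = 128             # re-mapping table size
MAX_LOGICAL_PER_SHADER = 256  # logical constants per shader

line_bits = CONSTANTS_PER_LINE * CONSTANT_BITS        # bits per write
write_clocks = line_bits // WRITE_BITS_PER_CLOCK      # clocks per write
logical_constants = REMAP_LINES * CONSTANTS_PER_LINE  # addressable constants
```

Here `line_bits` comes out to 512, `write_clocks` to 16, and `logical_constants` to 512, which matches the pixel/vertex shader pair.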
`
The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for regular mode, 32 states for RT). The memory thus holds 128 texture states (192 bits per state). The logical size exposes 32 different states total, which are going to be shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
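
The texture state figures can be cross-checked the same way. The sketch assumes that the 32 bits/clock CP write bandwidth quoted for the ALU constant store also applies here, which is consistent with the 6-clock figure; the names are illustrative.

```python
MEMORY_LINES = 320          # from the text (320 x 96 bits)
MEMORY_LINE_BITS = 96
STATE_BITS = 192            # one texture state
REGULAR_STATES = 128
RT_STATES = 32
DRIVER_WRITE_BITS = 512     # what the driver sends per state
WRITE_BITS_PER_CLOCK = 32   # assumed equal to the ALU constant write bandwidth

lines_per_state = STATE_BITS // MEMORY_LINE_BITS
lines_needed = (REGULAR_STATES + RT_STATES) * lines_per_state
ignored_bits = DRIVER_WRITE_BITS - STATE_BITS
write_clocks = STATE_BITS // WRITE_BITS_PER_CLOCK
```

Each state occupies two 96-bit lines, so the 128 regular and 32 RT states together fill exactly the 320 lines, and the CP drops the top 320 of the 512 bits sent by the driver.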
`
The control flow constant memory doesn't sit behind a renaming table. It is register mapped and thus the driver must reload its content each time there is a change in the control flow constants. Its size is 320*32 because it must hold 8 copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.
`
`
`
`The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode
`and physically register mapped for RT operation.
`
`5.2 Management of the Control Flow Constants
The control flow constants are register mapped, thus the CP writes to the corresponding register to set the constant, the SQ decodes the address and writes to the block pointed to by its current base pointer (CF_WR_BASE). On the read side, one level of indirection is used. A register (SQ_CONTEXT_MISC.CF_RD_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the state is going to use the previous CF constants.
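
The base pointer scheme above can be sketched as follows. This is a simplified model, not the hardware design: the class `CfConstantStore`, the per-state list of read bases, and the choice of 8 states (taken from the 8 copies mentioned in section 5.1) are assumptions made for illustration.

```python
NUM_STATES = 8  # assumed, matching the 8 copies of CF constants in section 5.1


class CfConstantStore:
    """Hypothetical model of CF_WR_BASE / CF_RD_BASE handling."""

    def __init__(self, num_states=NUM_STATES):
        self.num_states = num_states
        self.blocks = [{} for _ in range(num_states)]
        self.wr_base = 0                  # stands in for CF_WR_BASE
        self.rd_base = [0]                # one CF_RD_BASE copy per state
        self.written_since_change = True

    def state_change(self):
        # The read base register is copied on every state change.
        self.rd_base.append(self.rd_base[-1])
        self.written_since_change = False

    def cp_write(self, addr, value):
        # The first CF write after a state change advances the base pointer.
        if not self.written_since_change:
            self.wr_base = (self.wr_base + 1) % self.num_states
            self.rd_base[-1] = self.wr_base
            self.written_since_change = True
        self.blocks[self.wr_base][addr] = value

    def read(self, state_index, addr):
        return self.blocks[self.rd_base[state_index]].get(addr)


store = CfConstantStore()
store.cp_write(0, 10)      # state 0 writes a CF constant
store.state_change()       # state 1 never writes CF
store.state_change()       # state 2 writes CF, so it gets a new block
store.cp_write(0, 20)
```

State 1 keeps reading the block used by state 0, while state 2 reads from the next block, which mirrors the behaviour described in the paragraph.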
`
5.3 Management of the re-mapping tables
`
`5.3.1 R400 Constant management
The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental: the driver only needs to update the constants that actually changed between the two state changes.
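
A minimal sketch of the broadside copy and incremental update follows. The table count (8) and line count (128, for the ALU constant re-mapping table) come from the text; the class name, method names and the rotation order between tables are assumptions.

```python
NUM_TABLES = 8      # from the text
TABLE_LINES = 128   # ALU constant re-mapping table size from section 5.1


class RemapTables:
    """Hypothetical model of the per-state re-mapping tables."""

    def __init__(self):
        self.tables = [[None] * TABLE_LINES for _ in range(NUM_TABLES)]
        self.current = 0

    def state_change(self):
        # Broadside copy: the new table starts as a copy of the current one.
        # Picking the next table in rotation is an assumption.
        nxt = (self.current + 1) % NUM_TABLES
        self.tables[nxt] = list(self.tables[self.current])
        self.current = nxt

    def update(self, logical_line, physical_line):
        # Incremental update: only lines that changed are rewritten.
        self.tables[self.current][logical_line] = physical_line

    def lookup(self, table, logical_line):
        return self.tables[table][logical_line]


remap = RemapTables()
remap.update(3, 100)    # state 0 maps logical line 3
remap.state_change()    # state 1 inherits every mapping of state 0
remap.update(5, 200)    # and changes only logical line 5
```

The older table is left untouched, so vectors still running in the previous state keep seeing their own mappings.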
`
For this model to work in its simplest form, the requirement is that the physical memory MUST be at least twice as large as the logical address space + the space allocated for Real Time. In our case, since the logical address space is 512 and the reserved RT space can be up to 256 entries, the memory must be of size 1280 and above. Similarly the size of the texture store must be of 32*2+32 = 96 entries and above.
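
The sizing rule can be written as a one-line formula. The function name is illustrative; the inputs are the values quoted in the text.

```python
def min_physical_entries(logical_entries, rt_entries):
    """Twice the logical address space plus the Real Time reservation."""
    return 2 * logical_entries + rt_entries


alu_constant_min = min_physical_entries(logical_entries=512, rt_entries=256)
texture_state_min = min_physical_entries(logical_entries=32, rt_entries=32)
```

This reproduces the 1280-entry and 96-entry lower bounds stated above.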
`
`5.3.2 Proposal for R400LE constant management
To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the sequencer would check for SQ_IDLE and PA_IDLE and if both are idle will erase the content of state to replace it with the new state (this is depicted in Figure 8: De-allocation mechanism). Note that in the case a state is cleared a value of 0 is written to the corresponding de-allocation counter location so that when the SQ is going to report a state change, nothing will be de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to reuse these physical addresses if needed).
`
[Diagram; recoverable labels: Free List; Logical Address; Current/Last Context; Renaming Table, Context 0 => N; Renaming Table for 1 Context; Renaming table N-Contexts; Physical Address; Address to Allocate; Read_ptr; Global Register Data Bus; Constants; Staging Write Addr; Staging Data Buffer; Constant Request; Context & Logical Address; physical address to schedule for de-alloc; next physical address ready for allocate]

Note in figure: Copy Last held above to Current Context on receipt of Set Constant for a new context (hide loading behind Set State load, 16 clocks); all other Set States just write one entry to current state.

Figure 7: Constant management
`
`
`
`[Figure 89: De-allocation mechanism for R400LE. Block diagram; labels recoverable from the figure: ADDR, SQ_STATE#,
`WRITE_ENABLE, DEALLOC COUNTERS, CNT VALUE, VALID, Free List, PREVIOUS STATE, NEW STATE, REMAPPING TABLE,
`SET CTX BITS, PA_IDLE, CP_NEW_STATE_CNTL, OR and AND gates.]
`
`5.3.3. Dirty bits
`Two sets of dirty bits will be maintained per logical address. The first one (the reset dirty bit) will be set to zero
`on reset and set when the logical address is addressed. The second one (the context dirty bit) will be set to zero
`whenever a new context is written and set for each address written while in this context. If the reset dirty bit is not
`set, then writing to that logical address will not require de-allocation of whatever address is stored in the renaming
`table.
`If it is set and the context dirty bit is not set, then the physical address stored needs to be de-allocated and a new
`physical address is necessary to store the incoming data.
`If they are both set, then the data will be written into the physical address held in the renaming table for the current
`logical address. No de-allocation or allocation takes place. This will happen when the driver does a set constant
`twice to the same logical address between context changes. NOTE: It is important to detect this case and prevent a
`new allocation; failure to do so will allow multiple writes to allocate all physical memory, and the block will then
`hang because a context will not fit, so rendering cannot start and free up space.
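The three cases above can be sketched as a small software model. This is only an illustration of the decision logic, not the hardware implementation: the names `DirtyBits`, `WriteAction`, and `classify_write` are hypothetical, and the model assumes that a write to an address whose reset dirty bit is clear allocates a fresh physical address (the text states only that no de-allocation is required in that case).

```python
from enum import Enum


class WriteAction(Enum):
    ALLOCATE = "allocate only"                   # nothing to de-allocate (assumed: allocate)
    DEALLOCATE_AND_ALLOCATE = "dealloc + alloc"  # old physical address is released
    OVERWRITE_IN_PLACE = "overwrite in place"    # reuse address held in the renaming table


def classify_write(reset_dirty: bool, context_dirty: bool) -> WriteAction:
    """Map the two dirty bits of a logical address to the action for a write."""
    if not reset_dirty:
        return WriteAction.ALLOCATE
    if not context_dirty:
        return WriteAction.DEALLOCATE_AND_ALLOCATE
    return WriteAction.OVERWRITE_IN_PLACE


class DirtyBits:
    """Hypothetical model of the per-logical-address dirty bits."""

    def __init__(self, num_logical: int) -> None:
        self.reset_dirty = [False] * num_logical    # cleared on reset
        self.context_dirty = [False] * num_logical  # cleared on each new context

    def new_context(self) -> None:
        # A new context is written: clear every context dirty bit.
        self.context_dirty = [False] * len(self.context_dirty)

    def write(self, logical_addr: int) -> WriteAction:
        action = classify_write(self.reset_dirty[logical_addr],
                                self.context_dirty[logical_addr])
        # Both bits are set once the address has been written.
        self.reset_dirty[logical_addr] = True
        self.context_dirty[logical_addr] = True
        return action
```

In this model, a second set constant to the same logical address within one context returns `OVERWRITE_IN_PLACE`, which is the case the NOTE above says must not consume a new physical address.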
`
`5.3.4 Free List Block
`A free list block would consist of a counter (called the IFC, or Initial Free Counter) that would reset to zero and be
`incremented every time a chunk of physical memory is used, until all chunks have been used once. This counter would
`be checked each time a physical block is needed: if the original ones have not been used up, use a new one; else
`check the free list for an available physical block address. The coun
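
The allocation policy described in this section (use never-used blocks first, then fall back to the free list) can be sketched as follows. This is a minimal model under stated assumptions, not the block's actual design: the class name `FreeListBlock`, the FIFO ordering of the free list, handing out fresh blocks in counter order, and returning `None` when nothing is available (where hardware would presumably stall) are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Optional


class FreeListBlock:
    """Hypothetical model of the free list block with its Initial Free Counter."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.ifc = 0                            # Initial Free Counter, zero on reset
        self.free_list: Deque[int] = deque()    # de-allocated block addresses (assumed FIFO)

    def allocate(self) -> Optional[int]:
        # While some blocks have never been used, hand out a new one.
        if self.ifc < self.num_blocks:
            addr = self.ifc
            self.ifc += 1
            return addr
        # Otherwise look for a previously released block.
        if self.free_list:
            return self.free_list.popleft()
        return None                             # nothing available (assumed: caller waits)

    def release(self, addr: int) -> None:
        # A de-allocated physical block becomes available again.
        self.free_list.append(addr)
```

For example, with `FreeListBlock(2)` the first two calls to `allocate()` return blocks 0 and 1 straight from the counter; a third call returns `None` until `release()` puts an address back on the free list.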
