
ORIGINATE DATE: 24 September, 2001
EDIT DATE: 4 September, 2015
DOCUMENT-REV. NUM.: GEN-CXXXXX-REVA

Author: Laurent Lefebvre

Issue To:
Copy No:

R400 Sequencer Specification

SQ

Version 1.87
Overview: This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the required capabilities and expected uses of the block. It also describes the block interfaces and internal sub-blocks, and provides internal state diagrams.
`
`
`
`
`
`AUTOMATICALLY UPDATED FIELDS:
Document Location:
`C\perforce400\doc_llb\designiblocks'sq\R400Sequencer.dac
`Current Intranet Search Title:
`R400 Sequencer Specification
`
APPROVALS

Name/Dept          Signature/Date

Remarks:
`
`
`
`
`
THIS DOCUMENT CONTAINS CONFIDENTIAL INFORMATION THAT COULD BE SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.
`
`
"Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any form or by any means without the prior written permission of ATI Technologies Inc."
`
Exhibit 2024.docR400_Sequencer.doc

71269 Bytes *** © ATI Confidential. Reference Copyright Notice on Cover Page © ***

ATI 2024
LG v. ATI
IPR2015-00325

AMD1044_0257135

ATI Ex. 2107
IPR2023-00922
Page 1 of 260
`
`

`

`
`
Table of Contents

1. OVERVIEW
   1.1 Top Level Block Diagram
   1.2 Data Flow graph (SP)
   1.3 Control Graph
2. INTERPOLATED DATA BUS
3. INSTRUCTION STORE
4. SEQUENCER INSTRUCTIONS
5. CONSTANT STORES
   5.1 Memory organizations
   5.2 Management of the re-mapping tables
       5.2.1 Dirty bits
       5.2.2 Free List Block
       5.2.3 De-allocate Block
       5.2.4 Operation of Incremental model
   5.3 Constant Store Indexing
   5.4 Real Time Commands
   5.5 Constant Waterfalling
6. LOOPING AND BRANCHES
   6.1 The controlling state
   6.2 The Control Flow Program
   6.3 Data dependant predicate instructions
   6.4 HW Detection of PV, PS
   6.5 Register file indexing
   6.6 Predicated instruction support for Texture clauses
   6.7 Debugging the Shaders
       6.7.1 Method 1: Debugging registers
       6.7.2 Method 2: Exporting the values in the GPRs (12)
7. PIXEL KILL MASK
8. MULTIPASS VERTEX SHADERS (HOS)
9. REGISTER FILE ALLOCATION
10. FETCH ARBITRATION
11. ALU ARBITRATION
12. HANDLING STALLS
13. CONTENT OF THE RESERVATION STATION FIFOS
14. THE OUTPUT FILE
15. IJ FORMAT
   15.1 Interpolation of constant attributes
16. STAGING REGISTERS
17. THE PARAMETER CACHE
18. VERTEX POSITION EXPORTING
19. EXPORTING ARBITRATION
20. EXPORT TYPES
   20.1 Vertex Shading
   20.2 Pixel Shading
21. SPECIAL INTERPOLATION MODES
   21.1 Real time commands
   21.2 Sprites/ XY screen coordinates/ FB information
   21.3 Auto generated counters
       21.3.1 Vertex shaders
       21.3.2 Pixel shaders
22. STATE MANAGEMENT
   22.1 Parameter cache Synchronization
23. XY ADDRESS IMPORTS
   23.1 Vertex indexes imports
24. REGISTERS
   24.1 Control
   24.2 Context
25. DEBUG REGISTERS
   25.1 Context
26. INTERFACES
   26.1 External Interfaces
       26.1.1 SC to SQ: IJ Control bus
       26.1.2 SQ to SP: Interpolator bus
       26.1.3 SQ to SP: Parameter Cache Read control bus
       26.1.4 SQ to SX: Parameter Cache Mux control bus
       26.1.5 SQ to SP: Staging Register Data
       26.1.6 PA to SQ: Vertex interface
       26.1.7 SQ to CP: State report
       26.1.8 SQ to SX: Control bus
       26.1.9 SX to SQ: Output file control
       26.1.10 SQ to TP: Control bus
       26.1.11 TP to SQ: Texture stall
       26.1.12 SQ to SP: Texture stall
       26.1.13 SQ to SP: GPR, Parameter cache control and auto counter
       26.1.14 SQ to SPx: Instructions
       26.1.15 SP to SQ: Constant address load
       26.1.16 SQ to SPx: constant broadcast
       26.1.17 SP0 to SQ: Kill vector load
       26.1.18 SQ to CP: RBBM bus
       26.1.19 CP to SQ: RBBM bus
27. EXAMPLES OF PROGRAM EXECUTIONS
       27.1.1 Sequencer Control of a Vector of Vertices
       27.1.2 Sequencer Control of a Vector of Pixels
       27.1.3 Notes
28. OPEN ISSUES
`
`
Revision Changes:

Rev 0.1 (Laurent Lefebvre)    Date: May 7, 2001
    First draft.

Rev 0.2 (Laurent Lefebvre)    Date: July 9, 2001
    Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section.

Rev 0.3 (Laurent Lefebvre)    Date: August 6, 2001
    Reviewed the Sequencer spec after the meeting on August 3, 2001.

Rev 0.4 (Laurent Lefebvre)    Date: August 24, 2001
    Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer.

Rev 0.5 (Laurent Lefebvre)    Date: September 7, 2001
    Added timing diagrams (Vic).

Rev 0.6 (Laurent Lefebvre)    Date: September 24, 2001
    Changed the spec to reflect the new R400 architecture. Added interfaces.

Rev 0.7 (Laurent Lefebvre)    Date: October 5, 2001
    Added constant store management, instruction store management, control flow management and data dependant predication.

Rev 0.8 (Laurent Lefebvre)    Date: October 8, 2001
    Changed the control flow method to be more flexible. Also updated the external interfaces.

Rev 0.9 (Laurent Lefebvre)    Date: October 17, 2001
    Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional_execute_or_jump. Added debug registers.

Rev 1.0 (Laurent Lefebvre)    Date: October 19, 2001
    Refined interfaces to RB. Added state registers.

Rev 1.1 (Laurent Lefebvre)    Date: October 26, 2001
    Added SEQ->SP0 interfaces. Changed delta precision. Changed VGT->SP0 interface. Debug Methods added.

Rev 1.2 (Laurent Lefebvre)    Date: November 16, 2001
    Interfaces greatly refined. Cleaned up the spec.

Rev 1.3 (Laurent Lefebvre)    Date: November 26, 2001
    Added the different interpolation modes.

Rev 1.4 (Laurent Lefebvre)    Date: December 6, 2001
    Added the auto incrementing counters. Changed the VGT->SQ interface. Added content on constant management. Updated GPRs.

Rev 1.5 (Laurent Lefebvre)    Date: December 11, 2001
    Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant management. Added PA->SQ synchronization fields and explanation.

Rev 1.6 (Laurent Lefebvre)    Date: January 7, 2002
    Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a Conditionnal_call instruction. Added details on constant management and updated the diagram.

Rev 1.7 (Laurent Lefebvre)    Date: February 4, 2002
    Added Real Time parameter control in the SX interface. Updated the control flow section.

Rev 1.8 (Laurent Lefebvre)    Date: March 4, 2002
    New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions.
`
`
1. Overview
The sequencer is based on the R300 design. It chooses two ALU clauses and a fetch clause to execute, and executes all of the instructions in a clause before looking for a new clause of the same type. Two ALU clauses are executed interleaved to hide the ALU latency. Each vector will have eight fetch and eight ALU clauses, but clauses do not need to contain instructions. A vector of pixels or vertices ping-pongs along the sequencer FIFO, bouncing from fetch reservation station to ALU reservation station. A FIFO exists between each reservation stage, holding up vectors until the vector currently occupying a reservation station has left. A vector at a reservation station can be chosen to execute. The sequencer looks at all eight ALU reservation stations to choose an ALU clause to execute and all eight fetch stations to choose a fetch clause to execute. The arbitrator will give priority to clauses/reservation stations closer to the bottom of the pipeline. It will not execute an ALU clause until the fetches initiated by the previous fetch clause have completed. There are two separate sets of reservation stations, one for pixel vectors and one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.
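
The selection rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the hardware design: the `Station` class, its field names and `pick_clause` are hypothetical, and "closer to the bottom of the pipeline" is assumed to mean the higher clause number.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

NUM_CLAUSES = 8  # eight fetch and eight ALU clauses per vector


@dataclass
class Station:
    """Hypothetical model of one reservation station."""
    occupied: bool = False         # a vector is waiting at this station
    fetches_pending: bool = False  # ALU only: previous fetch clause not yet complete


def pick_clause(stations: Sequence[Station], is_alu: bool) -> Optional[int]:
    """Return the clause index to execute, or None if no station is eligible."""
    # Scan from the bottom of the pipeline (highest clause number) upward,
    # so older work drains first.
    for idx in range(len(stations) - 1, -1, -1):
        station = stations[idx]
        if not station.occupied:
            continue
        # An ALU clause may not start until the preceding fetches have returned.
        if is_alu and station.fetches_pending:
            continue
        return idx
    return None
```

One such scan would run per set of stations (pixel and vertex), since the two sets are separate.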
`
To support the shader pipe the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. The four shader pipes all execute the same instruction, thus there is only one sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.
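
As a rough illustration of the GPR gating rule, the following Python sketch stalls a vector until its declared register requirement fits. The `GprPool` class, its capacity and the `release` step are assumptions made for the example; this passage does not describe how or when space is returned.

```python
class GprPool:
    """Toy model of GPR space shared by in-flight vectors (capacity is hypothetical)."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.in_use = 0

    def try_start(self, needed: int) -> bool:
        """Start a vector only if the GPR count its program declares is free."""
        if needed > self.total - self.in_use:
            return False  # stall: not enough space yet
        self.in_use += needed
        return True

    def release(self, count: int) -> None:
        """Assumed step: return space once a vector has left the pipeline."""
        self.in_use = max(0, self.in_use - count)
```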
`
[Figure 1: rotated full-page block diagram; the caption and labels did not survive extraction.]
`
1.1 Top Level Block Diagram

[Figure: a vertex/pixel vector arbitrator (with a possible delay for available GPRs) feeds a chain of texture clause reservation stations 0 to 7 and ALU clause reservation stations 0 to 7, with a FIFO between consecutive stations; texture and ALU arbitrators select among the stations.]
Figure 2: Reservation stations and arbiters
There are two sets of the above figure, one for vertices and one for pixels.

Depending on the arbitration state, the sequencer will either choose a vertex or a pixel packet. The control packet consists of 3 bits of state, 7 bits for the base address of the Shader program and some information on the coverage to determine fetch LOD plus other various small state bits.
`
On receipt of a packet, the input state machine (not pictured but just before the first FIFO) allocates enough space in the GPRs to store the interpolated values and temporaries. Following this, the barycentric coordinates (and XY screen position if needed) are sent to the interpolator, which will use them to interpolate the parameters and place the results into the GPRs. Then, the input state machine stacks the packet in the first FIFO.

On receipt of a command, the level 0 fetch machine issues a fetch request to the TP and corresponding GPR address for the fetch address (ta). A small command (tcmd) is passed to the fetch system identifying the current level number (0) as well as the GPR write address for the fetch return data. One fetch request is sent every 4 clocks causing the texturing of sixteen 2x2s worth of data (or 64 vertices). Once all the requests are sent the packet is put in FIFO 1.

Upon receipt of the return data, the fetch unit writes the data to the register file using the write address that was provided by the level 0 fetch machine and sends the clause number (0) to the level 0 fetch state machine to signify that the write is done and thus the data is ready. Then, the level 0 fetch machine increments the counter of FIFO 1 to signify to the ALU 0 that the data is ready to be processed.

On receipt of a command, the level 0 ALU machine first decrements the input FIFO 1 counter and then issues a complete set of level 0 shader instructions. For each instruction, the ALU state machine generates 3 source addresses, one destination address and an instruction. Once the last instruction has been issued, the packet is put into FIFO 2.
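
The counter handshake between a fetch machine and the following ALU machine can be modelled as below. This is a minimal sketch under stated assumptions: the `ClauseFifo` class and its method names are invented for illustration, and packets are assumed to complete in the order they were queued.

```python
from collections import deque


class ClauseFifo:
    """Hypothetical model of the FIFO between a fetch stage and the next ALU stage."""

    def __init__(self) -> None:
        self.packets = deque()  # packets whose fetch requests have all been sent
        self.ready = 0          # counter of packets whose return data is written

    def push(self, packet) -> None:
        # Fetch machine: all requests for this packet have been issued.
        self.packets.append(packet)

    def fetch_data_returned(self) -> None:
        # Fetch machine: the fetch unit signalled the write is done; bump the counter.
        self.ready += 1

    def alu_pop(self):
        # ALU machine: decrement the counter first, then take the packet.
        if self.ready == 0 or not self.packets:
            return None  # data not ready yet, the ALU clause cannot start
        self.ready -= 1
        return self.packets.popleft()
```

The point of the counter is that a packet can sit in the FIFO before its fetch data has arrived; the ALU side only sees it as available once the counter is non-zero.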
`
There will always be two active ALU clauses at any given time (and two arbiters). One arbiter will arbitrate over the odd instructions (4 clock cycles) and the other one will arbitrate over the even instructions (4 clock cycles). The only constraint between the two arbiters is that neither is allowed to pick the clause number the other one is currently working on if the packet is not of the same type (render state).
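
The constraint between the two ALU arbiters reduces to a small predicate. The sketch below is illustrative only; the `Pick` tuple and `may_pick` are hypothetical names, and the packet "type" is assumed to be identified by a render state number.

```python
from typing import NamedTuple, Optional


class Pick(NamedTuple):
    clause: int        # ALU clause number, 0..7
    render_state: int  # assumed identifier of the packet type


def may_pick(candidate: Pick, other_active: Optional[Pick]) -> bool:
    """True if one arbiter may select `candidate` given the other arbiter's current work."""
    if other_active is None:
        return True  # the other arbiter is idle
    if candidate.clause != other_active.clause:
        return True  # different clause numbers never conflict
    # Same clause number is only allowed for packets of the same type.
    return candidate.render_state == other_active.render_state
```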
`
If the packet is a vertex packet, upon reaching ALU clause 3, it can export the position if the position is ready. So the arbiter must prevent ALU clause 3 from being selected if the positional buffer is full (or can't be accessed). Along with the positional data, if needed the sprite size and/or edge flags can also be sent.

A special case is for multipass vertex shaders, which can export 12 parameters per last 6 clauses to the output buffer. If the output buffer is full or doesn't have enough space the sequencer will prevent such a vertex group from entering an exporting clause.
`
`Multipass pixel shaders can export 12 parameters to memory from the last clause only (7).
`
All other clauses process in the same way until the packet finally reaches the last ALU machine (7).
`
Only one pair of interleaved ALU state machines may have access to the register file address bus or the instruction decode bus at one time. Similarly, only one fetch state machine may have access to the register file address bus at one time. Arbitration is performed by three arbiter blocks (two for the ALU state machines and one for the fetch state machines). The arbiters always favor the higher-numbered state machines, preventing a bunch of half-finished jobs from clogging up the register files.
`
`1.2 Data Flow graph (SP)
`
[Figure: shader pipe data flow. Recoverable labels: Register File, texture address, instruction, scalar input/output, pipeline stage, Scalar Unit, output to Primitive Assembly Unit or Render Backend.]
Figure 3: The shader Pipe
The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).
`
`1.3 Control Graph
`
[Figure: control connections between the SEQ, FETCH and SP blocks. Recoverable labels: CMD, WrAddr, RdAddr, Phase.]
Figure 4: Sequencer Control interfaces

In green is represented the Fetch control interface, in red the ALU control interface, in blue the Interpolated/Vector control interface and in purple the output file control interface.
`
2. Interpolated data bus
`The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.
`
[Figure: interpolation buffers. Recoverable labels: FIX-FLOAT + EXPANSION, INTERPOLATORS, IJ crossbar (4x64 bits), IJ buffer (ping-pong buffer, 16 quads, double-buffered, 4096 bits, 32 x 128), XY buffer (ping-pong buffer, 24 bits * 16 quads * 2 = 768 bits), output to RB.]
Figure 3: Interpolation buffers

[Figure: example tile, showing the write side and the read side of the XY and IJ buffers; rotated page, not recoverable from the extraction.]
Above is an example of a tile the sequencer might receive from the SC. The write side is how the data gets stacked into the XY and IJ buffers; the read side is how the data is passed to the GPRs. The IJ information is packed in the IJ buffer 4 quads at a time or two clocks. The sequencer allows at any given time as many as four quads to interpolate a parameter. They all have to come from the same primitive. Then the sequencer controls the write mask to the GPRs to write the valid data in.

{ISSUE : Do we do the center + centroid approach using both IJ buffers?}
`
3. Instruction Store
There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.
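
For reference, the storage implied by those two figures can be worked out directly. The short Python calculation below uses only the 4096 x 96-bit numbers stated above; the constant and variable names are illustrative.

```python
INSTRUCTIONS = 4096          # entries in the instruction store
BITS_PER_INSTRUCTION = 96    # width of one instruction

total_bits = INSTRUCTIONS * BITS_PER_INSTRUCTION   # 393,216 bits
total_bytes = total_bits // 8                      # 49,152 bytes (48 KiB)
address_bits = (INSTRUCTIONS - 1).bit_length()     # bits needed to index every entry

print(total_bits, total_bytes, address_bits)
```

So an instruction pointer into the store needs 12 bits, and the memory holds 48 KiB of shader code in total.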
`
It is likely to be a 1 port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.
`
`The instruction store is loaded by the CP thru the register mapped registers.
`
The next picture shows the various modes in which the CP can load the memory. The Sequencer has to keep track of the loading modes in order to wrap around the correct boundaries. The wrap-around points are arbitrary and they are specified in the VS_BASE and PIX_BASE control registers. The VS_BASE and PS_BASE context registers are used to specify for each context where its shader is in the instruction memory.
`
`For the Real time commands the story is quite the same but for some small differences. There are no wrap-around
`points for real time so the driver must be careful not to overwrite regular shader data. The shared code (shared
`subroutines) uses the same path as real time.
`
[Figure: the CP's views of the instruction memory in its loading modes, showing vertex shader (VS) and pixel shader (PS) code regions and their base pointers; rotated page, diagram not recoverable from the extraction.]
4. Sequencer Instructions
All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.
`
5. Constant Stores

5.1 Memory organizations
`A likely size for the ALU constant store is 1024x128 bits. The read BW from the ALU constant store is 128 bits/clock
`and the write bandwidth is 32 bits/clock (directed by the CP bus size not by memory ports).
`
The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants or 512 bits. It takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).
`
The texture state is also kept in a similar memory. The size of this memory is 128x192 bits, so it holds 128 texture states (192 bits per state). The logical size exposes 32 different states in total, which are shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores the top 320 bits; it thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
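The write timings quoted in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant names and the `write_clocks` helper are hypothetical, and it assumes the texture state is written over the same 32 bits/clock path as the ALU constants (which is consistent with the 6-clock figure above).

```python
CP_WRITE_BITS_PER_CLOCK = 32   # write bandwidth quoted for the constant store
CONSTANT_BITS = 128            # width of one ALU constant
CONSTANTS_PER_LINE = 4         # write granularity and constants per re-mapping line
REMAP_LINES = 128              # lines in the ALU constant re-mapping table
TEXTURE_STATE_BITS = 192       # width of one texture state line
DRIVER_TEXTURE_BITS = 512      # bits sent by the driver per texture state


def write_clocks(payload_bits, bits_per_clock=CP_WRITE_BITS_PER_CLOCK):
    """Clocks needed to push payload_bits through the write path (rounded up)."""
    return -(-payload_bits // bits_per_clock)


alu_line_bits = CONSTANTS_PER_LINE * CONSTANT_BITS          # 512 bits per write
alu_clocks = write_clocks(alu_line_bits)                    # clocks per 4 constants
texture_clocks = write_clocks(TEXTURE_STATE_BITS)           # clocks per texture state
ignored_bits = DRIVER_TEXTURE_BITS - TEXTURE_STATE_BITS     # bits the CP drops
logical_constants = REMAP_LINES * CONSTANTS_PER_LINE        # pixel/vertex pair total
```

With these inputs the helper reproduces the 16-clock and 6-clock figures, the 320 ignored bits, and the 512-constant logical size of the pixel/vertex pair.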
`
The control flow constant memory doesn't sit behind a renaming table. It is register mapped, and thus the driver must hold 8 copies of the 32 dwords of control flow constants, and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode and physically register mapped for RT operation.
`
5.2 Management of the Control Flow Constants
The control flow constants are register mapped: the CP writes to the corresponding register to set a constant, and the SQ decodes the address and writes to the block pointed to by its current base pointer (CF_WR_BASE). On the read side, one level of indirection is used. A register (SQ_CONTEXT_MISC.CF_RD_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the new state uses the previous CF constants.
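The base pointer behaviour described above can be sketched as a small software model. This is an illustration under stated assumptions, not the hardware design: the class and method names are hypothetical, the number of states is taken to be 8 (matching the 8 copies of 32 dwords the driver holds), and the model does not copy constants between blocks, so a newly selected block only contains what the CP writes to it.

```python
class ControlFlowConstants:
    """Toy model of the control flow constant blocks and their base pointers."""

    def __init__(self, num_states=8, dwords_per_block=32):
        self.num_states = num_states
        self.blocks = [[0] * dwords_per_block for _ in range(num_states)]
        self.wr_base = 0  # stands in for the write base pointer
        self.rd_base = 0  # stands in for the read base pointer of the current state
        # Block 0 already belongs to the initial state, so writes go straight to it.
        self.written_since_change = True

    def state_change(self):
        """The read base carries over to the new state; no block is switched yet."""
        self.written_since_change = False
        return self.rd_base

    def cp_write(self, offset, value):
        """CP write to a control flow constant at the given dword offset."""
        if not self.written_since_change:
            # First CF write after a state change: move on to the next block.
            self.wr_base = (self.wr_base + 1) % self.num_states
            self.rd_base = self.wr_base
            self.written_since_change = True
        self.blocks[self.wr_base][offset] = value

    def read(self, offset):
        """Read through the one level of indirection on the read side."""
        return self.blocks[self.rd_base][offset]
```

In this model, a state change that is not followed by any CF write keeps reading the previous block, and the base pointer wraps back to block 0 after eight state changes that each include a CF write.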
`
5.3 Management of the re-mapping tables

5.3.1 R400 Constant management
The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental: the driver only needs to update the constants that actually changed between the two state changes.
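A minimal sketch of how a broadside copy plus incremental update could behave for the ALU constant store is shown below. All names are hypothetical, and the allocation policy (a simple free list that is never refilled) is an assumption made for illustration only; the text above does not say how physical lines are allocated or reclaimed. The 256 physical lines follow from 1024 constants at 4 constants per line.

```python
LINES_PER_TABLE = 128   # re-mapping table lines (4 constants each)
NUM_TABLES = 8          # tables usable concurrently
PHYSICAL_LINES = 256    # 1024 constants / 4 constants per line


class ConstantRemapper:
    """Toy model of the ALU constant re-mapping tables."""

    def __init__(self):
        # Every table starts as an identity mapping onto the first 128 lines.
        self.tables = [list(range(LINES_PER_TABLE)) for _ in range(NUM_TABLES)]
        self.current = 0
        # Hypothetical free list: physical lines not used by the initial mapping.
        self.free = list(range(LINES_PER_TABLE, PHYSICAL_LINES))
        self.memory = [None] * PHYSICAL_LINES

    def state_change(self):
        """Broadside copy the current table into the next table slot."""
        nxt = (self.current + 1) % NUM_TABLES
        self.tables[nxt] = list(self.tables[self.current])
        self.current = nxt

    def write_line(self, logical_line, constants):
        """Incremental update: only the written line is redirected."""
        physical = self.free.pop(0)  # raises IndexError once the free list is empty
        self.tables[self.current][logical_line] = physical
        self.memory[physical] = tuple(constants)

    def read_line(self, table, logical_line):
        """Look up a line of 4 constants through the given re-mapping table."""
        return self.memory[self.tables[table][logical_line]]
```

The point of the sketch is that an older state keeps seeing its own constants through its own table, while lines the driver did not touch map to the same physical line in both tables.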
`
`For this model to work in its simplest for
