`
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`
`LG ELECTRONICS, INC.,
`Petitioner
`
`v.
`
`ATI TECHNOLOGIES ULC,
`Patent Owner
`
`
`
`Case IPR2015-00325
`Patent 7,742,053
`
`
`DECLARATION OF INVENTOR LAURENT LEFEBVRE
`REGARDING THE INVENTION DATE OF U.S. PATENT NO. 7,742,053
`
`Mail Stop “Patent Board”
`Patent Trial and Appeal Board
`U.S. Patent and Trademark Office
`P.O. Box 1450
`Alexandria, VA 22313-1450
`
`ATI 2006
`
`LG v. ATI
`
`IPR2015-00325
`
`AMD1044_0011556
`
`ATI Ex. 2004
`IPR2023-00922
`Page 1 of 61
`
`
`
`
`Table of Contents
`
`I. BACKGROUND
`
`II. CONCEPTION
`
`   A. R400 Architecture Proposal
`
`   B. R400 Top Level Specification
`
`   C. R400 Shader Processor
`
`   D. R400 Sequencer Specification
`
`      1. R400 Sequencer Specification (Version 0.4): August 24, 2001
`
`      2. R400 Sequencer Specification (Version 2.0): April 19, 2002
`
`III. DILIGENCE
`
`   A. I Periodically Updated the R400 Sequencer Specification
`
`   B. My Colleagues and I Continuously Developed and Debugged Emulation Code and RTL Code for the R400
`
`IV. TESTING SHOWED THAT THE RTL IMPLEMENTATION WORKED FOR ITS INTENDED PURPOSE
`
`V. DILIGENCE CALENDAR
`
`VI. EXHIBITS
`
`
`
`
`I, Laurent Lefebvre, declare as follows:
`
`I. BACKGROUND
`
`1.
`
`I am a computer-graphics hardware architect at AMD Inc. I have been
`
`designing computer-graphics processors for the past fifteen years. I specialize in
`
`sequencers, shaders, 3D-computer graphics, and integrated-circuit design.
`
`2.
`
`From September 2000 to November 2006, I worked as an engineer
`
`and hardware architect for ATI Technologies Inc. (“ATI”). It is my understanding
`
`that ATI hired me to develop technologies for the R400, which is a graphics
`
`processor.
`
`3.
`
`Unlike conventional graphics processors at the time, the R400 used a
`
`unified shader for both pixel commands and vertex commands, two types of
`
`commands required to produce an image. Conventional graphics processors had
`
`separate shaders for pixel commands and vertex commands. But a unified shader,
`
`like the R400’s unified shader, enhances functionality and efficiency by allowing
`
`the same shader complex to be used for both pixel commands and vertex
`
`commands.
`
`4.
`
`The R400 includes many different functional blocks (e.g., the
`
`sequencer, shader pipe, primitive assembly, texture cache, texture pipe, raster
`
`engine, display, etc.). See, e.g., Ex. 2053, p. 6. The PowerPoint slide titled Block
`
`
`
`
`Responsibility (reproduced below) shows the ATI office responsible for designing
`
`each block.
`
`
`
`
`[Figure: PowerPoint slide titled “Block Responsibility,” marked CONFIDENTIAL, showing the ATI office responsible for designing each R400 block.]
`
`
`Id.
`
`5.
`
`For the R400 project, I was responsible for the sequencer block,
`
`which is the block that manages the execution of pixel commands and vertex
`
`commands for the unified shader. In particular, I drafted the high-level
`
`specification that describes the sequencer block’s functionality, and I wrote
`
`emulator code for the sequencer block. In addition, I was also co-responsible for
`
`emulating the shader pipe block and the export block.
`
`
`
`
`6.
`
`I am one of the named inventors of U.S. Patent No. 7,742,053 (“the
`
`’053 patent”). The other named inventors are Steve Morein and Andy Gruber. We
`
`collectively conceived of the graphics-processing system claimed in the ’053
`
`patent no later than August 24, 2001, while working on the R400. See infra Part II.
`
`A team of my colleagues and I, which totaled about one hundred engineers,
`
`worked on the R400 nearly every business day from at least August 24, 2001 to
`
`September 29, 2003. See infra Parts II, V. No later than the third quarter of 2002,
`
`we made a GPU in register-transfer-language (“RTL”) code that worked to process
`
`a first triangle. See infra Part IV.
`
`II. CONCEPTION
`
`7.
`
`No later than August 24, 2001, Steve Morein, Andy Gruber, and I
`
`collectively conceived of the graphics-processing system in the ’053 patent. We
`
`each contributed different aspects to this system. Steve Morein came up with the
`
`idea for a unified shader. This is shown, for example, in documents titled “R400
`
`Architecture Proposal” and “R400 Top Level Specification.” Ex. 2040, p. 1; Ex.
`
`2041, p. 1. Andy Gruber was the lead for the shader processor. This is shown, for
`
`example, in a document titled “Shader Processor.” Ex. 2042, p. 1. And I was the
`
`lead for the sequencer block. This is shown, for example, in a document titled
`
`“R400 Sequencer Specification.” E.g., Ex. 2007, p. 1. I explain each of these
`
`documents in turn below.
`
`
`A. R400 Architecture Proposal
`
`8.
`
`Steve Morein authored the “R400 Architecture Proposal.” Ex. 2040,
`
`p. 1. In this proposal, the R400 includes a unified shader that performs both pixel
`
`operations and vertex operations. See id. at 9. The R400 also includes a unified
`
`processing pipe (i.e., a single programmable pipeline for 2D video, 3D vertex, and
`
`3D pixel operations). See id. at 6 (“The most ambitious feature in this design is the
`
`‘truly unified pipe’: a single programmable pipeline.”). With a single pipeline for
`
`both pixel commands and vertex commands, the graphics processor had higher
`
`color precision and the ability to support more registers, compared with a
`
`conventional graphics pipeline. See id.
`
`B. R400 Top Level Specification
`
`9.
`
`Steve Morein also authored the “R400 Top Level Specification,”
`
`which sets forth the high-level architecture for the R400. Ex. 2041, p. 1. As shown
`
`in this document, the R400 Top Level Specification includes a sequencer. See, e.g.,
`
`id. at 27-28, 30. The sequencer manages the instructions for the unified shader. See
`
`id. at 11 (“Before starting the processing . . . the rasterizer (which includes the
`
`sequencer for the shader pipeline) checks to make sure that there are enough free
`
`registers in the pipeline for the pixel shader program.”), 27 (“The raster engine . . .
`
`contains the sequencer for the shader pipe.”).
`
`
`
`
`10.
`
`The R400 Top Level Specification includes a block diagram of the
`
`sequencer’s control flow (reproduced below for reference).
`
`[Figure: block diagram of the sequencer’s control flow. A “vertex/pixel vector arbitrator” at the top, with a possible delay for available GPRs, feeds a chain of FIFOs and reservation stations for texture clauses 0 through 7 and ALU clauses 0 through 7; a “texture arbitrator” label appears on each side of the diagram.]
`
`Id. at 30.
`
`
`
`
`11.
`
`This block diagram includes three arbitrators¹: (1) a “vertex/pixel
`
`vector arbitrator” at the top of the diagram; (2) a “texture arbitrator” on the left side
`
`of the diagram; and (3) a “texture arbitrator” on the right side of the diagram. See
`
`id. The “texture arbitrator” on the left side is mislabeled. This arbitrator should be
`
`labeled “ALU arbitrator” to correspond to the ALU reservation stations. I describe
`
`the control flow for this block diagram in Part II.D.1 of this declaration.
`
`C. R400 Shader Processor
`
`12. Andy Gruber authored the “Shader Processor,” which describes the
`
`shader architecture, interfaces, partitioning, and timing of the shader. Ex. 2042, p.
`
`1. The shader processor, also called a pipeline, has sixteen pipes. See id. at 5
`
`(“There will be four sets of four shader pipes.”). This pipeline processes a
`
`sequence of instructions in both texture clauses and ALU clauses. Id. at 8.
`
`D. R400 Sequencer Specification
`
`13.
`
`I authored the “R400 Sequencer Specification,” which is the
`
`architectural specification for the R400 sequencer block. Ex. 2007, p. 1. There are
`
`
`
`¹ The term “arbitrator” is interchangeable with the term “arbiter.” See, e.g.,
`
`Ex. 2023, p. 10 (identifying the control flow diagram as: “Figure 2: Reservation
`
`stations and arbiters”).
`
`
`
`
`at least thirty-three revisions of this specification. See Ex. 2039, pp. 4-5; Exs. 2007-
`
`38. Each revision updates the specification.
`
`14.
`
`I developed two versions of the sequencer’s control flow. See Ex.
`
`2010; Ex. 2028. When I filed the patent application that led to the ’053 patent, I
`
`intended this patent application to cover both versions of the sequencer’s control
`
`flow.
`
`15.
`
`The first version, described in Version 0.4 of the R400 Sequencer
`
`Specification, was designed for sixteen vertex clauses and sixteen pixel clauses.
`
`See Ex. 2010, pp. 5, 14-15. ATI presented this version to Microsoft to see whether
`
`the R400 was compatible with the application-programming interface (“API”) that
`
`Microsoft was developing, called DX10. Microsoft rejected this version because
`
`Microsoft’s API required a sequencer that could process an unlimited number of
`
`clauses. To be compatible with this requirement, I developed a second version of
`
`the sequencer control flow. This second version is described in Version 2.0 of the
`
`R400 Sequencer Specification. See Ex. 2028. I explain these versions in turn
`
`below.
`
`1. R400 Sequencer Specification (Version 0.4): August 24, 2001
`
`16. Version 0.4 of the R400 Sequencer Specification is dated August 24,
`
`2001. See Ex. 2010, p. 3; see also Ex. 2043 (for the log entry on August 24, 2001).
`
`
`
`
`This version includes the same sequencer-block diagram as the sequencer-block
`
`diagram in the R400 Top Level Specification (reproduced again below for
`
`convenience). Compare Ex. 2010, p. 5 with Ex. 2041, p. 30. Version 0.4 also
`
`includes an example of the flow of pixels and vertices through the system. See Ex.
`
`2010, pp. 3, 17-19.
`
`[Figure: the same sequencer block diagram, with a “vertex/pixel vector arbitrator,” a possible delay for available GPRs, and FIFOs and reservation stations for texture clauses 0 through 7 and ALU clauses 0 through 7.]
`
`
`Ex. 2010, p. 5.
`
`
`
`
`17.
`
`The sequencer has two sets of reservation stations, one for pixels and
`
`one for vertices. Id. at 4, 5. A representation of the two sets of reservation stations
`
`is shown below. Each set has eight ALU reservation stations and eight texture
`
`reservation stations. Id. Each reservation station stores clauses. Id. These clauses
`
`contain a sequence of instructions. Id. at 4 (“[the sequencer] . . . executes all of the
`
`instructions in a clause”); see also Ex. 2042, p. 8 (“instructions in a clause will be
`
`executed sequentially”).
`
`[Figure: the two sets of reservation stations, labeled “Pixel Reservation Stations” and “Vertex Reservation Stations.”]
`
`18.
`
`Clauses flow down each set of reservation stations. See Ex. 2010, pp.
`
`4, 5. Pixel clauses flow down the set of pixel reservation stations, and vertex
`
`clauses flow down the vertex reservation stations. See id. Reservation stations
`
`touch the arbiter, so the ALU arbiters and the texture arbiters can select clauses
`
`traveling down the reservation stations. See id. at 17 (“the control packet continues
`
`
`
`
`to travel down the path of reservation stations until all clauses have been
`
`executed”).
`
`19.
`
`The arbiter/arbitration logic has two levels of arbitration, collectively
`
`shown in red on the figure below.
`
`[Figure: sequencer block diagram with columns labeled “Vertex Clauses” and “Pixel Clauses” and the arbitration logic highlighted in red, annotated “SEQ arbitrates between the Pixel FIFO and the Vertex FIFO.”]
`
`20.
`
`The first level of arbitration is between ALU clauses and texture
`
`clauses for both the vertex set of reservation stations and the pixel set of
`
`reservation stations. This first level of arbitration is represented by the ALU
`
`arbitrators and texture arbitrators labeled in the figure above. ALU arbitration logic
`
`
`
`
`chooses one of the eight potentially pending ALU clauses stored within the ALU
`
`reservation stations. See id. at 14-15. Texture arbitration logic chooses one of the
`
`eight potentially pending texture clauses stored within the texture reservation
`
`stations. See id. at 14.
`
`21.
`
`For the second level of arbitration, the arbitration logic selects
`
`between the pixel and the vertex. See id. at 17 (2) (“SEQ arbitrates between the
`
`Pixel FIFO and the Vertex FIFO”), 18 (4) (“SEQ arbitrates between Pixel FIFO
`
`
`and Vertex FIFO”). So, not only does the arbiter select which clauses to execute,
`
`the arbiter also selects which order to execute pixels and vertices. See id. at 4 (“a
`
`pixel can pass a vertex and a vertex can pass a pixel”).
`
`22.
`
`The ALU arbitration and the texture arbitration give priority to
`
`reservation stations/clauses closer to the bottom of the pipeline. See id. at 4. After
`
`this arbitration selects winning pixel and vertex clauses, the pixel/vertex arbitration
`
`logic selects between the pixel and the vertex. Id. at 17 (2), 18 (4). Vertices
`
`generally have priority. Id. at 17 (2). When a vertex is not pending or the register
`
`files do not have open space for a vertex, the arbiter selects a pixel. Id. at 18 (4).
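
The two levels of arbitration described in paragraphs 20 through 22 can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the R400 emulation code or RTL code: the names `first_level`, `second_level`, and `vertex_has_register_space` are hypothetical, the clause labels are invented, and the sketch assumes that the highest station index is the bottom of the pipeline.

```python
from typing import List, Optional

NUM_STATIONS = 8  # eight reservation stations per set, per Version 0.4


def first_level(stations: List[Optional[str]]) -> Optional[int]:
    """Return the index of the pending clause closest to the bottom.

    stations[i] holds a clause label or None. The last index is assumed
    to be the bottom of the pipeline (an assumption of this sketch).
    """
    for index in range(len(stations) - 1, -1, -1):
        if stations[index] is not None:
            return index
    return None


def second_level(vertex_winner: Optional[str],
                 pixel_winner: Optional[str],
                 vertex_has_register_space: bool) -> Optional[str]:
    """Choose between the winning vertex clause and the winning pixel clause.

    Vertices generally have priority; a pixel is chosen when no vertex is
    pending or the register files have no open space for a vertex.
    """
    if vertex_winner is not None and vertex_has_register_space:
        return vertex_winner
    return pixel_winner


# Hypothetical contents of the ALU reservation stations for each set.
pixel_alu = [None, "P-ALU1", None, None, None, "P-ALU5", None, None]
vertex_alu = [None, None, "V-ALU2", None, None, None, None, None]

pixel_pick = pixel_alu[first_level(pixel_alu)]
vertex_pick = vertex_alu[first_level(vertex_alu)]
chosen = second_level(vertex_pick, pixel_pick, vertex_has_register_space=True)
```

In this sketch the pixel set offers the clause in station 5 rather than station 1 because station 5 is closer to the bottom, and the vertex clause then wins the second level because register space is available for it.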
`
`23.
`
`Once arbitration logic selects the pixel/vertex clauses, the sequencer’s
`
`arbitration logic provides the clauses to a register file in the shader pipe. See id. at
`
`17 (5) (“SEQ constructs a control packet for the vector and sends it to the first
`
`
`
`
`reservation station (the FIFO in front of texture state machine 0, or TSM0 FIFO)
`
`the control packet contains the state pointer, the tag to the position cache and a
`
`register file base pointer.”), 17 (9) (“ASM0 accepts the control packet (after being
`
`selected by the ASM arbiter) and gets the instructions for ALU clause 0 from the
`
`global instruction store”).
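
The control packet quoted above, and its path down the reservation stations, can be modeled roughly as follows. This is a minimal sketch under stated assumptions: the field names, the `station_path` and `travel` helpers, and the strict alternation of texture state machine n followed by ALU state machine n are all hypothetical simplifications, and the numeric values are invented.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ControlPacket:
    state_pointer: int        # the state pointer
    position_cache_tag: int   # the tag to the position cache
    register_file_base: int   # the register file base pointer


def station_path(num_clauses: int = 8) -> List[str]:
    """Stations a packet visits, assuming texture clause n precedes ALU clause n."""
    path = []
    for n in range(num_clauses):
        path.append(f"TSM{n}")
        path.append(f"ASM{n}")
    return path


def travel(packet: ControlPacket, num_clauses: int = 8) -> List[str]:
    """Pass one packet through a FIFO in front of each station and log the visits."""
    fifos = {name: deque() for name in station_path(num_clauses)}
    log = []
    current = packet
    for name in station_path(num_clauses):
        fifos[name].append(current)       # packet waits in the station's FIFO
        current = fifos[name].popleft()   # station accepts it once selected
        log.append(name)
    return log


packet = ControlPacket(state_pointer=3, position_cache_tag=17, register_file_base=64)
visited = travel(packet)
```

The sketch leaves out arbitration and instruction fetch entirely; it only illustrates that a single packet carrying three small fields is what moves from station to station until all clauses have been executed.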
`
`24.
`
`The shader pipe, as of Version 0.4 of the R400 Sequencer
`
`Specification (reproduced below for reference), has four physical register file
`
`memories per shader pipeline. Id. at 10.
`
`
`
`
`
`[Figure: shader pipe with four physical register file memories per shader pipeline, with output to the Primitive Assembly Unit or Render Backend.]
`
`Id.
`
`25.
`
`Each register file is coupled to a bank of ALUs. See id. at 11. The
`
`gray area of the Figure reproduced below shows the logical view of the four
`
`register files within the shader pipe as software would see it. Id. The Figure also
`
`
`
`
`shows an ALU bank, a texture unit, an instruction store/cache, and a constant store.
`
`See id.
`
`[Figure: logical view of the register file within the shader pipe, together with an ALU bank, a texture unit, an instruction store/cache, and a constant store.]
`
`Id.
`
`26.
`
`In the Figure reproduced above, the sequencer block comprises the
`
`instruction store and the constant store. In a different representation, reproduced
`
`below, the instruction store and the constant store are within the sequencer block.
`
`
`
`
`
`
`
`
`[Figure: alternative representation in which the instruction store and the constant store are within the sequencer block.]
`
`Id. at 12.
`
`27.
`
`Later versions of the R400 Sequencer Specification show the
`
`sequencer and the shader pipe within the R400 architecture. See, e.g., Ex. 2012, pp.
`
`3, 5. The architecture in Version 0.6 of the R400 Sequencer Specification is
`
`reproduced below.
`
`
`
`
`
`
`
`
`[Figure: R400 architecture from Version 0.6 of the R400 Sequencer Specification, showing the sequencer and the shader pipes; legible block labels include ALU INST, TEX INST, TSTATE, CSTORE, SP, PC/OB, and RB.]
`
`Id. at 3.
`
`28.
`
`Following Version 0.6, the R400 architecture was kept in the R400
`
`Sequencer Specification. See, e.g., Ex. 2039, p. 7. A version of the R400
`
`Sequencer Specification dated May 1, 2003 includes the R400 architecture
`
`reproduced below. The general role of the sequencer within this architecture did
`
`not change. And most inputs/outputs to/from the sequencer stayed consistent.
`
`
`
`
`
`
`
`
`[Figure: R400 architecture from the May 1, 2003 version of the R400 Sequencer Specification; legible block labels include INST STORE, ALU BANK, CONSTANTS, CSTORE, FETCH STATE, PC/OB, and RB.]
`
`
`Id.
`
`2. R400 Sequencer Specification (Version 2.0): April 19, 2002
`
`29.
`
`The sequencer’s control flow changed in Version 2.0 of the R400
`
`Sequencer Specification. See Ex. 2028, pp. 5, 6, 10. Version 2.0 is dated April 19,
`
`2002. See Ex. 2028, p. 5; see also Ex. 2044 (for the log entry on April 19, 2002).
`
`30. Again, the reason for the change was to meet the requirements of
`
`Microsoft’s API, called DX10. In particular, Microsoft wanted the sequencer to be
`
`able to run shaders with an unlimited number of clauses/instructions. The first
`
`
`
`
`version selected from sixteen vertex clauses and sixteen pixel clauses. See Ex.
`
`2010, pp. 5, 14-15. To meet Microsoft’s specifications, I changed the sequencer’s
`
`control flow. The new control flow is shown below.
`
`
`
`
`
`[Figure: new sequencer control flow. An Input Arbiter feeds two reservation stations (VTX RS and PIX RS), and an Exec Arbiter dispatches to the ALU and Texture engines.]
`
`
`Ex. 2028, p. 10.
`
`31.
`
`In this version, there are two reservation stations, one reservation
`
`station for vertices (VTX RS) and one reservation station for pixels (PIX RS). See
`
`id. The texture threads and the ALU threads are not separated. See id. at 23
`
`(“[W]ithout separate ‘texture clauses’ and ‘ALU clauses’ we need to know which
`
`instructions to dispatch to the Texture Unit and which to the ALU unit.”); see also
`
`id. at 6 (using the term thread). Each reservation station stores threads at specified
`
`
`
`
`locations with status bits indicating which engine is needed for execution. See id. at
`
`25 (“A thread lives in a given location in the buffer during its entire life.”), 26
`
`(“‘Status Bits’ needed include: . . . [a] Texture/ALU engine [identifier]”).
`
`32. An arbiter, labeled as “Exec Arbiter,” selects threads for an ALU
`
`engine and a texture engine. See id. at 25. The arbiter selects a thread based on
`
`FIFO. See id. at 6 (“The arbitrator will give priority to older threads.”), 25
`
`(“[T]he buffer has FIFO qualities in that the threads leave in the order that they
`
`enter.”). The thread is then read out of a reservation station. See id. at 26. Once the
`
`texture engine or the ALU engine executes the thread, the respective engine returns
`
`the thread to the location from which the thread originated. See id. (“[The thread]
`
`is returned to the buffer (at the same place) with its status updated once all possible
`
`sequential instructions have been executed.”).
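
The Version 2.0 behavior described in paragraphs 31 and 32 (threads living at fixed locations, status bits naming the engine needed, and an arbiter that favors older threads) can be sketched as below. This is a simplified illustration, not the R400 code: the class and method names, the `age` counter, and the `busy` flag are hypothetical, and real status bits would carry more information than a single engine identifier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thread:
    name: str
    age: int            # order of entry; smaller means older (hypothetical field)
    engine: str         # status bit: "ALU" or "TEXTURE"
    busy: bool = False  # True while an engine is executing the thread


class ReservationStation:
    def __init__(self, size: int) -> None:
        self.slots: List[Optional[Thread]] = [None] * size

    def insert(self, thread: Thread) -> int:
        """Place a thread in the first free location and return that location."""
        slot = self.slots.index(None)  # raises ValueError if the buffer is full
        self.slots[slot] = thread
        return slot

    def select(self, engine: str) -> Optional[int]:
        """Return the location of the oldest idle thread that needs this engine."""
        candidates = [
            (t.age, i) for i, t in enumerate(self.slots)
            if t is not None and not t.busy and t.engine == engine
        ]
        return min(candidates)[1] if candidates else None

    def read_out(self, slot: int) -> Thread:
        """Hand the thread to an engine; its location stays reserved."""
        thread = self.slots[slot]
        thread.busy = True
        return thread

    def give_back(self, slot: int, next_engine: str) -> None:
        """Return the thread to the same location with updated status."""
        thread = self.slots[slot]
        thread.engine = next_engine
        thread.busy = False


rs = ReservationStation(4)
rs.insert(Thread("t0", age=0, engine="ALU"))
rs.insert(Thread("t1", age=1, engine="TEXTURE"))
rs.insert(Thread("t2", age=2, engine="ALU"))

picked = rs.select("ALU")        # oldest ALU thread
rs.read_out(picked)
rs.give_back(picked, "TEXTURE")  # same slot, status now says TEXTURE
```

Keeping each thread at one location means the arbiter only ever inspects status bits; nothing has to be shifted when a thread alternates between the ALU engine and the texture engine.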
`
`33.
`
`This version is present in later revisions of the Sequencer
`
`Specification. See, e.g., Ex. 2039, p. 9.
`
`III. DILIGENCE
`
`34. After conceiving of the design for the R400, my colleagues and I
`
`worked to implement it. This work is shown by at least two things: (1) I
`
`periodically updated the R400 Sequencer Specification; and (2) my colleagues and
`
`I continuously developed, tested, and debugged emulation code and RTL code for
`
`
`
`
`the R400, including the other components that supported and interacted with the
`
`sequencer.
`
`A. I Periodically Updated the R400 Sequencer Specification
`
`35. During development, the architectural leads for each block wrote
`
`specifications to describe the structure and function of the blocks. See Ex. 2042, p.
`
`1 (showing that specifications were used by designers). I was the architectural lead
`
`for the sequencer block, so I developed the R400 Sequencer Specification.
`
`36.
`
`I updated the R400 Sequencer Specification approximately every two
`
`to three weeks. See Ex. 2039, pp. 4-5 (outlining edits to the document). There are
`
`at least thirty-three revisions of this document. See id.; see also Exs. 2007-38. The
`
`revisions span from at least May 25, 2001 to August 29, 2003. See Ex. 2043, p. 2
`
`(first log entry May 25, 2001); Ex. 2044, p. 1 (last log entry August 29, 2003); Ex.
`
`2039, pp. 4-5 (showing 33 versions from May 7, 2001 to May 1, 2003).
`
`37. Having multiple revisions during development is typical. Each
`
`revision showed progress during the previous week(s). During interim periods, the
`
`project team worked on outstanding issues. See, e.g., Ex. 2018, p. 35 (ending the
`
`document with a section labeled “Open Issues”).
`
`
`
`
`B. My Colleagues and I Continuously Developed and Debugged Emulation
`Code and RTL Code for the R400
`
`38.
`
`In the beginning of development, only a handful of engineers were
`
`assigned to the R400 project. But by late 2001 or early 2002, ATI assigned over
`
`one hundred project managers/designers to implement and test the R400. During
`
`these two years, many project managers/designers transitioned from other projects
`
`and were assigned to solely work on the R400 project. These project
`
`managers/designers, including me, diligently worked on the R400. In particular,
`
`we used the specifications to write emulation code and RTL code for the R400’s
`
`functional blocks. We then tested the R400’s functional blocks.
`
`39.
`
`Everyone assigned to the R400 project saved their work in a revision-
`
`control system, called Perforce. I understand that Perforce maintains metadata (i.e.,
`
`document logs and folder histories). This metadata includes information such as
`
`the date each file was revised and the user that made each revision.
`
`40.
`
`I understand that this metadata identifies the following users as the
`
`users that worked on the R400 design. These users worked from three of ATI’s
`
`offices: (1) Marlboro, Massachusetts, USA; (2) Orlando, Florida, USA; and (3)
`
`Toronto, Ontario, Canada. See Ex. 2053, p. 6. I recall that periodic meetings
`
`occurred to coordinate efforts between the three offices.
`
`
`
`
`
`
`Address Prefix | Exhibits
`aashkar | 2050, 2051, 2052
`abeaudin | 2050, 2052
`alleng | 2050, 2051
`amys | 2049, 2052
`ashishs | 2049, 2050, 2052
`askende | 2048, 2049, 2050, 2051, 2052, 2107
`bbloemer | 2050, 2051
`bbuchner | 2050, 2051, 2052
`beiwang | 2050, 2051
`bhankins | 2049, 2050, 2052
`brianf | 2050, 2052
`bryans | 2050
`cbrennan | 2050, 2052
`ccoveney | 2050
`chammer | 2048, 2052
`chwang | 2050
`ctaylor | 2048, 2050, 2051, 2052
`csampayo | 2050, 2052
`danh | 2049, 2052
`delifton | 2048, 2049, 2050,
`desiree | 2049,
`dglen | 2050, 2051
`donaldl | 2049, 2050, 2052
`dougd | 2049, 2050, 2052
`dwong | 2050, 2051
`efong | 2049, 2050, 2052
`enewman | 2050
`fhsien | 2049, 2050, 2052
`fghodrat | 2050
`fliljero | 2050, 2052
`frising | 2048, 2050
`frivas | 2050, 2052
`gabarca | 2050
`georgev | 2050, 2052
`grayc | 2048, 2049, 2050
`gregs | 2050, 2052
`hartogs | 2048, 2049, 2050, 2051, 2052
`hdong | 2050
`hwise | 2048, 2052
`imuskatb | 2050
`
`
`
`
`
`
`
`
`
`
`
`
`
`Address Prefix | Exhibits
`jacarey | 2050
`jasif | 2050
`jasony | 2051
`jayw | 2050, 2051, 2052
`jbrady | 2050
`jcox | 2050
`jennho | 2050
`jhoule | 2048, 2050, 2051, 2052
`jiezhou | 2050
`jimmylau | 2050
`jling | 2050
`jmarsano | 2050
`jowang | 2050, 2051
`jyarasca | 2050, 2052
`kcorrell | 2050, 2051
`keli | 2050
`kevino | 2050, 2052
`khabbari | 2050
`kmahler | 2048, 2050, 2052
`kmeekins | 2049, 2052
`koyu | 2050
`kryan | 2050, 2052
`Hchen | 2050
`Tkang | 2050, 2051,
`llefebvr | 2048, 2049, 2050, 2051, 2052
`lseiler | 2048, 2050, 2051, 2052
`markf | 2049, 2050, 2052
`marklee | 2050
`mdoggett | 2050, 2051, 2052
`mdesai | 2050, 2052
`mearl | 2048, 2049, 2050, 2052
`mkelly | 2048, 2050, 2052
`moev | 2049, 2050
`mpersaud | 2050
`mmang | 2048, 2049, 2050, 2052
`mmantor | 2048, 2049, 2050, 2052
`mzhu | 2050
`maini | 2048, 2049, 2052
`
`
`
`
`
`
`
`
`Address Prefix | Exhibits
`nbarbier | 2050
`nkociuk | 2050
`oniuu | 2050
`omesh | 2050, 2052
`paulv | 2050, 2051, 2052
`pmitchel | 2049, 2050, 2051, 2052, 2107
`peterp | 2050, 2052
`prunstad | 2050
`rbagley | 2050, 2051
`rbeaudin | 2048, 2050, 2051, 2052
`rbell | 2049, 2050
`rfevreau | 2050
`rfisette | 2050
`therrick | 2050
`rramsey | 2048, 2049, 2050, 2052
`rthambim | 2050
`tvelez | 2050
`sallen | 2048, 2049, 2050, 2051, 2052
`sbagshaw | 2050
`scamlin | 2049, 2050
`scroce | 2050
`semara | 2050
`smburu | 2050, 2052
`smorein | 2050, 2051
`smoss | 2049, 2050, 2052
`snezana | 2050
`tho | 2050
`tien | 2049, 2050, 2052
`tmartin | 2050, 2052
`tshah | 2050
`vbhatia | 2050
`vgoel | 2050, 2052
`viviana | 2049, 2052
`viiu | 2048, 2050
`vromaker | 2049, 2050, 2052
`whul | 2050
`wlawless | 2050, 2051
`yeiang | 2048, 2049, 2050, 2052
`yvalcour | 2050
`
`
`
`
`41.
`
`In Part V, I analyze this metadata. My analysis shows that at least one
`
`person on the R400 project team worked on the R400 design every non-holiday
`
`business day from August 24, 2001 (when we conceived of the invention) until
`
`September 29, 2003 (the effective filing date of the ’053 patent). See infra Part V.
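
A check of the kind described in paragraph 41 can be sketched as follows. This is an illustrative sketch only, not the analysis actually performed in Part V: the function names are hypothetical, the revision dates in the example are invented, and the holiday set is left empty because the holiday calendar that was applied is not stated here.

```python
from datetime import date, timedelta
from typing import FrozenSet, Iterable, Iterator, List


def business_days(start: date, end: date,
                  holidays: FrozenSet[date] = frozenset()) -> Iterator[date]:
    """Yield every Monday-to-Friday date in [start, end] that is not a holiday."""
    day = start
    while day <= end:
        if day.weekday() < 5 and day not in holidays:
            yield day
        day += timedelta(days=1)


def uncovered_days(revision_dates: Iterable[date], start: date, end: date,
                   holidays: FrozenSet[date] = frozenset()) -> List[date]:
    """Return the business days on which no revision was logged."""
    worked = set(revision_dates)
    return [d for d in business_days(start, end, holidays) if d not in worked]


# Invented example: revisions logged on a Friday and the following Tuesday.
revisions = [date(2001, 8, 24), date(2001, 8, 28)]
gaps = uncovered_days(revisions, date(2001, 8, 24), date(2001, 8, 28))
```

With real revision-control metadata, `revision_dates` would be the set of dates on which any listed user revised a file, and an empty result over the full date range would correspond to the coverage described in paragraph 41.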
`
`42.
`
`This metadata is not exhaustive of all sequencer/shader-pipe files that
`
`were edited during this timeframe. But this metadata shows work that was
`
`necessary for implementing the R400 design. Specifically, this metadata shows the
`
`design, development, and testing of the R400 sequencer and graphics blocks. See
`
`Exs. 2048, 2049. This metadata also shows work on the design and development of
`
`the R400 generally. See Exs. 2050, 2051, 2052, 2107. The design and development
`
`of the R400 was necessary to make progress on the sequencer block and the shader
`
`pipe block; we could not work on or test the sequencer block or the shader pipe
`
`block in isolation.
`
`43.
`
`This analysis is also consistent with my memory of the work that we
`
`did on the R400. For me and many of the other project managers/designers, the
`
`R400 was the only project that most of us were assigned to; it was our full-time
`
`responsibility. That means, any time that we did work for ATI between late 2001
`
`and the end of 2003, that work would have been on the R400.
`
`
`
`
`IV. TESTING SHOWED THAT THE RTL IMPLEMENTATION
`WORKED FOR ITS INTENDED PURPOSE
`
`44. We ran many tests on the R400 during its development. One test in
`
`particular, the first triangle, showed that a snapshot of the emulation code and the
`
`RTL code worked for its intended purpose, i.e., performing conventional graphics
`
`processing using a unified shader.
`
`45.
`
`For the R400 project, block-level specifications, block-level diagrams,
`
`and interface descriptions provided overarching development concepts. Project
`
`managers/designers on the R400 team used these documents to develop C++ code
`
`(emulation code) and RTL code for the various blocks of the R400. This code was
`
`tested extensively during the development process. Tests could be run on both the
`
`emulation code and the RTL code, and these tests could be run on individual
`
`blocks or the entire graphics core.
`
`46.
`
`Circuit designers extensively use circuit simulation to test the GPU
`
`design. After the design has passed extensive testing, the chip design can be
`
`cleared for fabrication. Before sending out the design for fabrication, the RTL code
`
`is converted to a tape-out file, and that tape-out file is sent to a fabrication facility
`
`for fabrication.
`
`47. A product such as the R400 must pass hundreds of tests before it is
`
`taped out. Many tests were directed to specific commercial specifications. Other
`
`
`
`
`tests were more generic, such as tests for validation and proof of concept. One
`
`generic test was the first triangle.
`
`48.
`
`The first triangle tested whether the RTL implementation of the GPU
`
`could process a Gouraud shaded triangle. At this stage of development, this test
`
`involved many different blocks—including the sequencer, the s