`ORIGINATE DATE
`7 May, 2001
`
`EDIT DATE
`4 September, 2015
`
`DOCUMENT-REV. NUM.
`GEN-CXXXXX-REVA
`
`PAGE
`1 of 9
`
`Author:
`
`Laurent Lefebvre
`
`R400 Sequencer Specification
`
`SEQ
`
`Version 0.1
`
Overview: This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the required capabilities and expected uses of the block. It also describes the block interfaces and internal sub-blocks, and provides internal state diagrams.
`
AUTOMATICALLY UPDATED FIELDS:
Document Location:
D:\Perforce\r400\arch\doc\gfx\MC\R400 MemCti.doc
Current Intranet Search Title:
R400 Memory Controller Architectural Specification

APPROVALS

THIS DOCUMENT CONTAINS INFORMATION THAT COULD BE SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.

“Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any form or by any means without the prior written permission of ATI Technologies Inc.”
`
`Exhibit 2007.doc
`
9252 Bytes*** © ATI Confidential. Reference Copyright Notice on Cover Page © ***
`IPR2015-00325
`
`AMD1044_0256664
`
`ATI Ex. 2104
`IPR2023-00922
`Page 1 of 9
`
Table Of Contents

1. OVERVIEW
1.1 Top Level Block Diagram
2. TEXTURE ARBITRATION
3. ALU ARBITRATION
4. INPUT INTERFACE
4.1 Rasterizer to Register File (interpolated data)
4.2 Texture Unit to Register File (texture return)
4.3 ALU Unit to Register File (ALU op result)
4.4 Scalar Unit to Register File (Scalar op result)
5. OUTPUT INTERFACE
5.1 Sequencer to Shader Engine Bus
5.2 Shader Engine to Texture Unit Bus
6. OPEN ISSUES
`
`Revision Changes:
`
`Rev 0.1 (Laurent Lefebvre)
`Date: May 7, 2001
`
`First draft.
`
`1. Overview
`
The sequencer first arbitrates between vectors of 16 vertices that arrive directly from primitive assembly and vectors of 8 quads (32 pixels) that are generated in the raster engine.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available.
`
The sequencer is based on the R300 design. It chooses an ALU clause and a texture clause to execute, and executes all of the instructions in a clause before looking for a new clause of the same type. Each vector will have eight texture and eight ALU clauses, but clauses do not need to contain instructions. A vector of pixels or vertices ping-pongs along the sequencer FIFO, bouncing from texture reservation station to ALU reservation station. A FIFO exists between each reservation stage, holding up vectors until the vector currently occupying a reservation station has left. A vector at a reservation station can be chosen to execute. The sequencer looks at all eight ALU reservation stations to choose an ALU clause to execute and all eight texture stations to choose a texture clause to execute. The arbitrator will give priority to clauses/reservation stations closer to the top of the pipeline. It will not execute an ALU clause until the texture fetches initiated by the previous texture clause have completed.

To support the shader pipe, the raster engine also contains the shader instruction cache and constant store.
`
1.1 Top Level Block Diagram
[Figure: top level block diagram. Recoverable labels: vertex/pixel vector arbitrator; possible delay for available GPRs; texture clause 0 through texture clause 7 reservation stations; ALU clause reservation stations; texture arbitrator.]

`
The rasterizer always checks the vertex FIFO first and, if allowed by the sequencer, sends the data to the shader. If the vertex FIFO is empty, the rasterizer takes the first entry of the pixel FIFO (a vector of 32 pixels) and sends it to the interpolators. Then the sequencer takes control of the packet.

On receipt of a packet, the input state machine (not pictured, but located just before the first FIFO) allocates enough space in the registers to store the interpolated values and temporaries. Following this, the input state machine stacks the packet in the first FIFO.
`
On receipt of a command, the level 0 texture machine issues a texture request and the corresponding register address for the texture address (ta). A small command (tcmd) is passed to the texture system identifying the current level number (0) as well as the register set being used. One texture request is sent every 4 clocks, causing the texturing of four 2x2s worth of data.
`
Upon receipt of the return data (identified by the tcmd containing the level number 0), the level 0 texture machine issues a register address for the return value (td). Then, it puts the finished packet in FIFO 1.
`
On receipt of a command, the level 0 ALU machine issues a complete set of level 0 shader instructions. For each instruction, the state machine generates 3 source addresses, one destination address (2 cycles later) and an instruction id which is used to index into the instruction store. Once the last instruction has been issued, the packet is put into FIFO 2. Note that in the case of a pixel packet, the two vectors of 16 pixels are interleaved in order to hide the latency of the ALUs (8 cycles).
`
All other levels process in the same way until the packet finally reaches the last ALU machine (8). On completion of the level 8 ALU clause, a valid bit is sent to the Render Backend (RB), which picks up the color data. This requires that the last instruction writes to the output register, a condition that is almost always true. If the packet was a vertex packet, instead of sending the valid bit to the RB, it is sent to the PA, which picks up the data and puts it into the vertex store.
`
Only one ALU state machine may have access to the SRAM address bus or the instruction decode bus at one time. Similarly, only one texture state machine may have access to the SRAM address bus at one time. Arbitration is performed by two arbiter blocks (one for the ALU state machines and one for the texture state machines). The arbiters always favor the higher-numbered state machines, preventing a bunch of half-finished jobs from clogging up the SRAMs.
`
Each state machine maintains an address pointer specifying where the 16-entry (or 32-entry) vector is located in the SRAM (the texture machine has two pointers, one for the read address and one for the write address). Upon completion of its job, the address pointer is incremented by a predefined amount equal to the total number of registers required by the shading code. A comparison of the address pointer for the first state machine in the chain (the input state machine) and the last machine in the chain (the level 8 ALU machine) gives an indication of how much unallocated SRAM memory is available. When this number falls below a preset watermark, the input state machine will stall the rasterizer, preventing new data from entering the chain.
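
The free-space check described above can be sketched as follows. This is a minimal illustration, not the actual hardware logic: the function names, the treatment of the register SRAM as a circular buffer, and the example depth and watermark values are all assumptions made for the example.

```python
def free_entries(input_ptr, last_ptr, depth):
    """Unallocated entries between the two address pointers.

    Assumption: the register SRAM is used as a circular buffer of `depth`
    entries. input_ptr (input state machine) marks the next entry to
    allocate; last_ptr (last ALU machine) marks the oldest entry still in
    use. Equal pointers are treated as "empty" in this sketch.
    """
    used = (input_ptr - last_ptr) % depth
    return depth - used


def should_stall(input_ptr, last_ptr, depth, watermark):
    """True when free space has fallen below the preset watermark,
    i.e. when the input state machine would stall the rasterizer."""
    return free_entries(input_ptr, last_ptr, depth) < watermark
```

With a hypothetical depth of 512 and a watermark of 64, pointers at 100 (input) and 20 (last) leave 432 free entries and no stall, while pointers at 10 (input, after wrapping) and 50 (last) leave only 40 free entries and stall the rasterizer.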
`
[Figure: shader pipe datapath. Recoverable labels: data from RE; control from RE; constants from RE; Register File, 512x128 (built as 4 128x128 or 16 128x32); address to texture; operand mux; MAC units; 128 bit scalar/vector ALU; vertex parameter data to RE through texture block; pixel data to RB through texture block.]

`
[Figure: pipeline stage diagram. Recoverable labels: constants from RE; instruction; data from RE; texture fetch return; Register File; MAC; pipeline stage; address to texture; Scalar Unit; scalar operand input / scalar result output.]
`
`2. Texture Arbitration
`
The texture arbitration logic chooses one of the 8 potentially pending texture clauses to be executed. The choice is made by looking at the FIFOs from 8 to 0 and picking the first one ready to execute. Once chosen, the clause state machine will send one 2x2 texture fetch per 4 clocks until all the texture fetch instructions of the clause are sent. This means that there cannot be any dependencies between two texture fetches of the same clause.
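
The selection rule above (scan from the highest-numbered FIFO down and take the first one that is ready) can be sketched as a simple priority pick. This is an illustrative model only: the function names and the representation of readiness as a list of booleans are assumptions, not part of the specification.

```python
FETCH_INTERVAL_CLOCKS = 4  # one 2x2 texture fetch per 4 clocks (from the text)


def pick_clause(ready):
    """Return the index of the highest-numbered ready station, or None.

    `ready[i]` is True when station i has a clause ready to execute
    (hypothetical representation chosen for this sketch).
    """
    for level in range(len(ready) - 1, -1, -1):
        if ready[level]:
            return level
    return None


def fetch_issue_clocks(num_fetches, start_clock=0):
    """Clocks at which the chosen clause sends its texture fetches."""
    return [start_clock + i * FETCH_INTERVAL_CLOCKS for i in range(num_fetches)]
```

For example, if only stations 0 and 2 are ready, `pick_clause` returns 2, and a clause with three fetches starting at clock 0 issues them at clocks 0, 4 and 8. Because the fetches are issued back to back without waiting for return data, no fetch in the clause can depend on another.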
`
`3. ALU Arbitration
`
ALU arbitration proceeds in almost the same way as texture arbitration. The ALU arbitration logic chooses one of the 8 potentially pending ALU clauses to be executed. The choice is made by looking at the FIFOs from 8 to 0 and picking the first one ready to execute. If the packet chosen is a packet of vertices, the state machine issues one instruction every 4 clocks until the clause is finished. This means that the compiler has to insert nops between two dependent successive instructions. If the packet is a pixel packet, it is made out of two sub-vectors of 16. Thus the state machine issues the first instruction for the first sub-vector and then, 4 clocks later, the first instruction of the second sub-vector, and so on until the clause is finished. Proceeding this way hides the latency of 8 clocks of the ALUs.
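
The issue pattern described above can be written out as a small schedule generator. This is a sketch under stated assumptions: the helper names and the tuple layout are hypothetical, and only the 4-clock issue interval and the 8-clock ALU latency come from the text.

```python
ISSUE_INTERVAL_CLOCKS = 4  # one instruction issued every 4 clocks (from the text)
ALU_LATENCY_CLOCKS = 8     # ALU latency quoted in the text


def alu_issue_schedule(num_instructions, is_pixel_packet):
    """Return (clock, instruction, sub_vector) issue slots for one ALU clause.

    A vertex packet has a single vector of 16; a pixel packet alternates
    between its two sub-vectors of 16.
    """
    sub_vectors = 2 if is_pixel_packet else 1
    schedule = []
    clock = 0
    for instruction in range(num_instructions):
        for sub_vector in range(sub_vectors):
            schedule.append((clock, instruction, sub_vector))
            clock += ISSUE_INTERVAL_CLOCKS
    return schedule


def dependent_issue_gap(is_pixel_packet):
    """Clocks between consecutive instructions issued for the same sub-vector."""
    sub_vectors = 2 if is_pixel_packet else 1
    return sub_vectors * ISSUE_INTERVAL_CLOCKS
```

For a pixel packet the gap between successive instructions of the same sub-vector is 8 clocks, which matches the ALU latency, so a dependent instruction finds its operand ready. For a vertex packet the gap is only 4 clocks, which is why the compiler must insert nops between dependent successive instructions.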
`
`4. Input Interface
`
`4.1 Rasterizer to Register File (interpolated data)
`

Name              | Direction | Bits | Description
SND               | SEQ->SP   | 1    | High when sending data
Interpolated data | SEQ->SP   | 512  | 512 bits transferred every 4 cycles
`
4.2 Texture Unit to Register File (texture return)

Name           | Direction | Bits | Description
SND            | SEQ->TU   | 1    | High when sending data
Texture colors | TU->SP    | 512  | 512 bits transferred every 4 cycles
`
`4.3 ALU Unit to Register File (ALU op result)
`
Name             | Direction | Bits | Description
SND              | SEQ->SP   | 1    | High when sending data
Blend result ALU | SP->SP    | 512  | 512 bits transferred every 4 cycles
Write Mask       | SP->SP    | 16   | The four write masks

`
4.4 Scalar Unit to Register File (Scalar op result)

Name          | Direction | Bits | Description
SND           | SEQ->SP   | 1    | High when sending data
Scalar result | SP->SP    | 512  | 512 bits transferred every 4 cycles
Write Mask    | SP->SP    | 16   | The four write masks

`
5. Output Interface
`
`5.1 Sequencer to Shader Engine Bus
`This is a bus that sends the instruction and constant data to all 4 Sub-Engines of the Shader. Because a new
`instruction is needed only every 4 clocks, the width of the bus is divided by 4 and both constants and instruction
`are sent over those 4 clocks.
`

Name              | Direction | Bits | Description
Instruction Start | SEQ->SP   | 1    | High on first cycle of transfer
Constant 0        | SEQ->SP   | 32   | 128 bits transferred over 4 cycles, alpha first...blue last
Constant 4        | SEQ->SP   | 32   | 128 bits transferred over 4 cycles, alpha first...blue last
Instruction       | SEQ->SP   | 40   | 160 bits transferred over 4 cycles

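
The width arithmetic for this bus (a 128-bit constant sent as four 32-bit transfers, a 160-bit instruction as four 40-bit transfers) can be illustrated with a short sketch. The function name is hypothetical, and sending the most significant slice first is an assumption made for the example; the specification only states that alpha is sent first and blue last.

```python
def split_into_beats(value, total_bits, beats=4):
    """Split `value` into `beats` equal-width slices, one per clock.

    Assumption for this sketch: the most significant slice goes first.
    """
    if total_bits % beats != 0:
        raise ValueError("total_bits must be divisible by the number of beats")
    width = total_bits // beats
    mask = (1 << width) - 1
    return [(value >> (width * (beats - 1 - i))) & mask for i in range(beats)]
```

A 128-bit constant therefore occupies 32 wires for 4 clocks, and a 160-bit instruction occupies 40 wires for the same 4 clocks, which is what allows a new instruction and its constants to arrive every 4 clocks on a bus one quarter as wide.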
`5.2 Shader Engine to Texture Unit Bus
One quad's worth of addresses is transferred to the Texture Unit every clock. These are sourced from a different pixel within each of the sub-engines, repeating every 4 clocks. The register file index to read must precede the data by 2 clocks. The Read address associated with Quad 0 must be sent 1 clock after the Instruction Start signal is sent, so that data is read 3 clocks after the Instruction Start.

One quad's worth of Texture Data may be written to the Register File every clock. These are directed to a different pixel of the sub-engines, repeating every 4 clocks. The register file index to write must accompany the data. Data and Index associated with Quad 0 must be sent 3 clocks after the Instruction Start signal is sent.
`
Name                     | Direction | Bits | Description
Tex_Read_Register_Index  | SEQ->SP   | 8    | Index into Register Files for reading Texture Address
Tex_RegFile_Read_Data    | SP->TEX   | 512  | 4 Texture Addresses read from the Register File
Tex_Write_Register_Index | SEQ->SP   | 8    | Index into Register file for write of returned Texture Data
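
The timing constraints for this bus can be tabulated with a small sketch. The offsets for Quad 0 come from the text above; the assumption that Quads 1 to 3 follow on consecutive clocks is an interpretation of "one quad every clock, repeating every 4 clocks", and the function names are hypothetical.

```python
READ_INDEX_OFFSET = 1   # Quad 0 read index: 1 clock after Instruction Start
INDEX_TO_DATA = 2       # read index precedes the read data by 2 clocks
WRITE_OFFSET = 3        # Quad 0 write data and index: 3 clocks after start
QUADS_PER_GROUP = 4     # one quad per clock, repeating every 4 clocks


def read_timing(instruction_start):
    """(quad, index_clock, data_clock) for each quad of one group."""
    timing = []
    for quad in range(QUADS_PER_GROUP):
        index_clock = instruction_start + READ_INDEX_OFFSET + quad
        timing.append((quad, index_clock, index_clock + INDEX_TO_DATA))
    return timing


def write_timing(instruction_start):
    """(quad, clock) at which write data and index are sent together."""
    return [(quad, instruction_start + WRITE_OFFSET + quad)
            for quad in range(QUADS_PER_GROUP)]
```

With Instruction Start at clock 0, the Quad 0 read index is sent at clock 1 and its data is read at clock 3, consistent with the "3 clocks after the Instruction Start" figure in the text.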
`
`
`6. Open issues
There is currently an issue with constants. If the constants are not the same for the whole vector of vertices, we don't have the bandwidth from the texture store to feed the ALUs. Two solutions exist for this problem:
1) Let the compiler handle the case and put those instructions in a texture clause so we can use the bandwidth there to operate. This requires a significant amount of temporary storage in the register store.
2) Waterfall down the pipe, allowing only the vertices that have the same constants to operate in parallel at a given time. In the worst case this might slow us down by a factor of 16.
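
The cost of the second option can be illustrated by grouping the vertices of a vector by the constant they use, with one pass through the pipe per group. This is a sketch of the idea only: the function name and the use of an integer id to stand for each vertex's constant are assumptions for the example.

```python
def waterfall_passes(constant_ids):
    """Group vertex lanes by constant; each group is one pass down the pipe.

    `constant_ids[i]` is a hypothetical identifier for the constant used by
    vertex i. Groups are returned in order of first appearance.
    """
    groups = {}
    for lane, constant_id in enumerate(constant_ids):
        groups.setdefault(constant_id, []).append(lane)
    return list(groups.values())
```

If all 16 vertices share one constant, a single pass suffices. If every vertex uses a different constant, 16 passes are needed, which is the worst-case factor of 16 mentioned above.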
`