`
`EDIT DATE
`
`Author: Clay Taylor
`
`29 October, 2001
`
`
`DOCUMENT-REV. NUM.
`
`R400 Generic Spec
`
`R400 Primitive Assembly
`
`Block Overview VGT/CL/SU/SC
`
`Overview:
`
`ver 0.1
`
AUTOMATICALLY UPDATED FIELDS:
Document Location:
Document
Current Intranet Search Title:
R400 Generic Spec

APPROVALS

Name/Dept

Signature/Date

Remarks:

THIS DOCUMENT CONTAINS INFORMATION THAT COULD BE
SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES
INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.

“Copyright 2000, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished
work created in 2000. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this
unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential,
proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any
form or by any means without the prior written permission of ATI Technologies Inc.”
`
`
`
`R400_Orlando_Top.doc17198 Bytes*** © Reference Copyright Notice on Cover Page © *** 41/01/01 11:57 AM
`
`
`
`AMD1044_0182535
`
`ATI Ex. 2038
`IPR2023-00922
`Page 1 of 8
`
`Table Of Contents
`
1. INTRODUCTION
1.1 Requirements & Functional Descriptions
2. OVERVIEW
2.1 Features
2.2 Performance
3. EXTERNAL INTERFACES
3.1 An interface
4. BLOCK DIAGRAM
5. LOGIC DESCRIPTION
5.1 Description of each subblock
6. REGISTER SPECIFICATION
6.1 Performance/Debug
7. PHYSICAL DESIGN
8. AREA ESTIMATE
9. PERFORMANCE ISSUES
`
`Revision Changes:
`
`Rev 0.0 (Clay Taylor)
Date: October 29, 2001
Initial revision.

Document started
`
1. Introduction
`
This document briefly describes, at a top level, the interfaces, requirements, and motivations for the Primitive Assembly group
of blocks of the R400 chip. The primary goal of this document is to provide a first-tier description of the blocks which are being
designed by the Orlando site and their relative location within the graphics pipeline.

The acronym PA (Primitive Assembly) is being used to encompass many of the Orlando-owned blocks. The individual blocks
will be identified as follows:
`
1. Command Processor (CP). (Not considered to be part of PA)

2. Vertex Grouper Tessellator (VGT).

3. Clipper & Viewport Transform (CL).

4. Setup Engine (SU).

5. Scan Converter Barycentric Interpolator (SC).
`
There will be a separate spec for each of these blocks, which will be maintained by the block owners. The details of the
functionality, internal interfaces, etc can be found in the individual specifications.
`1.1 Requirements & Functional Descriptions
The R400 3D pipe is similar in function to previous ATI designs in that it takes in primitives (typically triangles),
performs vertex processing (Index Reuse Detection, Vertex Fetching, Transform, Lighting, PVS, Viewport Xform), performs
primitive processing (Clipping and Setup), rasterizes primitives into pixels (Scan Conversion and Perspective Correct
Barycentric Coord Calculation), and performs pixel processing (Texture Lookup, Pixel Shaders, Alpha Blend/Test, Z Test,
Stencil, etc).
In ALU architecture it is most similar to R300, in that it uses Vertex Shaders and Pixel Shaders only for Vertex
and Pixel processing, but it varies significantly from R300 in pipeline architecture. Because of this, there may be significant
reuse of R300 logic for several blocks in the design (i.e. setup, viewport transform, barycentric interpolation), but there is
very little commonality in block-level interfaces, etc.
Many of the requirements for the CP/PA blocks are quite similar to the equivalent blocks’ requirements for
R300. The top-level requirements of the blocks are as follows (see the individual specs for real details):
`
Command Processor
The CP has many functions. Its primary role is to fetch command stream data (which includes state updates,
draw packets, 2D packets, IDCT packets, etc) and deliver it to the various blocks in the chip. Please see the CP spec for
details of requirements, functionality, interfaces, etc.
`
Vertex Grouper / Tessellator
This block has two distinct functions: vertex reuse/grouping and tessellation.
The first, and most frequently used, is vertex reuse/grouping. The functionality is very similar to the
R100/R200/R300 vertex reuse block. The requirement is that the block detect vertex index reuse within the previous 16 (or
14 or 15, TBD) vertex indices. If a hit is detected, the vertex is not resubmitted for vertex processing. The vertex grouper
will gather together 64 (32 or 16 for RV400, RL400) vertices for submission to the unified shader (vertex shader). This
group of 64 vertices is processed as one “unit” in the shader pipeline. Partial units will be submitted when an end of packet
is detected. This block will send primitives to the CLIP block to perform clipping, etc on the post-shaded vertices.
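
The reuse check and grouping described above can be pictured with a short Python sketch. It is illustrative only: the function name `group_vertices`, the choice to track the last 16 indices seen (hits included), and the decision to keep the reuse window across group boundaries are assumptions, not details taken from the VGT spec.

```python
from collections import deque


def group_vertices(indices, window=16, group_size=64):
    """Split an index stream into groups of vertices that need shading.

    Returns (groups, hits): `groups` is a list of lists of vertex indices
    submitted for vertex processing, `hits` counts reused indices.
    """
    recent = deque(maxlen=window)  # assumed: the last `window` indices seen
    groups, current, hits = [], [], 0
    for idx in indices:
        if idx in recent:
            hits += 1              # reuse hit: vertex is not resubmitted
        else:
            current.append(idx)    # miss: queue the vertex for shading
            if len(current) == group_size:
                groups.append(current)
                current = []
        recent.append(idx)
    if current:                    # end of packet: submit the partial unit
        groups.append(current)
    return groups, hits


# Two triangles sharing an edge: indices 2 and 1 are reuse hits.
groups, hits = group_vertices([0, 1, 2, 2, 1, 3])
```

In this example four vertices are queued in a single partial group and two indices are resolved by reuse. A longer stream with no reuse would be cut into full groups of 64 followed by one partial group.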
Tessellation is not well defined currently. This block will be used to tessellate primitives and create indices
and/or weights to assist in the tessellation process. The base requirements are that at least discrete and continuous
N-Patch tessellation be supported. Many methods have been discussed, but no detailed requirements have been formulated.
There is also an undefined requirement that the tessellation engine be programmable.
`
`Clipper & Viewport Transform
This block has two distinct functions: clipping and viewport transform.
The clipping process is similar to that of R100, R200, and R300, with the notable exception that only position and
barycentric weights are clipped/created; none of the triangle (vertex) attributes are clipped.
The viewport transform process is very similar to previous generations, with the possible exception that
viewport transform may be performed prior to clipping.
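
One way to picture clipping that carries only position and barycentric weights is a single-plane Sutherland-Hodgman pass, sketched below in Python. The function names, the data layout (each vertex is a `(position, weights)` pair), and the choice of the `x <= w` plane are assumptions made for illustration; the actual clipper algorithm is defined in the CL spec, not here.

```python
def lerp(a, b, t):
    """Linear interpolation between two equal-length tuples."""
    return tuple(x + t * (y - x) for x, y in zip(a, b))


def clip_against_plane(polygon, dist):
    """Clip a convex polygon against one plane (Sutherland-Hodgman step).

    polygon: list of (position, weights) pairs, position = (x, y, z, w).
    dist:    function of position; a value >= 0 means the vertex is inside.
    Only position and barycentric weights are interpolated; no other
    vertex attributes are touched.
    """
    out = []
    for i, (p0, w0) in enumerate(polygon):
        p1, w1 = polygon[(i + 1) % len(polygon)]
        d0, d1 = dist(p0), dist(p1)
        if d0 >= 0:
            out.append((p0, w0))
        if (d0 >= 0) != (d1 >= 0):       # the edge crosses the plane
            t = d0 / (d0 - d1)
            out.append((lerp(p0, p1, t), lerp(w0, w1, t)))
    return out


# Original vertices carry the weights (1,0,0), (0,1,0), (0,0,1).
triangle = [((0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0)),
            ((2.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0)),
            ((0.0, 1.0, 0.0, 1.0), (0.0, 0.0, 1.0))]
clipped = clip_against_plane(triangle, lambda p: p[3] - p[0])  # keep x <= w
```

The two vertices created on the clip plane carry weights that are neither 0 nor 1, which expresses them in terms of the original triangle so that attributes can be interpolated later in the pipeline.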
`
`Setup Engine
The requirements are very similar to those of the R300 setup engine, with the notable exception (again) that none of
the primitive (vertex) attribute data other than position (x,y,z,w) and barycentric weights are processed in the setup engine.
The fact that the clipper no longer clips attribute data requires that the setup engine be able to compute barycentric weight
gradients for triangles with non-0/non-1 barycentric values at the vertices (probably at a reduced rate). This block is also
responsible for computing start-points and gradients for Z and the barycentric weights (I,J,K). It is also responsible for
back-face culling, scissor-culling, fill-mode conversion (tri->point, tri->wireframe), polygon (Z) offset, and line-stipple texture
coordinate calculations.
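
As a rough illustration of the gradient and culling work described above, the Python sketch below computes the screen-space gradients of a quantity that varies linearly over a triangle, and uses the signed area for a facing test. The function names are hypothetical, the quantity is assumed to be linear in screen space (for example Z, or a weight already divided by w), and treating negative area as back-facing is an assumed convention, not something this document states.

```python
def signed_area2(p0, p1, p2):
    """Twice the signed area of a screen-space triangle (sign = winding)."""
    return ((p1[0] - p0[0]) * (p2[1] - p0[1])
            - (p2[0] - p0[0]) * (p1[1] - p0[1]))


def is_back_facing(p0, p1, p2):
    """Assumed convention: negative signed area means back-facing."""
    return signed_area2(p0, p1, p2) < 0


def plane_gradients(p0, p1, p2, v0, v1, v2):
    """Return (d/dx, d/dy) of a value that is v0, v1, v2 at the vertices.

    Returns None for a zero-area triangle, which has no defined gradient.
    """
    area2 = signed_area2(p0, p1, p2)
    if area2 == 0:
        return None
    ddx = ((v1 - v0) * (p2[1] - p0[1]) - (v2 - v0) * (p1[1] - p0[1])) / area2
    ddy = ((v2 - v0) * (p1[0] - p0[0]) - (v1 - v0) * (p2[0] - p0[0])) / area2
    return ddx, ddy


# A weight with non-0/non-1 vertex values, as a clipped triangle would have.
grad = plane_gradients((0, 0), (4, 0), (0, 4), 0.25, 0.75, 0.25)
```

With these inputs the weight rises by 0.125 per pixel in x and is constant in y. The same routine applies unchanged when the vertex values are exactly 0 and 1, which is the unclipped case.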
`
`Scan Converter / Barycentric Interpolator
The functional requirements of the R400 scan converter are similar to those of R300, except that the performance
and tiling requirements are different due to the single SC of R400 versus the multi-SC of R300. The scan converter must
determine which 8x8 pixel “tiles” are “hit” by a primitive and submit these tiles to the Render Backend for hierarchical Z
culling. It must determine which 2x2 pixel “quads” are “hit” by a primitive for submission to the pixel shader pipeline. These
quads must be assembled into a group of 16 quads (64 pixels) in an order which is advantageous to texture caching (TBD).
It must produce 2 quads per clock (8 pixels per clock) in order to keep up with the render backend's desired rate of 8 pixels
per clock. The SC must support points, lines, triangles, and rects, as well as line stipple, poly stipple, line antialiasing, and
polygon antialiasing (?). It must support walking algorithms advantageous to 3D as well as 2D overlapping blits. It must
conform to fill rules for both DX and OGL.
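
The notion of a 2x2 quad being “hit” can be sketched with edge functions evaluated at pixel centers, as below in Python. This is a brute-force illustration under stated assumptions: the names `edge` and `covered_quads` are hypothetical, only one winding order is handled, pixels exactly on an edge count as covered, and the DX/OGL fill rules and the real walking order are not modeled.

```python
def edge(a, b, p):
    """Edge function: >= 0 when p lies on the inner side of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def covered_quads(tri, width, height):
    """List (x, y, mask) for every 2x2 quad with at least one covered pixel.

    mask bit 0 = upper-left pixel, bit 1 = upper-right,
    bit 2 = lower-left, bit 3 = lower-right; coverage is sampled at
    pixel centers.
    """
    a, b, c = tri
    quads = []
    for qy in range(0, height, 2):
        for qx in range(0, width, 2):
            mask = 0
            for bit, (dx, dy) in enumerate(((0, 0), (1, 0), (0, 1), (1, 1))):
                p = (qx + dx + 0.5, qy + dy + 0.5)
                if (edge(a, b, p) >= 0 and edge(b, c, p) >= 0
                        and edge(c, a, p) >= 0):
                    mask |= 1 << bit
            if mask:
                quads.append((qx, qy, mask))
    return quads


quads = covered_quads(((0, 0), (4, 0), (0, 4)), 4, 4)
pixels = sum(bin(mask).count("1") for _, _, mask in quads)
```

For this right triangle on a 4x4 grid, three of the four quads are hit and ten pixels are covered in total. The same test applied at 8x8 granularity would give the tile hits that feed hierarchical Z culling.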
`The SC (BI) must also compute the perspective-correct barycentric weights for each of the pixels that it outputs to
`the pixel shader (sequencer).
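
For reference, the standard way to turn screen-space (linearly interpolated) barycentric weights into perspective-correct ones is to divide each weight by its vertex w and renormalize. The Python sketch below shows that textbook formula only; the function name is hypothetical and nothing here describes how the BI hardware actually evaluates it.

```python
def perspective_correct(screen_weights, w):
    """Convert screen-space barycentric weights to perspective-correct ones.

    screen_weights: weights (b0, b1, b2) that are linear in screen space.
    w:              clip-space w of the three vertices (all non-zero).
    """
    scaled = [b / wi for b, wi in zip(screen_weights, w)]
    total = sum(scaled)
    return tuple(s / total for s in scaled)


# Midpoint of an edge whose second vertex is three times farther away.
i, j, k = perspective_correct((0.5, 0.5, 0.0), (1.0, 3.0, 1.0))
```

In this example the nearer vertex receives a weight of about 0.75 instead of 0.5, so interpolated attributes are pulled toward the vertex closer to the viewer. When all three w values are equal, the weights are unchanged.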
`
`2. Overview
`
`2.1 Features
`
`The features which affect these blocks will be discussed in the individual specs.
`
`2.2 Performance
`
The only performance goal worth mentioning in this high-level doc is the peak polygon rate for the VGT, CLIP,
SU, and SC. It is worth mentioning since it is a reduction in rate from the R300, with the intent of saving area
relative to the R300. The R400 peak polygon rate goal is 1 triangle per clock through VGT, CLIP, and
back-face culling (may include small-face culling) of the SU, and 1 triangle every two clocks for the rest of the
pipeline (setup gradient calculations, scan conversion, etc).
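
The two rates can be combined into a simple back-of-the-envelope model, sketched below in Python. It assumes the front and back parts of the pipeline overlap fully in steady state, so the slower part sets the clock count; the function name and that overlap assumption are illustrative and not stated in this document.

```python
FRONT_CLOCKS_PER_TRI = 1  # VGT, CLIP, and back-face culling
BACK_CLOCKS_PER_TRI = 2   # setup gradients, scan conversion, etc


def batch_clocks(total_triangles, culled_triangles):
    """Peak-rate clock estimate for a batch, assuming full overlap.

    Every triangle passes through the front of the pipeline; only the
    triangles that survive culling reach the back.
    """
    surviving = total_triangles - culled_triangles
    return max(total_triangles * FRONT_CLOCKS_PER_TRI,
               surviving * BACK_CLOCKS_PER_TRI)


no_cull = batch_clocks(1000, 0)
half_cull = batch_clocks(1000, 500)
```

Under this model a batch with no culled triangles is limited by the back of the pipeline, while a batch in which half the triangles are culled keeps both parts equally busy.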
`
`3. External Interfaces
`
`3.1 An interface
`
`Interfaces will be defined in the block-level specs.
`
4. Block Diagram

[Figure: R400 Top Level 3D Block Diagram. Blocks shown: COMMAND PROCESSOR; RBEM; SHADER SEQUENCER; VERTEX GROUPER TESSELLATOR; STATE STORAGE FOR VGT, CLIP, SETUP, AND SCAN CONVERTER; VERTEX DATA ARRAYS (IN LM); VS0 INPUT; VS1 INPUT; VERTEX POSITION CACHE; CCGEN; CLIPPER; VERTEX PARAMETER CACHE; INTERPOLATOR; PIXEL SHADER; FRAME BUFFER (IN LM). Labeled transfers: (1) ACTIVE STATE CONTEXT VECTOR, (2) PRIM TYPE (LIST/FAN/STRIP), (3) INDEX LIST; (1) STATE CONTEXT VECTOR, (2) COMMAND TO INITIATE VERTEX PROCESSING; VERTEX INDEX; (1) STATE CONTEXT VECTOR, (2) PRIM ASSEMBLY DATA. Other labels: (ONE PER SHADER); BLOCKS / FUNCTIONS BELOW THIS LINE ARE NOT INVOLVED IN VGT DESCRIPTION.]
`
`4.2?
Put sections in here for additional data that is common to all of the subblocks.
Examples: texture formats,
`
`5. Logic description
`
`5.1 Description of each subblock
`
`6. Register Specification
`
`6.1 Performance/Debug
`
`7. Physical Design
`
`8. Area Estimate
`
`9. Performance issues